How to join multiple series into a DataFrame with a series name per column in pandas

Time:11-08

I have a list of pd.Series with different date indexes and names as such:

trade_date
2007-01-03    0.049259
2007-01-04    0.047454
2007-01-05    0.057485
2007-01-08    0.059216
2007-01-09    0.055359
                ...   
2013-12-24    0.021048
2013-12-26    0.021671
2013-12-27    0.017898
2013-12-30    0.034071
2013-12-31    0.022301
Name: name1, Length: 1762, dtype: float64

I want to join this list of series into a DataFrame where each Name becomes a column in the DataFrame and any missing index values are set to NaN.

When I try pd.concat(list_data), I just get one really big Series instead. If I create an empty DataFrame and loop over each series in my list, I get the error ValueError: cannot reindex from a duplicate axis. How can I join these into a DataFrame?

CodePudding user response:

Use:

pd.concat(map(lambda s: s.groupby(level=0).last(), list_data), axis=1)
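As a minimal sketch of what the one-liner above does: the two short series below are hypothetical stand-ins for list_data (the first has a repeated date, the second starts later). Grouping on the index and taking the last value makes each index unique, after which the column-wise concat can align the dates and fill the gaps with NaN.

```python
import pandas as pd

# Hypothetical sample data: s1 has a duplicated date, s2 covers different dates.
s1 = pd.Series(
    [0.1, 0.2, 0.3],
    index=pd.to_datetime(["2007-01-03", "2007-01-03", "2007-01-04"]),
    name="name1",
)
s2 = pd.Series(
    [1.0, 2.0],
    index=pd.to_datetime(["2007-01-04", "2007-01-05"]),
    name="name2",
)
list_data = [s1, s2]

# Keep the last value per date so every index is unique,
# then align the series side by side; missing dates become NaN.
df = pd.concat([s.groupby(level=0).last() for s in list_data], axis=1)
print(df)
```

Note that last() silently discards the earlier values for a duplicated date; first() or mean() may be a better fit depending on what the duplicates represent.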

older answer

You should use axis=1 in pandas.concat:

pd.concat([series1, series2, series3], axis=1)

Example on your data (assuming s is the provided series):

pd.concat([s, (s + 1).rename('name2').iloc[5:]], axis=1)

output:

               name1     name2
trade_date                    
2007-01-03  0.049259       NaN
2007-01-04  0.047454       NaN
2007-01-05  0.057485       NaN
2007-01-08  0.059216       NaN
2007-01-09  0.055359       NaN
2013-12-24  0.021048  1.021048
2013-12-26  0.021671  1.021671
2013-12-27  0.017898  1.017898
2013-12-30  0.034071  1.034071
2013-12-31  0.022301  1.022301

CodePudding user response:

You need to concat along the columns (combine Series horizontally):

pd.concat(list_data, axis=1)

CodePudding user response:

You probably have multiple rows with the same date index per Series.

To debug this problem, you can do:

for sr in list_data:
    # keep only the rows whose date occurs more than once in this series
    sr = sr[sr.index.value_counts() > 1]
    if len(sr):
        print(f'[{sr.name}]')
        print(sr, end='\n')

If this prints anything, the listed series contain duplicate dates and concat along axis=1 will likely fail on them. In that case you may have to use merge with how='outer' instead.
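A rough sketch of the merge route, assuming each series is first turned into a one-column DataFrame and the frames are merged on their indexes. The sample series and the outer_merge helper are hypothetical, introduced only for illustration.

```python
from functools import reduce

import pandas as pd

# Hypothetical sample data: s1 has a duplicated date, s2 covers different dates.
s1 = pd.Series(
    [0.1, 0.2, 0.3],
    index=pd.to_datetime(["2007-01-03", "2007-01-03", "2007-01-04"]),
    name="name1",
)
s2 = pd.Series(
    [1.0, 2.0],
    index=pd.to_datetime(["2007-01-04", "2007-01-05"]),
    name="name2",
)
list_data = [s1, s2]


def outer_merge(left, right):
    # Illustrative helper: outer join of two frames on their indexes.
    return left.merge(right, left_index=True, right_index=True, how="outer")


# Each series becomes a column named after the series, then all are merged pairwise.
merged = reduce(outer_merge, [s.to_frame() for s in list_data])
print(merged)
```

Unlike the groupby approach, this keeps duplicated dates as repeated rows, and if the same date is duplicated in more than one series the matching rows multiply, so deduplicating first is often the cleaner option.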