I have a list of pd.Series
with different date indexes and names as such:
trade_date
2007-01-03 0.049259
2007-01-04 0.047454
2007-01-05 0.057485
2007-01-08 0.059216
2007-01-09 0.055359
...
2013-12-24 0.021048
2013-12-26 0.021671
2013-12-27 0.017898
2013-12-30 0.034071
2013-12-31 0.022301
Name: name1, Length: 1762, dtype: float64
I want to join this list of series into a DataFrame where each Name
becomes a column in the DataFame and any missing indexes are set as nan.
When I try pd.concat(list_data)
I just get one really big series instead. If i create an empty DataFrame and loop over each series in my list I get an error ValueError: cannot reindex from a duplicate axis
How can I join these into a DataFame?
CodePudding user response:
Use:
pd.concat(map(lambda s: s.groupby(level=0).last(), list_data), axis=1)
older answer
You should use axis=1
in pandas.concat
:
pd.concat([series1, series2, series3], axis=1)
example on your data (assuming s the provided series):
pd.concat([s, (s 1).rename('name2').iloc[5:]], axis=1)
output:
name1 name2
trade_date
2007-01-03 0.049259 NaN
2007-01-04 0.047454 NaN
2007-01-05 0.057485 NaN
2007-01-08 0.059216 NaN
2007-01-09 0.055359 NaN
2013-12-24 0.021048 1.021048
2013-12-26 0.021671 1.021671
2013-12-27 0.017898 1.017898
2013-12-30 0.034071 1.034071
2013-12-31 0.022301 1.022301
CodePudding user response:
You need to concat along the columns (combine Series horizontally):
pd.concat(list_data, axis=1)
CodePudding user response:
You probably have multiple rows with the same date index per Series
.
To debug this problem, you can do:
for sr in list_data:
sr = sr[sr.index.value_counts() > 1]
if len(sr):
print(f'[sr.name]')
print(sr, end='\n')
If there is an output, you can't use concat
. Perhaps, you have to use merge
with how='outer'
as parameter.