I have a list of DataFrames and cannot figure out how to combine their columns by index.
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([[5], [4], [24]]), columns=['A'], index=[1, 2, 3])
df2 = pd.DataFrame(np.array([[19], [16], [9]]), columns=['A'], index=[4, 5, 7])
df3 = pd.DataFrame(np.array([[49], [36], [12]]), columns=['B'], index=[1, 2, 3])
df4 = pd.DataFrame(np.array([[18], [23], [91]]), columns=['B'], index=[6, 7, 8])
dfs = [df1, df2, df3, df4]
# Keep the list in dfs; store the row-wise concatenation under a separate name
result = pd.concat(dfs, axis=0)
result
      A     B
1   5.0   NaN
2   4.0   NaN
3  24.0   NaN
4  19.0   NaN
5  16.0   NaN
7   9.0   NaN
1   NaN  49.0
2   NaN  36.0
3   NaN  12.0
6   NaN  18.0
7   NaN  23.0
8   NaN  91.0
Is there an elegant and fast way to combine their columns by index, so that each index label appears only once, as in the desired output below?
Desired output:
dfs
      A     B
1   5.0  49.0
2   4.0  36.0
3  24.0  12.0
4  19.0   NaN
5  16.0   NaN
6   NaN  18.0
7   9.0  23.0
8   NaN  91.0
Thanks for any help
CodePudding user response:
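Since this is a two-way merge (on both index and columns), you can use combine_first together with the functools.reduce helper. combine_first keeps the caller's non-null values, fills its missing values from the other DataFrame, and returns the union of both indexes and columns: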
from functools import reduce
reduce(lambda x, y: x.combine_first(y), dfs)
      A     B
1   5.0  49.0
2   4.0  36.0
3  24.0  12.0
4  19.0   NaN
5  16.0   NaN
6   NaN  18.0
7   9.0  23.0
8   NaN  91.0
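
For completeness, here is a self-contained sketch of the approach above that can be run as-is. The names `frames`, `combined`, and `grouped` are illustrative and not from the question. The second half shows a possible alternative that is not part of the original answer: concatenate row-wise, then take the first non-null value per index label with `groupby(level=0).first()`. For this data it should give the same table, but treat it as an option to benchmark on your own data rather than a guaranteed faster path.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Sample frames from the question
df1 = pd.DataFrame(np.array([[5], [4], [24]]), columns=['A'], index=[1, 2, 3])
df2 = pd.DataFrame(np.array([[19], [16], [9]]), columns=['A'], index=[4, 5, 7])
df3 = pd.DataFrame(np.array([[49], [36], [12]]), columns=['B'], index=[1, 2, 3])
df4 = pd.DataFrame(np.array([[18], [23], [91]]), columns=['B'], index=[6, 7, 8])

# Illustrative name; keep the list separate from any combined result
frames = [df1, df2, df3, df4]

# Fold the list pairwise: each step takes the union of index and columns
# and fills missing values on the left from the frame on the right
combined = reduce(lambda left, right: left.combine_first(right), frames)

# Alternative (an assumption, not from the answer): stack all rows, then
# keep the first non-null value in each column for every index label
grouped = pd.concat(frames, axis=0).groupby(level=0).first()

print(combined)
print(grouped)
```

Note that when two frames hold different non-null values for the same index label and column, both variants keep the value from the frame that appears earlier in the list.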