Concatenate a list of DataFrames by index

Time:09-16

I have a list of DataFrames and cannot figure out how to concatenate their columns by index.

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([[5], [4], [24]]), columns=['A'], index=[1, 2, 3])

df2 = pd.DataFrame(np.array([[19], [16], [9]]), columns=['A'], index=[4, 5, 7])

df3 = pd.DataFrame(np.array([[49], [36], [12]]), columns=['B'], index=[1, 2, 3])

df4 = pd.DataFrame(np.array([[18], [23], [91]]), columns=['B'], index=[6, 7, 8])
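dfs = [df1, df2, df3, df4]

# Stacking row-wise keeps duplicate index labels instead of aligning them.
# The result gets its own name so that dfs still refers to the list.
stacked = pd.concat(dfs, axis=0)

stacked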
      A     B
1   5.0   NaN
2   4.0   NaN
3  24.0   NaN
4  19.0   NaN
5  16.0   NaN
7   9.0   NaN
1   NaN  49.0
2   NaN  36.0
3   NaN  12.0
6   NaN  18.0
7   NaN  23.0
8   NaN  91.0

Is there an elegant and fast way to combine their columns by index, so that each index label appears only once, as in the desired result below?

Desired result

dfs
      A     B
1   5.0  49.0
2   4.0  36.0
3  24.0  12.0
4  19.0   NaN
5  16.0   NaN
6   NaN  18.0
7   9.0  23.0
8   NaN  91.0

Thanks for any help

CodePudding user response:

Since this is a two-way merge (aligning on both index and columns), you can apply combine_first pairwise across the original list of DataFrames using functools.reduce:

from functools import reduce
reduce(lambda x, y: x.combine_first(y), dfs)

      A     B
1   5.0  49.0
2   4.0  36.0
3  24.0  12.0
4  19.0   NaN
5  16.0   NaN
6   NaN  18.0
7   9.0  23.0
8   NaN  91.0
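
For reference, here is a self-contained version of the approach above, together with one possible alternative that is not part of the original answer: stack the frames with pd.concat and then take the first non-null value per index label with groupby(level=0).first(). The names combined and alternative are only illustrative. On this sample data both routes should give the same frame.

```python
from functools import reduce

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([[5], [4], [24]]), columns=['A'], index=[1, 2, 3])
df2 = pd.DataFrame(np.array([[19], [16], [9]]), columns=['A'], index=[4, 5, 7])
df3 = pd.DataFrame(np.array([[49], [36], [12]]), columns=['B'], index=[1, 2, 3])
df4 = pd.DataFrame(np.array([[18], [23], [91]]), columns=['B'], index=[6, 7, 8])
dfs = [df1, df2, df3, df4]

# Approach from the answer: fold the list pairwise. combine_first aligns on
# the union of index and columns and fills gaps in x with values from y.
combined = reduce(lambda x, y: x.combine_first(y), dfs)

# Alternative sketch (an assumption, not from the answer): stack row-wise,
# then collapse duplicate index labels, keeping the first non-null value
# found in each column.
alternative = pd.concat(dfs).groupby(level=0).first()

print(combined)
print(alternative)
```

If two frames ever hold different non-null values for the same index label and column, both variants keep the value from the frame that appears earlier in the list.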