I have a list of DataFrames, each with shape 1000x160.
df_list = [df1,df2]
All the DataFrames have the same columns except the last three, i.e.
col1, col2, col3 ... col157, col158, col159, col160
Columns 1-157 are the same in all the DataFrames (same column names, same values), but columns 158, 159 and 160 differ, with different names and different values.
I have tried pd.concat() with pretty much all the arguments, but it either results in a dataframe with shape 2000x160 or 1000x320.
The resulting DataFrame should be 1000x166 (considering only two DataFrames in the list).
Can someone help? Also, I want to do it without a for loop.
Answer:
In the example where your list is two dfs, it sounds like you'd want the final df to have 163 cols: 157 + (2 * 3). It also sounds like a combination of reduce and merge could work -- previous answer here. It would be useful to provide an MRE and/or sample output.
Try something like:
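import pandas as pd
from functools import reduce

# The first 157 columns are shared by every frame, so merge on those
merge_cols = df_list[0].columns[:157].tolist()
# Fold the list into one frame; each merge adds the next frame's 3 unique columns
result = reduce(lambda x, y: pd.merge(x, y, on=merge_cols), df_list)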
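Since there's no MRE, here is a scaled-down sketch of the same idea that you can run as-is. The sizes (5 rows, 4 shared columns, 3 unique columns per frame), the `make_frame` helper and the column names are made up for illustration; they stand in for your 1000 rows and 157 shared columns. It also assumes the shared columns identify each row uniquely. If they don't, an inner merge will multiply the duplicated rows.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical scaled-down sizes standing in for 1000 rows / 157 shared / 3 unique
N_ROWS, N_SHARED, N_UNIQUE = 5, 4, 3

rng = np.random.default_rng(0)
shared = pd.DataFrame(
    rng.integers(0, 100, size=(N_ROWS, N_SHARED)),
    columns=[f"col{i + 1}" for i in range(N_SHARED)],
)
# Assumption: the shared columns identify each row uniquely (forced here via col1)
shared["col1"] = range(N_ROWS)


def make_frame(tag):
    """Build one frame: the shared columns plus N_UNIQUE frame-specific columns."""
    extra = pd.DataFrame(
        rng.random((N_ROWS, N_UNIQUE)),
        columns=[f"{tag}_{j}" for j in range(N_UNIQUE)],
    )
    return pd.concat([shared, extra], axis=1)


df_list = [make_frame("a"), make_frame("b")]

# Merge every frame on the shared columns; each merge adds one frame's unique columns
merge_cols = df_list[0].columns[:N_SHARED].tolist()
merged = reduce(lambda left, right: pd.merge(left, right, on=merge_cols), df_list)

# Alternative if the rows are already in the same order in every frame:
# keep the first frame whole and append only the unique columns of the others
aligned = pd.concat(
    [df_list[0]] + [d.drop(columns=merge_cols) for d in df_list[1:]],
    axis=1,
)

print(merged.shape)   # (5, 10), i.e. 4 shared + 2 * 3 unique columns
print(aligned.shape)  # (5, 10)
```

The `concat` variant skips the key comparison, so it is only safe when the rows line up positionally across frames, and it does use a list comprehension, which may or may not count as a loop for your purposes.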