Input: a list of DataFrames
import pandas as pd
df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])
df_list = []
for df in (df1, df2, df3):
    df_list.append(df)
The index values 1, 2, and 3 each appear in more than one DataFrame, and I want the output to contain the average of their values.
Output: a DataFrame with the corresponding index
1 (1.2 + 2.2)/2
2 (1.4 + 1.8)/2
3 (3.3 + 2.5)/2
4 4.3
5 6.4
7 4.9
So how can I group the duplicate index values across the DataFrames in the list and output the averages as a DataFrame? Directly concatenating the DataFrames is not an option for me.
CodePudding user response:
I would first concatenate all the data into a single DataFrame along the columns (axis=1), so that the values are automatically aligned by index. The row-wise mean then skips the missing values by default:
import pandas as pd
df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])
df_list = [df1, df2, df3]
df = pd.concat(df_list, axis=1)
df.columns = ['N1', 'N2', 'N3']
print(df.mean(axis=1))
1 1.7
2 1.6
3 2.9
4 4.3
5 6.4
7 4.9
dtype: float64
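If you would rather group on the index itself, as the question asks, here is a rough sketch of two alternatives. The first stacks the frames row-wise and averages the rows that share an index label. The second avoids concat altogether by keeping a running sum and a running count per label, which may fit better if concatenating really is off the table. The names by_group, total, count, and running are just illustrative, and I am assuming every frame has the same single column N as in the question.

```python
import pandas as pd

df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])
df_list = [df1, df2, df3]

# Option 1: stack the frames row-wise, then average rows sharing an index label.
by_group = pd.concat(df_list).groupby(level=0).mean()

# Option 2 (no concat): accumulate a per-label sum and a per-label count.
total = df_list[0].copy()
count = df_list[0].notna().astype(int)
for df in df_list[1:]:
    # fill_value=0 treats a label missing from one side as contributing nothing.
    total = total.add(df, fill_value=0)
    count = count.add(df.notna().astype(int), fill_value=0)
running = total / count

print(by_group)
print(running)
```

Both variants return a DataFrame with column N rather than a Series, so they stay close to the output format requested in the question.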