duplicate index in a list and calculate mean by index

Time:09-27

Input: a list of DataFrames

import pandas as pd

df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])

df_list = []
for df in (df1, df2, df3):
    df_list.append(df)

The index values 1, 2, and 3 each appear in more than one DataFrame, and I want the output to hold the average of their values.

Output: a DataFrame with the corresponding index

  1   (1.2 + 2.2)/2
  2   (1.4 + 1.8)/2
  3   (3.3 + 2.5)/2
  4    4.3
  5    6.4
  7    4.9

So how can I group by the duplicated index values across the list and output the averages as a DataFrame? Directly concatenating the DataFrames is not an option for me.

CodePudding user response:

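I would first concatenate all the data into a single DataFrame along the columns (axis=1). The values are automatically aligned by index, and an index that is missing from one frame gets NaN in that frame's column. A row-wise mean, which skips NaN by default, then gives the averages: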

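import pandas as pd

df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])

df_list = [df1, df2, df3]

# Place the frames side by side; rows are aligned on the index
df = pd.concat(df_list, axis=1)
# Give each source frame its own column name
df.columns = ['N1', 'N2', 'N3']

# Row-wise mean; NaN entries are skipped by default
print(df.mean(axis=1))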
1    1.7
2    1.6
3    2.9
4    4.3
5    6.4
7    4.9
dtype: float64
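
If a DataFrame result with the original column name is wanted, or if concatenation really has to be avoided, the same averages can be reached in two other ways. The sketch below is only an illustration using the example data from the question: the names `by_groupby`, `by_accumulate`, `total`, and `count` are made up here, and the second variant assumes all frames share the same columns.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'N': [1.2, 1.4, 3.3]}, index=[1, 2, 3])
df2 = pd.DataFrame({'N': [2.2, 1.8, 4.3]}, index=[1, 2, 4])
df3 = pd.DataFrame({'N': [2.5, 6.4, 4.9]}, index=[3, 5, 7])
df_list = [df1, df2, df3]

# Variant 1: stack the frames on top of each other (row-wise concat),
# then group rows that share an index value and average each group.
by_groupby = pd.concat(df_list).groupby(level=0).mean()

# Variant 2: no concat at all. Keep a running sum and a running count,
# aligning on the index and treating missing entries as 0.
total = reduce(lambda a, b: a.add(b, fill_value=0), df_list)
count = reduce(
    lambda a, b: a.add(b, fill_value=0),
    [d.notna().astype(int) for d in df_list],
)
by_accumulate = total / count

print(by_groupby)
print(by_accumulate)
```

Both variants return a DataFrame with a single column `N` indexed by the union of the input indexes. The second one only ever holds two frames at a time, which may matter if the list is long or the frames are large.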