I have a DataFrame in which each cell holds multiple floats. I need to calculate the mean of each cell and put the results into a new DataFrame.
How can I do that in Python?
CodePudding user response:
The general solution is an elementwise mean, applied with df.applymap(np.mean) further down. Sample data:
print(df)
       Sub_1      Sub_2
0  [1, 2, 3]  [4, 5, 3]
1  [1, 7, 3]  [4, 8, 3]
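The question does not show how df was created, so the following is only a minimal sketch of one way to rebuild the sample frame. It assumes each cell holds a plain Python list:

```python
import pandas as pd

# Assumed construction of the sample frame: every cell is a Python list.
df = pd.DataFrame({
    "Sub_1": [[1, 2, 3], [1, 7, 3]],
    "Sub_2": [[4, 5, 3], [4, 8, 3]],
})
print(df)
```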
If every cell holds the same number of values, it is possible to create a 3D NumPy array and then take the mean along the last axis:
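import numpy as np
import pandas as pd

# stack the lists into a (rows, columns, values) array and average over the last axis
arr = np.mean(np.array(df.to_numpy().tolist()), axis=2)
df1 = pd.DataFrame(arr, columns=df.columns, index=df.index)
print(df1)
      Sub_1  Sub_2
0  2.000000    4.0
1  3.666667    5.0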
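Elementwise mean, which also works when cells hold different numbers of values:
# applies np.mean to every cell; pandas 2.1+ deprecates applymap in favor of df.map(np.mean)
df1 = df.applymap(np.mean)
print(df1)
      Sub_1  Sub_2
0  2.000000    4.0
1  3.666667    5.0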
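To illustrate why the elementwise approach is the general one, here is a small sketch with cells of different lengths, where stacking into a 3D array would not work. The frame `ragged` and its values are made up for illustration, and the fallback between `DataFrame.map` and `applymap` is just one way to cope with different pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose cells hold lists of different lengths.
ragged = pd.DataFrame({
    "Sub_1": [[1.0, 2.0], [1.0, 7.0, 3.0, 5.0]],
    "Sub_2": [[4.0], [4.0, 8.0, 3.0]],
})

# DataFrame.map replaced applymap in pandas 2.1; fall back on older versions.
if hasattr(pd.DataFrame, "map"):
    means = ragged.map(np.mean)
else:
    means = ragged.applymap(np.mean)

print(means)
```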
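Or, with pandas 1.3+ (multi-column explode), if the lists in each row have the same length:
# explode leaves object dtype, so cast to float before averaging per original row
df1 = df.explode(['Sub_1','Sub_2']).astype(float).groupby(level=0).mean()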
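As a sanity check, the three approaches can be run side by side on the sample data. This is only a sketch: the frame construction is assumed, the names `via_array`, `via_explode` and `via_cells` are made up here, and the per-cell variant uses `Series.map` per column so that it does not depend on the pandas version:

```python
import numpy as np
import pandas as pd

# Assumed construction of the sample frame from the answer.
df = pd.DataFrame({
    "Sub_1": [[1, 2, 3], [1, 7, 3]],
    "Sub_2": [[4, 5, 3], [4, 8, 3]],
})

# 1) Stack equal-length lists into a (rows, columns, values) array.
via_array = pd.DataFrame(
    np.mean(np.array(df.to_numpy().tolist()), axis=2),
    columns=df.columns,
    index=df.index,
)

# 2) One row per list element, then average rows sharing the original index.
via_explode = (
    df.explode(["Sub_1", "Sub_2"]).astype(float).groupby(level=0).mean()
)

# 3) Mean of each cell, column by column.
via_cells = pd.DataFrame({col: df[col].map(np.mean) for col in df.columns})

same = (
    np.allclose(via_array.to_numpy(), via_explode.to_numpy())
    and np.allclose(via_array.to_numpy(), via_cells.to_numpy())
)
print(same)
```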