How can I calculate the mean of a specific cell in a dataframe in Python?

I have a dataframe that has multiple floats per cell. I need to calculate the mean of each of these cells and have the results put into a new dataframe.

How can I do that in Python?

CodePudding user response:

The general solution is an elementwise mean. Sample data:

print (df)
     Sub_1    Sub_2
0  [1,2,3]  [4,5,3]
1  [1,7,3]  [4,8,3]

If every list has the same length, you can build a 3D NumPy array and take the mean along the last axis:

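import numpy as np
import pandas as pd

# Nested lists -> array of shape (rows, columns, list length); average over the last axis
arr = np.mean(np.array(df.to_numpy().tolist()), axis=2)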

df1 = pd.DataFrame(arr, columns=df.columns, index=df.index)
print (df1)
      Sub_1  Sub_2
0  2.000000    4.0
1  3.666667    5.0

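The elementwise mean applies np.mean to every cell and also works when the lists differ in length (in pandas 2.1+, applymap is deprecated in favor of DataFrame.map):

df1 = df.applymap(np.mean)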
print (df1)
      Sub_1  Sub_2
0  2.000000    4.0
1  3.666667    5.0
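
A minimal sketch of why the elementwise approach is the general one, assuming cells whose lists have different lengths. The names `ragged` and `ragged_means` are made up for illustration, and the code uses `Series.map` column by column so it does not depend on which pandas version provides `applymap` or `DataFrame.map`:

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the question): lists of different lengths per cell.
ragged = pd.DataFrame({
    "Sub_1": [[1, 2, 3], [1, 7]],
    "Sub_2": [[4, 5, 3, 8], [4]],
})

# Series.map calls np.mean on each cell, one column at a time,
# so the list lengths never have to match.
ragged_means = ragged.apply(lambda col: col.map(np.mean))
print(ragged_means)
```

Each cell is averaged on its own here, so nothing has to be stacked into a rectangular array first.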

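Or, with pandas 1.3+ where explode accepts several columns, explode the lists into rows and average per original row (the lists within a row must have the same length across columns):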

df1 = df.explode(['Sub_1','Sub_2']).groupby(level=0).mean()
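
For a runnable comparison, the sketch below rebuilds the sample data and runs the three approaches side by side. The variable names `via_array`, `via_map`, and `via_explode` are made up for illustration, and the `.astype(float)` step after `explode` is an added assumption: the exploded columns have object dtype, and casting them first avoids relying on how a given pandas version averages object columns.

```python
import numpy as np
import pandas as pd

# Sample data from the answer: every cell holds a list of three numbers.
df = pd.DataFrame({
    "Sub_1": [[1, 2, 3], [1, 7, 3]],
    "Sub_2": [[4, 5, 3], [4, 8, 3]],
})

# 1) 3D array of shape (rows, columns, list length), mean over the last axis.
arr = np.mean(np.array(df.to_numpy().tolist()), axis=2)
via_array = pd.DataFrame(arr, columns=df.columns, index=df.index)

# 2) Elementwise mean, written with Series.map instead of applymap.
via_map = df.apply(lambda col: col.map(np.mean))

# 3) Explode the lists into rows, cast object -> float,
#    then average the rows that share an original index label.
via_explode = (
    df.explode(["Sub_1", "Sub_2"])
      .astype(float)
      .groupby(level=0)
      .mean()
)

print(via_array)
print(np.allclose(via_array, via_map), np.allclose(via_array, via_explode))
```

On equal-length lists all three should agree. The array version is the most compact, while the elementwise version is the only one of the three that makes no assumption about list lengths.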