Standard deviation of lists in pandas columns

Time:07-15

I use the following code:

df11 = df_curr_obj.apply(lambda x: [float(b) for a, b in (x.value_counts()/n_new).head(3).items()])

df22 = df_old_obj.apply(lambda x:[float(b) for a, b in (x.value_counts()/n_old).head(3).items()])

df_final = pd.concat([df22,df11], axis=1, keys=('df_old_obj','df_curr_obj'))

to get the following dataframe (cropped rows):

                  df_old_obj             df_curr_obj

_rev              [79.5, 0.25]           [92.0, 0.5]
team              [22.75, 10.25, 10.25]  [25.5, 17.0, 12.0]
entitytype        [0.25, 0.25, 0.25]     [0.5, 0.5, 0.5]
lie               [26.25, 1.25, 0.5]     [36.0, 1.5, 0.5]
presentation      [26.25, 1.5]           [36.0, 2.0]
fetalheartbeat    [79.25]                [91.5]
liquordescription [66.0, 1.75, 0.5]      [77.0, 2.5, 1.0]

First, the dtype of both columns above shows as object, even though I used float(b) to convert b.

Secondly, how do I get the standard deviation for each list, for example:

          df_old_obj                        df_curr_obj

_rev     show St.dev of [79.5, 0.25]        show St.dev of [92.0, 0.5]

and so on for all rows.

I know that to get the standard deviation of a whole column I can do

df['column'].std()

but my case is not that simple. Please help!

CodePudding user response:

The dtype of the columns shows up as object because each cell holds a list, and pandas stores lists as generic Python objects. float(b) converts the elements inside each list, not the cell itself.

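You can compute the standard deviation of each cell with df_final.applymap(lambda x: np.std(x)), or more simply df_final.applymap(np.std). Note that np.std defaults to the population standard deviation (ddof=0), whereas pandas' .std() defaults to the sample version (ddof=1); pass ddof=1 to np.std if you need the latter. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map.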
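
As a minimal sketch of the approach above, here is a self-contained example. The small df_final below is a stand-in built from three rows of the question (not the asker's real data), and df_std is a name chosen only for illustration. It uses apply with Series.map instead of applymap, on the assumption that you want something that runs without deprecation warnings on both older and newer pandas versions.

```python
import numpy as np
import pandas as pd

# Small stand-in for df_final, using three rows from the question.
df_final = pd.DataFrame(
    {
        "df_old_obj": [[79.5, 0.25], [22.75, 10.25, 10.25], [79.25]],
        "df_curr_obj": [[92.0, 0.5], [25.5, 17.0, 12.0], [91.5]],
    },
    index=["_rev", "team", "fetalheartbeat"],
)

# Columns whose cells are lists are stored with dtype object.
print(df_final.dtypes)

# Series.map calls np.std on each cell; apply repeats that for every column.
# np.std defaults to ddof=0 (population standard deviation).
df_std = df_final.apply(lambda col: col.map(np.std))
print(df_std)
```

With ddof=0, a single-element list such as [79.25] gives 0.0. If you switch to the sample standard deviation, for example col.map(lambda x: np.std(x, ddof=1)), single-element lists produce NaN, so decide which behaviour you want for those rows.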
