Value error when calculating standard deviation on dataframe

Time:12-31

I need the standard deviation of column distance for each row in my dataframe.

Dataframe df_stats:

                                a1/a2   distance  mean_distance
date_time                                                      
2021-10-29T21:00:00+00:00  105.007574  -2.530492       0.234318
2021-10-29T22:00:00+00:00  104.459527  -5.232787       0.012537
2021-10-29T23:00:00+00:00  104.648467   1.807101       0.093432
2021-10-30T00:00:00+00:00  104.758201   1.048046       0.164502
2021-10-30T01:00:00+00:00  104.820132   0.591004       0.246095
2021-10-30T02:00:00+00:00  104.474062  -3.307024      -0.194917
2021-10-30T03:00:00+00:00  104.284146  -1.819483      -0.231843
2021-10-30T04:00:00+00:00  104.072383  -2.032697      -0.249918
2021-10-30T05:00:00+00:00  103.690546  -3.675699      -0.484996
2021-10-30T06:00:00+00:00  103.755979   0.630837      -0.823674
2021-10-30T07:00:00+00:00  102.721667 -10.018720      -1.181811
2021-10-30T08:00:00+00:00  102.998153   2.687995      -1.015365
2021-10-30T09:00:00+00:00  103.236159   2.308109      -0.876012
2021-10-30T10:00:00+00:00  103.471932   2.281216      -0.930593
2021-10-30T11:00:00+00:00  103.376928  -0.918579      -1.142659
2021-10-30T12:00:00+00:00  103.587805   2.037809      -1.110613
2021-10-30T13:00:00+00:00  104.359756   7.424508      -0.468987
2021-10-30T14:00:00+00:00  104.612493   2.418853      -0.383811
2021-10-30T15:00:00+00:00  104.607392  -0.048755      -0.562828
2021-10-30T16:00:00+00:00  104.846049   2.278849      -0.203495
2021-10-30T17:00:00+00:00  104.997437   1.442872      -0.004827

Trying to do it this way:

df_stats['std'] = df_stats.distance.std(axis=1)

But I get this error:

No axis named 1 for object type <class 'pandas.core.series.Series'>

Why is it not working?

CodePudding user response:

Why is it not working?

Because axis=1 tells pandas to compute the standard deviation across columns, one result per row. df_stats.distance is a Series, which has only one axis (axis 0) and no columns, so pandas raises the error.
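To make the cause concrete, here is a minimal sketch that reproduces the error on a small frame. The three-row frame `df` is a hypothetical sample built from values in the question, not the asker's full data, and the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical three-row sample using values from the question.
df = pd.DataFrame(
    {
        "a1/a2": [105.007574, 104.459527, 104.648467],
        "distance": [-2.530492, -5.232787, 1.807101],
    }
)

col = df["distance"]   # selecting one column gives a Series
print(col.ndim)        # a Series is one-dimensional, so only axis 0 exists

try:
    col.std(axis=1)    # there is no axis 1 on a Series
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

The DataFrame itself has two axes, so `df.std(axis=1)` is valid, while the same call on a single column is not.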

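Calling std on a single column returns one scalar for the whole column:

print(df_stats.distance.std())

Assigning that scalar to a new column repeats the same value in every row:

df_stats['std'] = df_stats.distance.std()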
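A short sketch of what that assignment does, on a hypothetical four-row frame with values taken from the question. It also shows why a standard deviation "per row" of one column is not meaningful: with the default `ddof=1`, the standard deviation of a single value is NaN.

```python
import pandas as pd

# Hypothetical four-row sample using values from the question.
df = pd.DataFrame({"distance": [-2.530492, -5.232787, 1.807101, 1.048046]})

col_std = df["distance"].std()   # one scalar for the whole column
df["std"] = col_std              # the scalar is broadcast to every row
print(df)

# A single observation has no sample standard deviation (ddof=1).
single_value_std = pd.Series([1.807101]).std()
print(single_value_std)
```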

If you need the standard deviation across several columns, select them as a DataFrame and axis=1 computes it for each row:

df_stats['std'] = df_stats[['distance','a1/a2','mean_distance']].std(axis=1)
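As an illustration of the row-wise case, the sketch below runs the same call on a hypothetical three-row sample from the question and cross-checks the first row against `statistics.stdev` from the standard library, which also uses the sample (n-1) formula.

```python
import statistics

import pandas as pd

# Hypothetical three-row sample using values from the question.
df = pd.DataFrame(
    {
        "a1/a2": [105.007574, 104.459527, 104.648467],
        "distance": [-2.530492, -5.232787, 1.807101],
        "mean_distance": [0.234318, 0.012537, 0.093432],
    }
)

cols = ["distance", "a1/a2", "mean_distance"]

# Double brackets select a DataFrame, so axis=1 is valid here:
# one standard deviation per row, computed across the three columns.
df["std"] = df[cols].std(axis=1)

# Cross-check the first row by hand.
first_row_std = statistics.stdev(df.loc[0, cols].tolist())
print(df)
```

Whether mixing these three columns in one standard deviation makes sense depends on the data; the call only shows how axis=1 behaves.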

If you need the standard deviation per time period, e.g. per day (this requires a DatetimeIndex):

df_stats['std'] = df_stats.groupby(pd.Grouper(freq='D')).distance.transform('std')
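A minimal sketch of the per-day variant, assuming the index holds ISO 8601 timestamps with a +00:00 offset that can be parsed with `pd.to_datetime`. The five-row frame is a hypothetical sample from the question. `transform` returns a result aligned with the original rows, so every row of a given day receives that day's standard deviation.

```python
import statistics

import pandas as pd

# Assumed index format: ISO 8601 strings with a UTC offset.
idx = pd.to_datetime(
    [
        "2021-10-29T21:00:00+00:00",
        "2021-10-29T22:00:00+00:00",
        "2021-10-29T23:00:00+00:00",
        "2021-10-30T00:00:00+00:00",
        "2021-10-30T01:00:00+00:00",
    ]
)
df = pd.DataFrame(
    {"distance": [-2.530492, -5.232787, 1.807101, 1.048046, 0.591004]},
    index=idx,
)
df.index.name = "date_time"

# Group rows by calendar day and write each day's std back to its rows.
df["std"] = df.groupby(pd.Grouper(freq="D"))["distance"].transform("std")

# Cross-check the first day (three rows) by hand.
day1_std = statistics.stdev([-2.530492, -5.232787, 1.807101])
print(df)
```

If the index is still made of plain strings, convert it first, for example with `df.index = pd.to_datetime(df.index)`.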