I have a DataFrame with 2000 columns and 1 row. I want to calculate the standard deviation for 50 columns at a time. Any ideas on how to do it?
Thank you.
CodePudding user response:
If you only need the standard deviation of the first 50 columns, use:
out = df.iloc[0, :50].std()
To get it for every consecutive block of 50 columns, use:
s = df.iloc[0]
out = s.groupby(np.arange(len(s)) // 50).std()
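For reference, here is a self-contained sketch of the grouping snippet above on a frame shaped like the one in the question. The random data and the seed are placeholders (assumptions, not the asker's data); `df`, `s` and `out` follow the names used in the answer.

```python
import numpy as np
import pandas as pd

# Placeholder data (assumption): 1 row x 2000 columns, as described in the question.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1, 2000))).add_prefix('c')

s = df.iloc[0]                     # the single row as a Series
groups = np.arange(len(s)) // 50   # 0 for the first 50 columns, 1 for the next 50, ...
out = s.groupby(groups).std()      # one sample standard deviation per block

print(out.shape)
```

Integer division of the positions by 50 gives every column a block label, so `groupby` never needs to look at the column names.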
Sample:
np.random.seed(202206)
df = pd.DataFrame([np.random.randint(20, size=20)]).add_prefix('c')
print(df)
   c0  c1  c2  c3  c4  c5  c6  c7  c8  c9  c10  c11  c12  c13  c14  c15  c16  \
0   7   8  12  15   2  10   3   3   6   7    5   19    1   10   15   12   15

   c17  c18  c19
0    9   18   10
out = df.iloc[0, :5].std()
print(out)
4.969909455915671
s = df.iloc[0]
out = s.groupby(np.arange(len(s)) // 5).std()
print(out)
0 4.969909
1 2.949576
2 7.280110
3 3.701351
Name: 0, dtype: float64
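These numbers can be cross-checked with plain NumPy. This is a minimal sketch, assuming the row length is an exact multiple of the block size (otherwise `reshape` raises an error). Note that pandas `std` defaults to `ddof=1` (sample standard deviation) while NumPy defaults to `ddof=0`, so `ddof=1` is passed explicitly. The names `values`, `block`, `np_out` and `pd_out` are illustrative only.

```python
import numpy as np
import pandas as pd

# The 20 sample values printed above.
values = np.array([7, 8, 12, 15, 2, 10, 3, 3, 6, 7,
                   5, 19, 1, 10, 15, 12, 15, 9, 18, 10])
block = 5

# One row per block of 5 values; ddof=1 matches the pandas default.
np_out = values.reshape(-1, block).std(axis=1, ddof=1)

# The groupby approach from the answer, for comparison.
s = pd.Series(values)
pd_out = s.groupby(np.arange(len(s)) // block).std()

print(np.allclose(np_out, pd_out.to_numpy()))
```

If the population standard deviation is what you want, pass `ddof=0` to `std` in both the pandas and the NumPy version.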