I have a dataframe that is a histogram with 2000 bins, with a column for each bin. I need to reduce it down to a quarter of the size - 500 bins.
Let's say we have the original dataframe:
A B C D E F G H
1 1 1 1 2 2 2 2
I want to reduce it to a new quarter-width dataframe:
A B
1 2
where in the new dataframe, A is the average of the original columns A, B, C and D, i.e. (A + B + C + D)/4, and B is the average of E, F, G and H.
Feels like it should be easy, but can't work out how to do it! Cheers :)
CodePudding user response:
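Assuming you want to average the columns in consecutive groups of 4 (the first 4, the next 4, and so on, for any number of columns), integer-divide the column positions by 4 to get a group label for each column and group on that. Note that axis=1 in groupby is deprecated since pandas 2.1; on newer versions, transpose, group the rows, and transpose back instead.
import numpy as np
out = df.groupby(np.arange(df.shape[1])//4, axis=1).mean()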
output:
     0    1
0  1.0  2.0
If you further want to relabel the columns A/B:
out = (df.groupby(np.arange(df.shape[1])//4, axis=1).mean()
         .set_axis(['A', 'B'], axis=1)
      )
output:
     A    B
0  1.0  2.0
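For newer pandas versions, where groupby with axis=1 is deprecated, the same idea can be sketched by transposing, grouping the rows, and transposing back. This is only an illustration: the example frame is rebuilt from the question, the name factor is made up here, and it assumes the number of columns is a multiple of 4 and that all columns are numeric.

```python
import numpy as np
import pandas as pd

# Example data from the question: one row, eight bins.
df = pd.DataFrame([[1, 1, 1, 1, 2, 2, 2, 2]], columns=list("ABCDEFGH"))

factor = 4  # hypothetical name: how many adjacent bins to merge into one

# Label each column with its group number: 0, 0, 0, 0, 1, 1, 1, 1
groups = np.arange(df.shape[1]) // factor

# Transpose so the bins become rows, average each group of rows, transpose back.
out = df.T.groupby(groups).mean().T

# Optional: reuse the first original column names (A, B, ...) for the new bins.
out.columns = df.columns[:out.shape[1]]

print(out)
```

With the 2000-bin histogram, the same code should give 500 columns, each the mean of 4 adjacent original bins.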