How to average n adjacent columns together in python pandas dataframe?

Time:10-18

I have a dataframe that is a histogram with 2000 bins, with a column for each bin. I need to reduce it down to a quarter of the size - 500 bins.

Let's say we have the original dataframe:

A B C D E F G H

1 1 1 1 2 2 2 2

I want to reduce it to a new quarter width dataframe:

A B

1 2

where, in the new dataframe, A is the average of the original columns A, B, C and D, i.e. (A + B + C + D)/4, and B is the average of E, F, G and H.

Feels like it should be easy, but can't work out how to do it! Cheers :)

CodePudding user response:

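Assuming you want to average the columns in consecutive groups of 4 (here the first 4 and the last 4), build a group label for each column by integer-dividing its position by 4 and group on that:

import numpy as np

out = df.groupby(np.arange(df.shape[1])//4, axis=1).mean()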

output:

     0    1
0  1.0  2.0

If you further want to relabel the columns A/B:

out = (df.groupby(np.arange(df.shape[1])//4, axis=1).mean()
         .set_axis(['A', 'B'], axis=1)
       )

output:

     A    B
0  1.0  2.0
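
With 500 output bins, typing the labels by hand is not practical. A possible alternative sketch, not part of the answer above: reshape the underlying NumPy array and reuse the first few original column labels as the new names. It assumes all columns are numeric and that the column count is an exact multiple of n; which labels to reuse is an arbitrary choice here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 1, 1, 1, 2, 2, 2, 2]], columns=list("ABCDEFGH"))

n = 4  # number of adjacent columns to average together
n_rows, n_cols = df.shape
if n_cols % n != 0:
    raise ValueError("number of columns must be a multiple of n")

# Reshape to (rows, groups, n) and average over the last axis
values = df.to_numpy().reshape(n_rows, n_cols // n, n).mean(axis=2)

# Reuse the first n_cols // n original labels (A, B here) for the result
out = pd.DataFrame(values, index=df.index, columns=df.columns[: n_cols // n])
print(out)
```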