Home > Software design >  Pandas: find column min and max ranges by group
Pandas: find column min and max ranges by group

Time:02-12

I have a data frame like so:

df = pd.DataFrame({
    'A':np.random.random(20),
    'B':np.random.random(20),
    'Gen': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4]
})

Is there a simple way with pandas to find the range (min and max values) in each column by group 'Gen'? In real life, I can have lots of columns A, B, ... (>20). The result could look something like:

Var   Gen    Min   Max
A     1      x     x
B     1      x     x
A     2      x     x
...

CodePudding user response:

You can melt to reshape, then groupby agg to aggregate per group:

(df.melt(id_vars='Gen', var_name='Var')
   .groupby(['Var', 'Gen'])['value']
   .agg(['min', 'max'])
   #.reset_index()  # uncomment to have Gen/Var as columns
)

Alternatively, aggregate first, then stack:

df.groupby('Gen').agg(['min', 'max']).stack(level=0)

output:

              min       max
Var Gen                    
A   1    0.009241  0.813241
    2    0.142218  0.751106
    3    0.005829  0.991914
    4    0.033108  0.579354
B   1    0.241094  0.990492
    2    0.030335  0.848735
    3    0.209076  0.785530
    4    0.277970  0.962307
  • Related