Home > Back-end >  How do I create a new column of max values of a column(corresponding to specific name) using pandas?
How do I create a new column of max values of a column(corresponding to specific name) using pandas?

Time:05-19

I'm wondering if it is possible to use Pandas to create a new column for the max values of a column (corresponding to different names, so that each name will have a max value).

For an example:

name    value    max
Alice    1        9
Linda    1        1
Ben      3        5
Alice    4        9
Alice    9        9
Ben      5        5
Linda    1        1

So for Alice, we are picking the max of 1, 4, and 9, which is 9. For Linda max(1,1) = 1, and for Ben max(3,5) = 5.

I was thinking of using .loc to select the name == "Alice", then get the max value of these rows, then create the new column. But since I'm dealing with a large dataset, this does not seem like a good option. Is there a smarter way to do this so that I don't need to know what specific names?

CodePudding user response:

groupby and taking a max gives the max by name, which is then merged with the original df

df.merge(df.groupby(['name'])['value'].max().reset_index(), 
         on='name').rename(
                    columns={'value_x' : 'value', 
                             'value_y' : 'max'})
    name    value   max
0   Alice   1   9
1   Alice   4   9
2   Alice   9   9
3   Linda   1   1
4   Linda   1   1
5   Ben     3   5
6   Ben     5   5

CodePudding user response:

You could use transform or map

df['max'] = df.groupby('name')['value'].transform('max')

or

df['max'] = df['name'].map(df.groupby('name')['value'].max())
  • Related