I have a data set with the following value counts for the column Age:
>>> game['Age'].value_counts()
Between 18 -25 131
Between 26 - 30 21
Under 18 10
31 or more 7
Name: Age, dtype: int64
I'm trying to regroup the values of this column 'Age' into 2 groups:
- <=25 // (grouping 'Between 18 -25' and 'Under 18')
- >=26 // (grouping 'Between 26 - 30' and '31 or more')
I have been trying to use the groupby function, but with no good result yet. Can you please help?
CodePudding user response:
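You can use np.select:
import numpy as np
# df is the question's DataFrame (called game there)
mapping = {
'Between 18 -25': '<=25',
'Under 18': '<=25',
'Between 26 - 30': '>=26',
'31 or more': '>=26',
}
# default=df['Age'] leaves any label missing from mapping unchanged;
# np.select's built-in default of 0 does not share a dtype with the string choices
df['Age'] = np.select([df['Age'] == k for k in mapping], list(mapping.values()), default=df['Age'])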
Or just use .loc:
df.loc[df['Age'] == 'Between 18 -25', 'Age'] = '<=25'
df.loc[df['Age'] == 'Under 18', 'Age'] = '<=25'
df.loc[df['Age'] == 'Between 26 - 30', 'Age'] = '>=26'
df.loc[df['Age'] == '31 or more', 'Age'] = '>=26'
Or isin:
df.loc[df['Age'].isin(['Between 18 -25', 'Under 18']), 'Age'] = '<=25'
df.loc[df['Age'].isin(['Between 26 - 30', '31 or more']), 'Age'] = '>=26'
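Another option, not mentioned above, is to reuse the same mapping dict with Series.map. The following is only a sketch: the DataFrame is rebuilt from the value counts in the question (the row order is made up), and the column name Age_group is a hypothetical name chosen so that the original column stays intact.

```python
import pandas as pd

# Rebuild a column with the same value counts as in the question
# (assumption: the real data has more columns and a different row order).
counts = {
    'Between 18 -25': 131,
    'Between 26 - 30': 21,
    'Under 18': 10,
    '31 or more': 7,
}
game = pd.DataFrame(
    {'Age': [label for label, n in counts.items() for _ in range(n)]}
)

mapping = {
    'Between 18 -25': '<=25',
    'Under 18': '<=25',
    'Between 26 - 30': '>=26',
    '31 or more': '>=26',
}

# Series.map looks each value up in the dict; a label that is
# missing from the dict would become NaN instead of raising.
game['Age_group'] = game['Age'].map(mapping)  # 'Age_group' is a hypothetical column name
group_counts = game['Age_group'].value_counts()
print(group_counts)
```

Writing to a new column makes it easy to compare the old and new labels side by side before overwriting anything.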
CodePudding user response:
Try to explain the problem in a better way or share your code for better understanding.
This can be achieved using the np.where() method:
import numpy as np
# the question's value_counts output shows the column is named "Age"
game["Age"] = np.where(((game["Age"]=="Between 18 -25") |
                        (game["Age"]=="Under 18")), "Under 25", "26 or more")
game["Age"].value_counts()
game["How old are you?"].value_counts()
You should see output something like this:
Under 25 ...
26 or more ...
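To check the np.where approach end to end, here is a minimal sketch that rebuilds the column from the value counts in the question (the row order is an assumption) and combines np.where with isin. The column name Age_binary is hypothetical; the labels are the ones used in this answer.

```python
import numpy as np
import pandas as pd

# Rebuild the column from the value counts in the question
# (assumption: row order does not matter for this check).
labels = (
    ['Between 18 -25'] * 131
    + ['Between 26 - 30'] * 21
    + ['Under 18'] * 10
    + ['31 or more'] * 7
)
game = pd.DataFrame({'Age': labels})

# True for rows that belong to the younger group.
young = game['Age'].isin(['Between 18 -25', 'Under 18'])

# np.where picks the first label where the mask is True and the second
# label everywhere else, including rows with unexpected or missing values.
game['Age_binary'] = np.where(young, 'Under 25', '26 or more')  # hypothetical column name
binary_counts = game['Age_binary'].value_counts()
print(binary_counts)
```

Because np.where has only two outcomes, any label that is not in the isin list ends up in the second group, so it is worth checking the value counts before and after the regrouping.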