Numpy, pandas merge rows

Time:03-29

I'm working with numpy and pandas and need to "merge" rows. I have a column marital-status that contains values like these:

'Never-married', 'Divorced', 'Separated', 'Widowed'

and:

'Married-civ-spouse','Married-spouse-absent', 'Married-AF-spouse'

I'm wondering how to merge them into just two values: "single" for the first four and "in relationship" for the other three. I need this for one-hot encoding later.


For the sample output, marital-status should contain only single or in relationship, according to the grouping I described above.

CodePudding user response:

You can use pd.Series.map to replace each value with another. It takes a dictionary that maps each original value to its new value. Any value that is missing from the dictionary is replaced with NaN.

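# Build a lookup from each original status to its merged category.
married_map = {
    status: 'Single'
    for status in ['Never-married', 'Divorced', 'Separated', 'Widowed']}
married_map.update({
    status: 'In-relationship'
    for status in ['Married-civ-spouse', 'Married-spouse-absent', 'Married-AF-spouse']})
# map returns a new Series, so assign it back to keep the result.
df['marital-status'] = df['marital-status'].map(married_map)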
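
For a version you can run end to end, here is a minimal sketch. The sample DataFrame is made up for illustration (it is not the asker's data), and the 'Unknown' entry is a hypothetical value added only to show what happens to statuses missing from the dictionary. The one-hot step uses pd.get_dummies, which is one common option and not necessarily what the asker had in mind.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    "marital-status": [
        "Never-married", "Married-civ-spouse", "Divorced",
        "Widowed", "Married-AF-spouse", "Unknown",
    ]
})

single = ["Never-married", "Divorced", "Separated", "Widowed"]
in_relationship = ["Married-civ-spouse", "Married-spouse-absent", "Married-AF-spouse"]

# Map every original status to one of the two merged categories.
married_map = {status: "Single" for status in single}
married_map.update({status: "In-relationship" for status in in_relationship})

# "Unknown" is not a key in married_map, so it becomes NaN here.
df["marital-status"] = df["marital-status"].map(married_map)

# One-hot encode the merged column. By default get_dummies creates no
# column for NaN, so the unmapped row is all zeros / False.
encoded = pd.get_dummies(df["marital-status"])
print(encoded)
```

If NaN for unmapped values is not what you want, checking df["marital-status"].isna().sum() right after the map call is a quick way to spot statuses that the dictionary does not cover.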