I have a dataframe I would like to transform. I want to merge the columns together and have the values that have the same id appear as separate rows.
So from this:
id  children1    children2   children3
1   No Children  NaN         NaN
2   12-16 years  17 years    Nan
3   No Children  Nan         Nan
4   5-7 years    8-11 years  12-16 years
To this:
id  children
1   No Children
2   12-16 years
2   17 years
3   No Children
4   5-7 years
4   8-11 years
4   12-16 years
Is there an easy way to do this?
Data:
{'id': [1, 2, 3, 4],
'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
'children2': [nan, '17 years', nan, '8-11 years'],
'children3': [nan, nan, nan, '12-16 years']}
CodePudding user response:
I think you're looking for the stack method, which converts columns to index. Now to get the exact shape, you'll need to set_index with "id" first (so that "id" is kept as index when stacking).
out = df.set_index('id').stack().droplevel(1).rename('children').reset_index()
Output:
   id     children
0   1  No Children
1   2  12-16 years
2   2     17 years
3   3  No Children
4   4    5-7 years
5   4   8-11 years
6   4  12-16 years
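For anyone who wants to run this end to end, here is a minimal self-contained sketch built from the data in the question. One assumption that is not in the answer above: an explicit dropna() is added after stack(), because whether stack() drops NaN on its own depends on the pandas version, and the extra call should be harmless when it already does.

```python
import numpy as np
import pandas as pd

# Data from the question, with nan written as np.nan
df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17 years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

out = (
    df.set_index('id')      # keep "id" as the index while stacking
      .stack()              # move the children1..3 column labels into an inner index level
      .dropna()             # assumption: drop missing values explicitly, whatever the pandas version
      .droplevel(1)         # discard the inner level holding the old column names
      .rename('children')   # name the resulting Series
      .reset_index()        # turn "id" back into a regular column
)
print(out)
```

The result should have one row per non-missing child value, with "id" repeated as needed.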
CodePudding user response:
import numpy as np

new = (
    df.set_index('id')
      .agg(list, axis=1)         # collect each row's values (all columns except id) into a list
      .explode()                 # give every list element its own row
      .replace('Nan', np.nan)    # turn any literal 'Nan' strings into real NaN
      .dropna()                  # drop the missing values
      .to_frame('Children')      # name the resulting column 'Children'
)
Outcome:
       Children
id
1   No Children
2   12-16 years
2      17 years
3   No Children
4     5-7 years
4    8-11 years
4   12-16 years
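If "id" should end up as a regular column, as in the desired output from the question, the same explode idea can be sketched as below. Two assumptions that are not in the answer above: apply(list, axis=1) is used in place of agg(list, 1) to build the per-row lists, and the replace step is left out because the data dict from the question already holds real NaN values.

```python
import numpy as np
import pandas as pd

# Data from the question, with nan written as np.nan
df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'children1': ['No Children', '12-16 years', 'No Children', '5-7 years'],
    'children2': [np.nan, '17 years', np.nan, '8-11 years'],
    'children3': [np.nan, np.nan, np.nan, '12-16 years'],
})

new = (
    df.set_index('id')
      .apply(list, axis=1)   # assumption: apply instead of agg; one list of values per id
      .explode()             # one row per list element, index ("id") repeated
      .dropna()              # remove the missing entries
      .rename('children')    # name the Series so it becomes the column name
      .reset_index()         # move "id" from the index into a column
)
print(new)
```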