I have the following dataframe:
import pandas as pd

df1 = pd.DataFrame({
    'Parent': ['Stay home', 'Stay home', 'Stay home', 'Go swimming', 'Go swimming', 'Go swimming'],
    'Parent1': ['Severe weather', 'Severe weather', 'Severe weather', 'Not Severe weather', 'Not Severe weather', 'Not Severe weather'],
    'Child': ['Extreme rainy', 'Extreme windy', 'Severe snow', 'Sunny', 'some windy', 'No snow']
})
   Parent       Parent1             Child
0  Stay home    Severe weather      Extreme rainy
1  Stay home    Severe weather      Extreme windy
2  Stay home    Severe weather      Severe snow
3  Go swimming  Not Severe weather  Sunny
4  Go swimming  Not Severe weather  some windy
5  Go swimming  Not Severe weather  No snow
I want to collect the Child column values into a list, grouped by the values of Parent1, and show that list only on the first row of each group.
Expected outcome:
   Parent       Parent1             Child          List1
0  Stay home    Severe weather      Extreme rainy  [Extreme rainy,Extreme windy,Severe snow]
1  Stay home    Severe weather      Extreme windy
2  Stay home    Severe weather      Severe snow
3  Go swimming  Not Severe weather  Sunny          [Sunny,some windy,No snow]
4  Go swimming  Not Severe weather  some windy
5  Go swimming  Not Severe weather  No snow
One solution I tried does not return the expected outcome:
df1["list1"] = df1.groupby('Parent1')['Child'].transform(lambda x: x.tolist())
Any ideas?
Answer:
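Here is one way to do it.
Group by Parent1 and build a list of the Child values for each group using apply. Then map that list onto only the first row of each group, selecting those rows by filtering out duplicated Parent1 values. The remaining rows get NaN, which is then replaced with an empty string.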
df1['list'] = df1[~df1.duplicated(subset=['Parent1'], keep='first')]['Parent1'].map(
    df1.groupby('Parent1')['Child'].apply(list)
)
df1.fillna('', inplace=True)
df1
   Parent       Parent1             Child          list
0  Stay home    Severe weather      Extreme rainy  [Extreme rainy, Extreme windy, Severe snow]
1  Stay home    Severe weather      Extreme windy
2  Stay home    Severe weather      Severe snow
3  Go swimming  Not Severe weather  Sunny          [Sunny, some windy, No snow]
4  Go swimming  Not Severe weather  some windy
5  Go swimming  Not Severe weather  No snow
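
The same idea can also be written step by step, which may be easier to follow. This is only a minimal sketch: the names child_lists and is_first are illustrative and not part of the original code, and filling only the new column (rather than the whole dataframe) is an assumption about what is wanted, not a requirement.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Parent': ['Stay home'] * 3 + ['Go swimming'] * 3,
    'Parent1': ['Severe weather'] * 3 + ['Not Severe weather'] * 3,
    'Child': ['Extreme rainy', 'Extreme windy', 'Severe snow',
              'Sunny', 'some windy', 'No snow'],
})

# One list of Child values per Parent1 group (a Series indexed by Parent1).
# child_lists is an illustrative name.
child_lists = df1.groupby('Parent1')['Child'].apply(list)

# True only on the first row of each Parent1 group (illustrative name).
is_first = ~df1.duplicated(subset='Parent1', keep='first')

# Map the group key to its list on the first rows only; other rows become NaN.
df1['list'] = df1.loc[is_first, 'Parent1'].map(child_lists)

# Assumption: blank out only the new column, so missing values in
# other columns (if any) are left untouched.
df1['list'] = df1['list'].fillna('')

print(df1)
```

Compared with df1.fillna('', inplace=True), this leaves any NaN values in the other columns as they are, which may or may not be what you want.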