I need your help for the following task: I have the following dataframe:
test = {'Col1':[2,5],
'Col2':[5,7],
'Col_List':[['One','Two','Three','Four','Five'], ['Two', 'Four']],
'One':[0,0],
'Two':[0,0],
'Three':[0,0],
'Four':[0,0],
'Five':[0,0],}
df=pd.DataFrame.from_dict(test)
df
which looks like:
Col1 | Col2 | Col_List | One | Two | Three | Four | Five |
---|---|---|---|---|---|---|---|
2 | 5 | [One, Two, Three, Four, Five] | 0 | 0 | 0 | 0 | 0 |
5 | 7 | [Two, Four] | 0 | 0 | 0 | 0 | 0 |
I need to inspect the list in Col_List
and set, depending on which item is in the list, the value of column Col1
in the specific column (One
, Two
, Three
, Four
or Five
).
Now I would like to have the following result:
Col1 | Col2 | Col_List | One | Two | Three | Four | Five |
---|---|---|---|---|---|---|---|
2 | 5 | [One, Two, Three, Four, Five] | 2 | 2 | 2 | 2 | 2 |
5 | 7 | [Two, Four] | 0 | 5 | 0 | 5 | 0 |
CodePudding user response:
exploded = df.explode("Col_List")
df.update(pd.get_dummies(exploded["Col_List"])
.mul(exploded["Col1"], axis="rows")
.groupby(level=0).sum())
- explode lists' elements to their own rows
- get 1-hot representation of "One", "Two" etc.
- multiply it with the (exploded) "Col1" values
- 1/0 values will act as a selector
- then undo the explosion: groupby & sum
- lastly update the original frame's "One", "Two"... columns with this
to get
>>> df
Col1 Col2 Col_List One Two Three Four Five
0 2 5 [One, Two, Three, Four, Five] 2 2 2 2 2
1 5 7 [Two, Four] 0 5 0 5 0
CodePudding user response:
Let's try explode the data, unstack it, then assign back:
s = (df[['Col1', 'Col_List']].explode('Col_List')
.set_index('Col_List', append=True)['Col1']
.unstack().fillna(df)
)
df.loc[s.index, s.columns] = s
Output:
Col1 Col2 Col_List One Two Three Four Five
0 2 5 [One, Two, Three, Four, Five] 2.0 2.0 2.0 2.0 2.0
1 5 7 [Two, Four] 0.0 5.0 0.0 5.0 0.0
CodePudding user response:
You could use .str.join()
and str.get_dummies()
df.assign(**df['Col_List'].str.join('|').str.get_dummies().mul(df['Col1'],axis=0))
Output:
Col1 Col2 Col_List One Two Three Four Five
0 2 5 [One, Two, Three, Four, Five] 2 2 2 2 2
1 5 7 [Two, Four] 0 5 0 5 0