I've a DataFrame that looks like this
Name | Result_cat_1 | Result1 | Result_cat_2 | Result_2 | Result_cat_3 | Result_3 |
---|---|---|---|---|---|---|
Andrew | Rock | 12 plays | Paper | 4 plays | Scissors | 45 plays |
John | Paper | 8 plays | Scissors | 15 plays | Rock | 76 plays |
Ronald | Scissors | 6 plays | Rock | 3 plays | Paper | 23 plays |
The end result should be like this:
Name | Rock | Paper | Scissors |
---|---|---|---|
Andrew | 12 plays | 4 plays | 45 plays |
John | 76 plays | 8 plays | 15 plays |
Ronald | 3 plays | 23 plays | 6 plays |
I've tried
df['Rock'] = df[Result1].where(df[Result_cat_1] == 'Rock')
df['Rock'] = df[Result2].where(df[Result_cat_2] == 'Rock')
df['Rock'] = df[Result3].where(df[Result_cat_3] == 'Rock')
but the column contains only the df[Result_cat_3]
overwriting the previous 2
CodePudding user response:
We could stack
the "Results_cat" columns and "Results" columns separately; build a DataFrame with the stacked Series; then pivot
:
df = df.set_index('Name')
out = (df.filter(like='Result_cat')
.stack()
.droplevel(1)
.reset_index(name='result')
.assign(play=df[['Result1', 'Result_2', 'Result_3']]
.stack().to_numpy())
.pivot('Name', 'result', 'play')
.rename_axis(index=[None], columns=[None]))
Output:
Paper Rock Scissors
Andrew 4 plays 12 plays 45 plays
John 8 plays 76 plays 15 plays
Ronald 23 plays 3 plays 6 plays
CodePudding user response:
In your case do
s1 = df.set_index('Name').iloc[:,::2].stack()
s2 = df.set_index('Name').iloc[:,1::2].stack()
out = pd.crosstab(s1.index.get_level_values(0),s1.values,s2.values,aggfunc='first')
Out[764]:
col_0 Paper Rock Scissors
row_0
Andrew 4 plays 12 plays 45 plays
John 8 plays 76 plays 15 plays
Ronald 23 plays 3 plays 6 plays