I am kinda new to pandas and am struggling to find a solution to my problem.
My dataframe looks something like this :
Id | num | A | B | C |
---|---|---|---|---|
1 | 3.5 | 0 | 0 | 1 |
1 | 10 | 0 | 1 | 0 |
1 | 30 | 1 | 0 | 0 |
2 | 3.6 | 0 | 0 | 1 |
2 | 11 | 0 | 1 | 0 |
2 | 20 | 1 | 0 | 0 |
When the value from the column "Id" is the same, I want to copy the value from the "num" column to the column that has "1" as a value in column A, B or C and then merge them into one row and remove "num" column completely.
Kinda like this :
Id | num | A | B | C |
---|---|---|---|---|
1 | 3.5 | 0 | 0 | 3.5 |
1 | 10 | 0 | 10 | 0 |
1 | 30 | 30 | 0 | 0 |
2 | 3.6 | 0 | 0 | 3.6 |
2 | 11 | 0 | 11 | 0 |
2 | 20 | 20 | 0 | 0 |
And finally :
Id | A | B | C |
---|---|---|---|
1 | 30 | 10 | 3.5 |
2 | 20 | 11 | 3.6 |
CodePudding user response:
First step, this is a simple multiplication, use mul
:
cols = ['A', 'B', 'C']
df[cols] = df[cols].mul(df['num'], axis=0)
Second step, use groupby
max
:
df.groupby('Id', as_index=False)[cols].max()
All steps without intermediate:
cols = ['A', 'B', 'C']
(df[cols].mul(df['num'], axis=0)
.groupby(df['Id']).max()
.reset_index()
)
Output 1:
Id num A B C
0 1 3.5 0.0 0.0 3.5
1 1 10.0 0.0 10.0 0.0
2 1 30.0 30.0 0.0 0.0
3 2 3.6 0.0 0.0 3.6
4 2 11.0 0.0 11.0 0.0
5 2 20.0 20.0 0.0 0.0
Output 2:
Id A B C
0 1 30.0 10.0 3.5
1 2 20.0 11.0 3.6