I have a dataframe like the one below:
import pandas as pd
df = pd.DataFrame({'col1': ['A', 'A', 'B', 'C', 'D', 'D'],
                   'col2': [1, 0, 1, 0, 0, 1]})
col1 | col2 |
---|---|
A | 1 |
A | 0 |
B | 1 |
C | 0 |
D | 0 |
D | 1 |
If at least one row for a given col1 value has col2 = 1, I want to set col2 = 1 for every row with that same col1 value. The table I want is below:
col1 | col2 |
---|---|
A | 1 |
A | 1 |
B | 1 |
C | 0 |
D | 1 |
D | 1 |
Any help is much appreciated!
CodePudding user response:
You can do it in two steps:
Get the col1 values of the rows where col2 is 1:
filters = df.loc[df.col2.eq(1), 'col1']
Assign the new value to the rows where col1 is in filters:
df.loc[df.col1.isin(filters), 'col2'] = 1
df
col1 col2
0 A 1
1 A 1
2 B 1
3 C 0
4 D 1
5 D 1
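If you would rather not modify df in place, the same two steps can be wrapped in a small helper that works on a copy. This is only a sketch: the function name propagate_flag and its key/flag parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd


def propagate_flag(df: pd.DataFrame, key: str = "col1", flag: str = "col2") -> pd.DataFrame:
    """Return a copy of df where `flag` is 1 on every row whose `key`
    value has at least one row with `flag` == 1 (hypothetical helper)."""
    out = df.copy()
    # Step 1: collect the key values that have at least one flag == 1.
    keys_with_one = out.loc[out[flag].eq(1), key]
    # Step 2: set the flag to 1 on every row that shares one of those keys.
    out.loc[out[key].isin(keys_with_one), flag] = 1
    return out


df = pd.DataFrame({"col1": ["A", "A", "B", "C", "D", "D"],
                   "col2": [1, 0, 1, 0, 0, 1]})
result = propagate_flag(df)
print(result["col2"].tolist())  # [1, 1, 1, 0, 1, 1]
print(df["col2"].tolist())      # original is unchanged: [1, 0, 1, 0, 0, 1]
```

Working on a copy keeps the original data around, which can be handy if you want to compare before and after.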
CodePudding user response:
If col2 only contains 0 and 1, you can use groupby with transform and take the maximum of each group:
df['col2'] = df.col2.groupby(df.col1).transform('max')
df
col1 col2
0 A 1
1 A 1
2 B 1
3 C 0
4 D 1
5 D 1
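More generally, if col2 can hold values other than 0 and 1, you can pass a lambda that returns 1 for a group when any of its values is 1 and leaves the group unchanged otherwise: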
df['col2'] = df.col2.groupby(df.col1).transform(lambda g: 1 if (g == 1).any() else g)
df
col1 col2
0 A 1
1 A 1
2 B 1
3 C 0
4 D 1
5 D 1
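The lambda is called once per group in Python, which may be slow when there are many groups. A possible vectorized alternative, shown here as a sketch, builds a boolean mask with transform('any') and then applies Series.mask. The extra E rows and the values 2, 5 and 3 are invented here only to show that groups without a 1 keep their original values.

```python
import pandas as pd

# Sample data extended with non-0/1 values (the E rows are hypothetical).
df = pd.DataFrame({"col1": ["A", "A", "B", "C", "D", "D", "E", "E"],
                   "col2": [1, 0, 1, 0, 2, 1, 5, 3]})

# True on every row whose col1 group contains at least one col2 == 1.
has_one = df["col2"].eq(1).groupby(df["col1"]).transform("any")

# Replace col2 with 1 where the mask is True; keep other rows as they are.
df["col2"] = df["col2"].mask(has_one, 1)

print(df["col2"].tolist())  # [1, 1, 1, 0, 1, 1, 5, 3]
```

Group D contains a 2 and a 1, so both rows become 1, while group E has no 1 and is left untouched.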