I have data like the following. Each row is for a specific colour, associated with different numbers:
Color | num1 | num2 |
---|---|---|
red | 1 | 2 |
red | 1 | na |
blue | 2 | na |
blue | 2 | 3 |
yellow | 1 | 4 |
yellow | 1 | na |
I want to use forward fill on the num2 column, but only forward fill within the same colors. For example, only fill the num2 of the first blue row with the previous row's num2 if that previous row was also blue.
Expected result:
Color | num1 | num2 |
---|---|---|
red | 1 | 2 |
red | 1 | 2 |
blue | 2 | na |
blue | 2 | 3 |
yellow | 1 | 4 |
yellow | 1 | 4 |
I have tried the following code:
for color in df['color'].unique():
    df[df['color'] == color]['num2'] = df[df['color'] == color]['num2'].fillna(method='ffill')
I have also tried with inplace=True, and it does not work either.
CodePudding user response:
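import numpy as np
df = df.replace('na', np.nan)  # turn the 'na' placeholders into real NaN
df['num2'] = df.groupby('Color')['num2'].ffill()  # forward fill within each colour only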
Output:
>>> df
    Color  num1  num2
0     red     1     2
1     red     1     2
2    blue     2   NaN
3    blue     2     3
4  yellow     1     4
5  yellow     1     4
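For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and storing the missing entries as the literal string 'na' is an assumption about how the data was loaded. It also swaps the replace call for pd.to_numeric(..., errors="coerce"), which is a substitute and not what the answer above uses; it has the same effect here and leaves num2 as a float column.

```python
import pandas as pd

# Sample data rebuilt from the question. Storing the gaps as the string 'na'
# is an assumption; adjust if your frame already holds real NaN values.
df = pd.DataFrame({
    "Color": ["red", "red", "blue", "blue", "yellow", "yellow"],
    "num1": [1, 1, 2, 2, 1, 1],
    "num2": [2, "na", "na", 3, 4, "na"],
})

# Coerce anything non-numeric ('na') to NaN so ffill can recognise the gaps.
df["num2"] = pd.to_numeric(df["num2"], errors="coerce")

# Forward fill inside each colour group; values never cross a group boundary,
# so the first blue row stays NaN because no earlier blue row exists.
df["num2"] = df.groupby("Color")["num2"].ffill()

print(df)
```

The loop in the question has no effect because df[mask]['num2'] = ... writes into a temporary copy produced by the boolean selection, not into df itself. Assigning through df.loc[mask, 'num2'] would write to the original frame, but the groupby version avoids the loop altogether.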
CodePudding user response:
Is this what you are looking for?
df[['Color']].join(df.groupby('Color').ffill())
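To see why the join is needed, here is a small sketch on the question's data, assuming the missing entries are already real NaN values. Calling ffill on the whole grouped frame returns only the non-key columns, so Color has to be put back by joining on the index. The names filled and result are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question, with gaps already stored as NaN.
df = pd.DataFrame({
    "Color": ["red", "red", "blue", "blue", "yellow", "yellow"],
    "num1": [1, 1, 2, 2, 1, 1],
    "num2": [2, np.nan, np.nan, 3, 4, np.nan],
})

# Forward fill every non-key column within each colour group.
# The result keeps the original index but drops the grouping column.
filled = df.groupby("Color").ffill()

# Join on the index to restore Color as the first column.
result = df[["Color"]].join(filled)

print(result)
```

Note that this fills every column per group, not just num2, so it suits frames where all columns should be forward filled within each colour.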