Filling column of dataframe based on 'groups' of values of another column

Time:12-16

I am trying to fill values of a column based on the value of another column. Suppose I have the following dataframe:

import numpy as np
import pandas as pd
data = {'A': [4, 4, 5, 6],
        'B': ['a', np.nan, np.nan, 'd']}
df = pd.DataFrame(data)

I would like to fill the missing values in column B, but only from rows that share the same value in column A. In other words, all rows with the same value in column A (here, the two rows where A equals 4) should end up with the same value in column B.

Thus, the desired output should be:

data = {'A': [4, 4, 5, 6],
        'B': ['a', 'a', np.nan, 'd']}
df = pd.DataFrame(data)

I am aware of forward filling with fillna/ffill, but this gives the wrong output, because the third row also gets the value 'a' assigned even though its value in column A is different:

df['B'] = df['B'].ffill()

# Result, equivalent to:
data = {'A': [4, 4, 5, 6],
        'B': ['a', 'a', 'a', 'd']}
df = pd.DataFrame(data)

How can I get the desired output?

CodePudding user response:

Try this:

df['B'] = df.groupby('A')['B'].ffill()

Output:

>>> df
   A    B
0  4    a
1  4    a
2  5  NaN
3  6    d
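
For comparison, here is a minimal self-contained sketch that runs the plain forward fill and the grouped forward fill side by side on the dataframe from the question. The variable names `plain` and `grouped` are only illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

# Dataframe from the question: rows 0 and 1 share A == 4.
df = pd.DataFrame({'A': [4, 4, 5, 6],
                   'B': ['a', np.nan, np.nan, 'd']})

# Plain forward fill ignores column A, so row 2 (A == 5)
# inherits 'a' from the row above it.
plain = df['B'].ffill()

# Grouped forward fill only propagates values between rows
# that have the same value in column A.
grouped = df.groupby('A')['B'].ffill()

# The grouped result keeps the original index, so it can be assigned back.
df['B'] = grouped
```

Note that ffill only propagates values downward. If a missing value could come before the known value within a group, one option is to chain a backward fill per group, for example with df.groupby('A')['B'].transform(lambda s: s.ffill().bfill()).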