How to repeat a value in a column based on another column

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 2, 2],
    'b': [None, 'w', None, 'z']
})

a	b
1	None
1	'w'
2	None
2	'z'

I want to fill the None entries in column 'b' with the non-None value found in the rows that share the same value of column 'a'.

In the end, I would have this dataframe:

a	b
1	'w'
1	'w'
2	'z'
2	'z'

CodePudding user response：

The logic is not fully clear on how you would like to generalize, but you could bfill/ffill per group:

# transform returns a result aligned to df's index, so it can be assigned straight back
df['b'] = df.groupby('a')['b'].transform(lambda x: x.bfill().ffill())

output:

   a  b
0  1  w
1  1  w
2  2  z
3  2  z
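
On the generalization point: here is a small sketch of what the per-group bfill/ffill does when a group has more than one non-null value, or none at all. The data below is hypothetical (it is not from the question), and the name `filled` is only for illustration.

```python
import pandas as pd

# Hypothetical data: group 1 has two different non-null values, group 2 has none
df = pd.DataFrame({
    'a': [1, 1, 1, 1, 2, 2],
    'b': [None, 'w', None, 'x', None, None]
})

# Within each group of 'a': pull the next non-null value backwards first,
# then carry the last non-null value forwards over anything still missing
filled = df.groupby('a')['b'].transform(lambda s: s.bfill().ffill())

print(df.assign(b=filled))
```

Each missing entry takes the nearest following non-null value in its group, so group 1 becomes w, w, x, x rather than a single repeated value, and group 2 stays empty because there is nothing to fill from.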

CodePudding user response：

It's a bit tricky, but it works. For each distinct value of 'a', we take the matching rows and fill their missing values with the non-null value of 'b' from those same rows. I'm assuming that each value of 'a' has exactly one non-null value of 'b'.

df = pd.DataFrame({
        'a': [1, 1, 2, 2],
        'b': [None, 'w', None, 'z']})

df

        a   b
    0   1   None
    1   1   w
    2   2   None
    3   2   z
    
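for i in df['a'].unique():
    mask = df['a'] == i  # rows belonging to the current value of 'a'
    # fill the missing 'b' entries with the first non-null 'b' of this group
    df.loc[mask, 'b'] = df.loc[mask, 'b'].fillna(df.loc[mask, 'b'].dropna().iloc[0])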
        
    
df
        a   b
    0   1   w
    1   1   w
    2   2   z
    3   2   z
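
Under the same assumption (one non-null 'b' per value of 'a'), the loop could probably be replaced by a single groupby call. This is only a sketch of an alternative, not the method used above; it relies on the 'first' aggregation skipping null values.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 2, 2],
    'b': [None, 'w', None, 'z']
})

# 'first' takes the first non-null 'b' in each group of 'a';
# transform broadcasts that value back to every row of the group
df['b'] = df.groupby('a')['b'].transform('first')

print(df)
```

Note that this overwrites every row of a group with that group's first non-null value, so if a group had several different values of 'b' they would all be replaced, which the fillna loop would not do.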