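I have the dataset below. I would like to fill each NA value in the code column with the code character used by the other rows in the same group, where a group is all rows whose Key shares the same prefix before the *.
In this example, the first NA should become S and the second one should become F.
Thank you.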
import pandas as pd

df = {'Key': ['111*1', '111*2', '111*3', '222*1', '222*2', '333*1', '333*2', '333*3', '333*4', '444*1'],
      'code': ['S', 'S', 'NA', 'M', 'M', 'F', 'F', 'F', 'NA', 'C']}
# Create DataFrame
df = pd.DataFrame(df)
# Split Key on the literal '*' into the group prefix (Keya) and the suffix (Keyb)
df[['Keya', 'Keyb']] = df['Key'].str.split('\\*', expand=True, regex=True)
print(df)
Answer:
You can use:
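import numpy as np
# Turn the string 'NA' into a real missing value, then fill it with the
# first non-null code found in the same Key-prefix group
s = df['code'].replace('NA', np.nan)
df['code'] = s.fillna(s.groupby(df['Key'].str.extract(r'([^*]+)\*', expand=False))
                       .transform('first')
                     )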
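To see what the grouping key looks like, here is a minimal sketch of the prefix extraction on its own. It assumes the intended pattern is "one or more non-asterisk characters followed by a literal asterisk"; the names keys and prefix are only for illustration.

```python
import pandas as pd

# A few sample keys in the same shape as the question's Key column
keys = pd.Series(['111*1', '111*2', '222*1', '333*4'])

# Capture everything before the first literal '*'.
# With a single capture group and expand=False, the result is a Series.
prefix = keys.str.extract(r'([^*]+)\*', expand=False)
print(prefix.tolist())
```

Rows that share a prefix end up in the same group, which is what groupby needs.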
If you already have Keya:
s = df['code'].replace('NA', np.nan)
df['code'] = s.fillna(s.groupby(df['Keya']).transform('first'))
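For anyone who wants to run this end to end, here is a self-contained sketch using the data from the question. The intermediate name first_per_group is mine, and Keya is built here with a plain split rather than the question's exact line. The idea it relies on is that the groupby 'first' aggregation skips missing values, so transform('first') gives every row the first non-null code of its group.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Key': ['111*1', '111*2', '111*3', '222*1', '222*2',
            '333*1', '333*2', '333*3', '333*4', '444*1'],
    'code': ['S', 'S', 'NA', 'M', 'M', 'F', 'F', 'F', 'NA', 'C'],
})

# Group prefix: the part of Key before the '*'
df['Keya'] = df['Key'].str.split('*').str[0]

# Convert the placeholder string 'NA' into a real missing value
s = df['code'].replace('NA', np.nan)

# First non-null code per group, broadcast back to the original row index
first_per_group = s.groupby(df['Keya']).transform('first')

# Only the missing entries are replaced; existing codes stay untouched
df['code'] = s.fillna(first_per_group)
print(df[['Key', 'code']])
```

If a group had no non-null code at all, its rows would simply stay missing.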
Output:
     Key code
0  111*1    S
1  111*2    S
2  111*3    S
3  222*1    M
4  222*2    M
5  333*1    F
6  333*2    F
7  333*3    F
8  333*4    F
9  444*1    C