The most important from the table below is to fill missing NILs from the id
column based on anything that matches in the type
column (see end result).
Current table:
id | type |
---|---|
26235 | fruit |
26235 | veggie |
32256 | NIL |
NIL | pawn |
12415 | NIL |
NIL | fruit |
NIL | veggie |
24145 | pawn |
End result:
id | type |
---|---|
26235 | fruit |
26235 | veggie |
32256 | NIL |
24145 | pawn |
12415 | NIL |
26235 | fruit |
26235 | veggie |
24145 | pawn |
I am assuming this can be done without loops. Thank you for your help!
CodePudding user response:
here is one way to do it
create a dictionary of type and id, and then using loc to fill in the missing values
d=df[df['id'].ne('NIL')].drop_duplicates(subset=['type']).to_dict()
df.loc[df['id'].eq('NIL'), 'id'] = df['type'].map(d)
df
id type
0 26235 fruit
1 26235 veggie
2 32256 NIL
3 24145 pawn
4 12415 NIL
5 26235 fruit
6 26235 veggie
7 24145 pawn
CodePudding user response:
You can do:
d = df.loc[(df.id != 'NIL') & (df.type != 'NIL')].set_index('type')['id']
df.loc[df.id =='NIL', 'id'] = df.loc[df.id == 'NIL', 'type'].map(d)
print(df):
id type
0 26235 fruit
1 26235 veggie
2 32256 NIL
3 24145 pawn
4 12415 NIL
5 26235 fruit
6 26235 veggie
7 24145 pawn