I have this dataframe:
    id      result.value.text  result.value.labels   result.id  result.from_id  result.to_id
0  793          skin melanoma           indication  5jSiC_n3IM             NaN           NaN
1  793             proteinase              protein  Lso-iCCHar             NaN           NaN
2  793  plasminogen activator              protein  _17D_kE5zf             NaN           NaN
3  793                    NaN                  NaN         NaN      5jSiC_n3IM    Lso-iCCHar
4  793                    NaN                  NaN         NaN      5jSiC_n3IM    _17D_kE5zf
I want to change the values in the result.from_id and result.to_id columns: instead of holding values from the result.id column, they should hold the corresponding values from the result.value.text column.
Wanted output:
    id      result.value.text  result.value.labels  result.from_id           result.to_id
0  793          skin melanoma           indication             NaN                    NaN
1  793             proteinase              protein             NaN                    NaN
2  793  plasminogen activator              protein             NaN                    NaN
3  793                    NaN                  NaN   skin melanoma             proteinase
4  793                    NaN                  NaN   skin melanoma  plasminogen activator
Can someone help?
CodePudding user response:
Build a dictionary that maps result.id to result.value.text, after dropping the rows where either value is missing, and then map both columns through it:
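# Map each result.id to its text, skipping rows where either value is missing
d = df.dropna(subset=['result.id','result.value.text']).set_index('result.id')['result.value.text'].to_dict()
# Replace the IDs in both link columns with the mapped text
cols = ['result.from_id','result.to_id']
df[cols] = df[cols].apply(lambda x: x.map(d))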
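For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same mapping. The final drop of result.id is my own addition, on the assumption that the column should disappear as in the wanted output; the name out is just illustrative. It also assumes every result.id is unique across the frame, since duplicate IDs would silently keep only the last text in the dictionary.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "id": [793] * 5,
    "result.value.text": ["skin melanoma", "proteinase", "plasminogen activator", np.nan, np.nan],
    "result.value.labels": ["indication", "protein", "protein", np.nan, np.nan],
    "result.id": ["5jSiC_n3IM", "Lso-iCCHar", "_17D_kE5zf", np.nan, np.nan],
    "result.from_id": [np.nan, np.nan, np.nan, "5jSiC_n3IM", "5jSiC_n3IM"],
    "result.to_id": [np.nan, np.nan, np.nan, "Lso-iCCHar", "_17D_kE5zf"],
})

# Lookup table: result.id -> result.value.text (rows missing either are skipped)
d = (
    df.dropna(subset=["result.id", "result.value.text"])
      .set_index("result.id")["result.value.text"]
      .to_dict()
)

# Map both link columns; NaN and any ID not in the lookup come out as NaN
cols = ["result.from_id", "result.to_id"]
df[cols] = df[cols].apply(lambda s: s.map(d))

# Assumption: result.id is no longer needed, as in the wanted output
out = df.drop(columns="result.id")
print(out)
```

Note that Series.map with a dictionary turns any ID missing from the dictionary into NaN, so it may be worth checking for unexpected NaN values in the two columns afterwards.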