I'm looking for the correct way to copy the label into the column that matches labItemsNameRef in my dataframe, but I can't get my code to work. Is there a solution?
MY DATAFRAME
labItemsNameRef label
0 FBS decrease
1 FBS decrease
2 FBS increase
3 HbA1c decrease
4 Creatinine changeless
... ... ...
123901 FBS decrease
123902 HbA1c increase
123903 Micro Creatinine changeless
123904 DTX ก่อนอาหาร increase
123905 Urine Creatinine changeless
df = df.assign(
FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS'),
HbA1c = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'HbA1c'),
DTX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'DTX'),
BUN = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'BUN'),
Creatinine = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'Creatinine'))
But I get this error:
FBX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBX'),
^
SyntaxError: invalid syntax
EXPECTED OUTPUT
labItemsNameRef label FBS HbA1c Creatinine BUN DTX
0 FBS decrease decrease NaN NaN NaN NaN
1 FBS decrease decrease NaN NaN NaN NaN
2 FBS increase increase NaN NaN NaN NaN
3 HbA1c decrease NaN decrease NaN NaN NaN
4 Creatinine changeless NaN NaN changeless NaN NaN
... ... ... ... ... ... ... ...
123901 FBS decrease decrease NaN NaN NaN NaN
123902 HbA1c increase NaN increase NaN NaN NaN
123903 Micro Creatinine changeless NaN NaN NaN NaN NaN
123904 DTX ก่อนอาหาร increase NaN NaN NaN NaN NaN
123905 Urine Creatinine changeless NaN NaN NaN NaN NaN
CodePudding user response:
Use get_dummies to build an indicator column for each value of labItemsNameRef, then fill in the values of label with numpy.where:
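import numpy as np
import pandas as pd
# m is a boolean mask with one column per distinct labItemsNameRef value
m = pd.get_dummies(df['labItemsNameRef'], dtype=bool)
df[m.columns] = np.where(m, df[['label']], np.nan)
print(df)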
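Your solution is slow because apply loops over the rows, but it works if you add an else branch (a conditional expression without else is what raises the SyntaxError) and pass axis=1 so that apply runs row by row: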
df = df.assign(FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS' else np.nan, axis=1))
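If only the five columns from the expected output are wanted, a vectorized alternative to apply might look like the sketch below. It assumes Series.where is acceptable in place of the row-wise lambda; the sample rows and the name wanted are hypothetical and only there to make the example runnable.

```python
import pandas as pd

# Hypothetical sample rows, modeled on the frame in the question.
df = pd.DataFrame({
    "labItemsNameRef": ["FBS", "FBS", "HbA1c", "Creatinine", "Micro Creatinine"],
    "label": ["decrease", "increase", "decrease", "changeless", "changeless"],
})

# Column names taken from the expected output in the question.
wanted = ["FBS", "HbA1c", "Creatinine", "BUN", "DTX"]

for name in wanted:
    # Series.where keeps the label where the condition holds and puts NaN elsewhere.
    df[name] = df["label"].where(df["labItemsNameRef"] == name)

print(df)
```

Names that never occur in labItemsNameRef (BUN and DTX in this sample) still get a column, filled entirely with NaN, and rows such as Micro Creatinine stay NaN in every new column because the comparison is an exact match.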