Hi, I need to update specific columns in a DataFrame based on a dictionary. My initial DataFrame is this:
Date | Var_1 | Var_2 | Var_3 | Var_4 |
---|---|---|---|---|
01/01/2022 | 100 | Yes | Yes | 104 |
02/01/2022 | 100 | Yes | Yes | 104 |
03/01/2022 | 100 | Yes | Yes | 104 |
04/01/2022 | 100 | Yes | Yes | 104 |
05/01/2022 | 100 | Yes | No | 104 |
06/01/2022 | 100 | Yes | No | 104 |
07/01/2022 | 100 | Yes | No | 104 |
08/01/2022 | 100 | No | Yes | 104 |
and my nested dictionary is this (I need to update the DataFrame based on it):
my_dict = {
"01/01/2022" : { "Var_2": "Yes","Var_3": "No"},
"02/01/2022" : { "Var_2": "Yes","Var_3": "No"},
"03/01/2022" : { "Var_2": "Yes","Var_3": "Yes"},
"05/01/2022" : { "Var_2": "No", "Var_3": "Yes"},
"06/01/2022" : { "Var_2": "No", "Var_3": "Yes"}
}
My desired output would be
Date | Var_1 | Var_2 | Var_3 | Var_4 |
---|---|---|---|---|
01/01/2022 | 100 | Yes | No | 104 |
02/01/2022 | 100 | Yes | No | 104 |
03/01/2022 | 100 | Yes | Yes | 104 |
04/01/2022 | 100 | Yes | Yes | 104 |
05/01/2022 | 100 | No | Yes | 104 |
06/01/2022 | 100 | No | Yes | 104 |
07/01/2022 | 100 | Yes | No | 104 |
08/01/2022 | 100 | No | Yes | 104 |
I tried .replace(my_dict), but it didn't work.
CodePudding user response:
One option is to convert my_dict to a DataFrame and update df with it:
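import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])  # parse dates so both indexes share a dtype
df = df.set_index('Date')  # align rows by date
tmp = pd.DataFrame.from_dict(my_dict, orient='index')  # dates -> index, Var_* -> columns
tmp.index = pd.to_datetime(tmp.index)
df.update(tmp)  # in place: overwrite df cells with the non-NA values of tmp
df = df.reset_index()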
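As a self-contained check of the update approach, the sketch below rebuilds the question's data and keeps Date as plain strings instead of converting it with pd.to_datetime. That is an assumption on my part: it only works because the keys of my_dict match the Date values character for character. The sample frame is reconstructed from the table in the question.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question.
df = pd.DataFrame({
    "Date": [f"{day:02d}/01/2022" for day in range(1, 9)],
    "Var_1": [100] * 8,
    "Var_2": ["Yes"] * 7 + ["No"],
    "Var_3": ["Yes"] * 4 + ["No"] * 3 + ["Yes"],
    "Var_4": [104] * 8,
})

my_dict = {
    "01/01/2022": {"Var_2": "Yes", "Var_3": "No"},
    "02/01/2022": {"Var_2": "Yes", "Var_3": "No"},
    "03/01/2022": {"Var_2": "Yes", "Var_3": "Yes"},
    "05/01/2022": {"Var_2": "No", "Var_3": "Yes"},
    "06/01/2022": {"Var_2": "No", "Var_3": "Yes"},
}

# Outer keys (dates) become the index, inner keys become columns.
tmp = pd.DataFrame.from_dict(my_dict, orient="index")

# Assumption: Date stays a string column, so it matches tmp's index as-is.
df = df.set_index("Date")
df.update(tmp)  # modifies df in place and returns None
df = df.reset_index()

print(df)
```

Rows whose date is missing from my_dict (04/01, 07/01 and 08/01 here) are left untouched, since update only overwrites cells where tmp has a non-NA value.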
or use combine_first:
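df['Date'] = pd.to_datetime(df['Date'])  # no-op if Date is already datetime
tmp = pd.DataFrame.from_dict(my_dict, orient='index')
tmp.index = pd.to_datetime(tmp.index)
df = tmp.combine_first(df.set_index('Date')).reset_index().rename(columns={'index':'Date'})  # tmp values win; gaps are filled from df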
Output:
Date Var_1 Var_2 Var_3 Var_4
0 01/01/2022 100 Yes No 104
1 02/01/2022 100 Yes No 104
2 03/01/2022 100 Yes Yes 104
3 04/01/2022 100 Yes Yes 104
4 05/01/2022 100 No Yes 104
5 06/01/2022 100 No Yes 104
6 07/01/2022 100 Yes No 104
7 08/01/2022 100 No Yes 104
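As for why .replace(my_dict) appeared to do nothing: as far as I understand it, DataFrame.replace reads a nested dictionary as {column: {old_value: new_value}}, so the outer keys are looked up as column names rather than as values of Date. Since no column is called "01/01/2022", there is nothing to replace. A minimal sketch on a cut-down, made-up two-row frame (not the full data from the question):

```python
import pandas as pd

# Cut-down frame for illustration only.
df = pd.DataFrame({
    "Date": ["01/01/2022", "02/01/2022"],
    "Var_2": ["Yes", "Yes"],
    "Var_3": ["Yes", "Yes"],
})
my_dict = {"01/01/2022": {"Var_2": "Yes", "Var_3": "No"}}

# The outer key is treated as a column name; no such column exists,
# so the result is identical to the input.
unchanged = df.replace(my_dict)
print(unchanged.equals(df))

# The nested form replace does understand: {column: {old_value: new_value}}.
replaced = df.replace({"Var_3": {"Yes": "No"}})
print(replaced["Var_3"].tolist())
```

That is why a row-keyed dictionary like my_dict has to be turned into a DataFrame indexed by date before it can be lined up against df.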