Here is my dataframe:
a b c d
1 X yellow None
1 Y yellow None
1 Z blue None
Here is my incoming data:
data = [
{'a': 1, 'b': "X", 'c': 'red', 'd': True},
{'a': 1, 'b': "Z", 'c': 'purple', 'd': False},
]
I would like to map the data onto the dataframe, updating columns 'c' and 'd' where columns 'a' and 'b' match the data.
The final result would look like this:
a b c d
1 X red True
1 Y yellow None
1 Z purple False
What is the most performant way to do this?
CodePudding user response:
I would probably concatenate your columns of interest into a single "composite key" column and then use that to map. I'm sure there's other fancy ways of doing it but keeping it simple might be best.
CodePudding user response:
You could use combine_first()
df2 = pd.DataFrame(data)
df2.set_index(['a','b']).combine_first(df.set_index(['a','b'])).reset_index()
Output:
a b c d
0 1 X red True
1 1 Y yellow None
2 1 Z purple False
CodePudding user response:
You could do:
pd.DataFrame(data).set_index(['a','b'])\
.combine_first(df.set_index(['a', 'b']))\
.reset_index().reindex_like(df)
a b c d
0 1 X red True
1 1 Y yellow None
2 1 Z purple False