I have two dataframes that both contain columns A and B, but a given combination of values does not occur the same number of times in each. The second dataframe also has a third column, which I would like to bring into the first dataframe, matching rows where columns A and B are equal.
The dataframes are not the same shape or size: DF #1 has many more columns and rows than shown here, so I can't simply copy the column across.
Example:
DF #1
Ethnicity  Region
Asian      West
Asian      West
Asian      North
Black      West
Black      West
Black      West
Mixed      South
Mixed      West
Mixed      West
Mixed      South East
DF #2
Ethnicity  Region      Population
Asian      South East  278372
Asian      East        32992
Asian      South       33503
Asian      East        86736
Asian      East        58871
Asian      North       66270
Black      East        117442
Black      East        69925
Black      West        33614
Black      West        13903
So essentially, I would like to do a VLOOKUP-style operation: create a new column in the first dataframe that holds the population from the second dataframe.
So far I have done a groupby function which successfully sums the total number of residents per region in the second dataframe, but I am not sure how to move this to the first dataframe.
The reason for this task is that dataframe #1 contains a lot of other information that would benefit from the population figures in the second dataframe.
Any pointers/relevant documentation would be very helpful. Thanks.
CodePudding user response:
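You can aggregate df2 so that each Ethnicity/Region pair appears once, then left-merge the result onto df1:
df2 = df2.groupby(['Ethnicity', 'Region']).sum().reset_index()
df1 = df1.merge(df2, on=['Ethnicity', 'Region'], how='left')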
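For anyone who wants to try this end to end, here is a minimal sketch of the same groupby-then-merge idea. The sample rows are a small subset modelled on the question's data, and the names `totals` and `merged` are just illustrative; `validate="many_to_one"` is an optional safety check I've added, not part of the answer above.

```python
import pandas as pd

# Small sample modelled on the question (a subset of rows, for illustration only).
df1 = pd.DataFrame({
    "Ethnicity": ["Asian", "Asian", "Asian", "Black", "Mixed"],
    "Region": ["West", "West", "North", "West", "South"],
})
df2 = pd.DataFrame({
    "Ethnicity": ["Asian", "Asian", "Asian", "Black", "Black"],
    "Region": ["East", "East", "North", "West", "West"],
    "Population": [32992, 86736, 66270, 33614, 13903],
})

# Collapse df2 to one row per (Ethnicity, Region) pair, summing Population.
totals = df2.groupby(["Ethnicity", "Region"], as_index=False)["Population"].sum()

# Left join: every row of df1 is kept, and pairs missing from df2 get NaN.
# validate="many_to_one" raises if totals still has duplicate key pairs.
merged = df1.merge(
    totals,
    on=["Ethnicity", "Region"],
    how="left",
    validate="many_to_one",
)
print(merged)
```

Because the join is a left join on an aggregated table, df1 keeps its row count and order. Pairs that never appear in df2 (such as Asian/West here) come back as NaN, which also turns the Population column into floats.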