Performing a V-lookup type operation on pandas for two datasets

Time:11-07

I have two dataframes that share two matching columns, A and B, although the key combinations do not occur the same number of times in each. The second dataframe has a third column that I would like to bring into the first dataframe, matching rows where columns A and B are equal.

The dataframes differ in shape and size: DF #1 has many more columns and rows than shown here, so I can't simply copy the column across.

Example:

DF #1

Ethnicity Region
Asian     West
Asian     West
Asian     North
Black     West
Black     West
Black     West
Mixed     South
Mixed     West
Mixed     West
Mixed     South East

DF #2

Ethnicity  Region      Population
Asian      South East  278372
Asian      East        32992
Asian      South       33503
Asian      East        86736
Asian      East        58871
Asian      North       66270
Black      East        117442
Black      East        69925
Black      West        33614
Black      West        13903

So essentially, I would like to do a VLOOKUP-style operation and create a new column in the first dataframe containing the population from the second dataframe.

So far I have done a groupby function which successfully sums the total number of residents per region in the second dataframe, but I am not sure how to move this to the first dataframe.

The reason behind this task is dataframe #1 contains a ton of other information which would benefit from the population figures from the second dataframe.

Any pointers/relevant documentation would be very helpful. Thanks.

CodePudding user response:

You can aggregate the second dataframe and then do a left merge on the two key columns:

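# Sum Population per (Ethnicity, Region), then left-join the totals onto df1
df2 = df2.groupby(['Ethnicity', 'Region'], as_index=False)['Population'].sum()
df1 = df1.merge(df2, on=['Ethnicity', 'Region'], how='left')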
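For anyone who wants to try this end to end, here is a minimal sketch of the aggregate-then-merge approach. The two small frames are made-up samples shaped like the question's data, and the names `keys`, `totals`, and `result` are only illustrative. The `validate="many_to_one"` argument is an optional safety check, not part of the original answer.

```python
import pandas as pd

# Made-up sample data shaped like the question's frames (an assumption).
df1 = pd.DataFrame({
    "Ethnicity": ["Asian", "Asian", "Black", "Mixed"],
    "Region": ["West", "North", "West", "South"],
})
df2 = pd.DataFrame({
    "Ethnicity": ["Asian", "Asian", "Black", "Black"],
    "Region": ["North", "East", "West", "West"],
    "Population": [66270, 32992, 33614, 13903],
})

keys = ["Ethnicity", "Region"]

# Collapse df2 to one row per key pair so the merge cannot duplicate df1 rows.
totals = df2.groupby(keys, as_index=False)["Population"].sum()

# A left merge keeps every row of df1; key pairs missing from totals get NaN.
# validate raises MergeError if totals still has duplicate key pairs.
result = df1.merge(totals, on=keys, how="left", validate="many_to_one")
print(result)
```

Rows of `df1` whose key pair has no match in `df2` (here the Mixed rows and Asian/West) end up with NaN in `Population`, which also turns the column into floats. If that matters, something like `fillna(0)` followed by a cast to an integer type is one option.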