Python - lookup a value from another df multiple times

Time:06-17

I have the following two dataframes:

   prod_id       land_ids
0  1             [1,2]
1  2             [1]
2  3             [2,3,4]
3  4             []
4  5             [3,4]

   land_id       land_desc
0  1             germany
1  2             austria
2  3             switzerland
3  4             italy

Basically, I want each number in the land_ids column to be looked up individually in the other df.

The result should look something like this:

   prod_id       land_ids  list_land
0  1             [1,2]     germany austria
1  2             [1]       germany
2  3             [2,3,4]   austria switzerland italy
3  4             []     
4  5             [3,4]     switzerland italy

Preferably, the list_land column is a single string with the land names concatenated. But I would also be fine with getting a list as a result.

Any idea on how to do this?

Here is my code for creating the df:

import pandas as pd

data_prod = {'prod_id': [1,2,3,4,5], 'land_ids': [[1,2],[1],[2,3,4],[1,3],[3,4]]}
prod_df = pd.DataFrame(data_prod)

data_land = {'land_id': [1,2,3,4], 'land_desc': ['germany', 'austria', 'switzerland', 'italy']}
land_df = pd.DataFrame(data_land)

EDIT: what do I have to add if one value of land_ids is empty?

CodePudding user response:

You can use the apply method:

prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: [land_df.loc[land_df['land_id'] == y]['land_desc'].values[0] for y in x])

In this case, the list_land column is a list. You can use the following code if you want it to be a string.

prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: ' '.join([land_df.loc[land_df['land_id'] == y]['land_desc'].values[0] for y in x]))

CodePudding user response:

df1 = pd.DataFrame({"prod_id":[1,2,3,4,5],"land_ids":[[1,2],[1],[2,3,4],[1,3],[3,4]]})
df2 = pd.DataFrame({"land_id":[1,2,3,4],"land_desc":["germany","austria","switzerland","italy"]})

df2 = df2.set_index('land_id', drop=True)
df1['list_land'] = df1['land_ids'].apply(lambda x: [df2.at[ids, 'land_desc'] for ids in x])

If you want list_land as a string, then you can do it like this:

df1['list_land'] = df1['land_ids'].apply(lambda x: " ".join([df2.at[ids, 'land_desc'] for ids in x]))
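
Regarding the EDIT in the question: with this pattern an empty list should need no special handling, because the comprehension produces an empty list and " ".join of an empty list is an empty string. Below is a minimal sketch of the same idea using a plain dict as the lookup. It assumes the column names from the question; the helper name describe_lands is hypothetical, and silently skipping ids that are missing from the lookup is an assumption, not something the question specifies.

```python
import pandas as pd

prod_df = pd.DataFrame({
    "prod_id": [1, 2, 3, 4, 5],
    "land_ids": [[1, 2], [1], [2, 3, 4], [], [3, 4]],  # prod_id 4 has no lands
})
land_df = pd.DataFrame({
    "land_id": [1, 2, 3, 4],
    "land_desc": ["germany", "austria", "switzerland", "italy"],
})

# Build the lookup once: land_id -> land_desc
land_lookup = dict(zip(land_df["land_id"], land_df["land_desc"]))

def describe_lands(ids):
    """Hypothetical helper: join the descriptions of all known ids.

    Ids that are not in the lookup are skipped (an assumption).
    An empty list yields an empty string.
    """
    return " ".join(land_lookup[i] for i in ids if i in land_lookup)

prod_df["list_land"] = prod_df["land_ids"].apply(describe_lands)
print(prod_df)
```

If a list is preferred over a string, the helper could return [land_lookup[i] for i in ids if i in land_lookup] instead.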

CodePudding user response:

Maybe something like this:

import pandas as pd

df1 = pd.DataFrame({"prod_id":[1,2,3,4,5],"land_ids":[[1,2],[1],[2,3,4],[1,3],[3,4]]})
df2 = pd.DataFrame({"land_id":[1,2,3,4],"land_desc":["germany","austria","switzerland","italy"]})

list_land = []

# For every product row, collect the description of each of its land ids
# by scanning df2 for the matching land_id.
for index, row in df1.iterrows():
    list_land.append([row2.land_desc for land_id in row["land_ids"] for _, row2 in df2.iterrows() if row2.land_id == land_id])
df1["list_land"] = list_land
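
A loop-free variant of the same lookup can be sketched with explode, map and groupby. This is only one possible approach, not the method of any answer above. It assumes the description column is named land_desc as in the question, and the intermediate names exploded and joined are purely illustrative. Rows with an empty list are dropped after exploding and filled back in as empty strings at the end.

```python
import pandas as pd

df1 = pd.DataFrame({
    "prod_id": [1, 2, 3, 4, 5],
    "land_ids": [[1, 2], [1], [2, 3, 4], [], [3, 4]],  # includes an empty list
})
df2 = pd.DataFrame({
    "land_id": [1, 2, 3, 4],
    "land_desc": ["germany", "austria", "switzerland", "italy"],
})

# One row per (prod_id, land_id); empty lists become NaN and are dropped here
exploded = df1[["prod_id", "land_ids"]].explode("land_ids").dropna(subset=["land_ids"])

# Translate each id into its description via a Series indexed by land_id
lookup = df2.set_index("land_id")["land_desc"]
exploded = exploded.assign(land_desc=exploded["land_ids"].astype("int64").map(lookup))

# Collapse back to one string per product
joined = exploded.groupby("prod_id")["land_desc"].agg(" ".join)

# Products without any land get an empty string
df1["list_land"] = df1["prod_id"].map(joined).fillna("")
print(df1)
```

For small frames the apply-based answers are probably just as convenient; the variant above mainly avoids scanning df2 once per id.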