Home > Software engineering >  Choose values from a dataframe based on values in a second data frame without looping
Choose values from a dataframe based on values in a second data frame without looping

Time:05-30

The following minimal example produces the desired output, but I really want to do this without looping through the dataframes, since in reality I work with much larger dataframes an this takes quite a lot of time.

import pandas as pd

data1 = {'SCH_NUMMER': [2, 9, 8, 5], 'h': [9, 8, 7, 6]}
df1 = pd.DataFrame(data=data1)

data2 = {'HAL_HALTNU': [1, 2, 3, 4], 'HAL_VONSCH': [9, 5, 8, 2]}
df2 = pd.DataFrame(data=data2)

for n in df2['HAL_HALTNU']:
    vonsch = df2.loc[df2['HAL_HALTNU'] == n, 'HAL_VONSCH'].iloc[0]
    h1 = df1.loc[df1['SCH_NUMMER'] == vonsch, 'h'].iloc[0]
    print(f"HAL_HALTNU: {n}, h1 = {h1}")

I have two dataframes, in which two columns, namely df1['SCH_NUMMER'] and df2['HAL_VONSCH'], contain the same values, but not in the same order.
I now want to assign to every number in df1['HAL_HALTNU'] (every row in the dataframe) the corresponding value from df1['h'] in a way that for example the value h = 8, corresponding to df1['SCH_NUMMER'] == 9, is assigned to the row where df2['HAL_VONSCH'] == 9 i.e. to df2['HAL_HALTNU'] == 1. So I need the values from the colums 'SCH_NUMMER' and 'HAL_VONSCH' to 'match' in a way.

Here is the desired output

HAL_HALTNU: 1, h1 = 8
HAL_HALTNU: 2, h1 = 6
HAL_HALTNU: 3, h1 = 7
HAL_HALTNU: 4, h1 = 9

CodePudding user response:

Use merge:

df2['h1'] = df2.merge(df1, left_on='HAL_VONSCH', right_on='SCH_NUMMER', how='left')['h']
print(df2)

# Output
   HAL_HALTNU  HAL_VONSCH  h1
0           1           9   8
1           2           5   6
2           3           8   7
3           4           2   9

Or map:

df2['h1'] = df2['HAL_VONSCH'].map(df1.set_index('SCH_NUMMER')['h'])
print(df2)

# Output
   HAL_HALTNU  HAL_VONSCH  h1
0           1           9   8
1           2           5   6
2           3           8   7
3           4           2   9
  • Related