Home > Mobile >  change join condition depending on row value
change join condition depending on row value

Time:10-08

I want to do an inner join (merge) with df1:

mandant fk_lart_nr  fk_ust_sch
7   55  43
00  55  43

and df2:

mandant ust_sch ust_bez
0   59  Gutschrift 19 %                                                                           
18  3   Rechnung 0 %                                                                              

I want to use two join conditions. The first one is always fixed.

first join condition is fixed df1.fk_ust_sch = df2.ust_sch

However, the second join condition should be used depending on what is the value of df1.mandant.

  1. if df1.mandant is either 7 or 18 ( .isin([7,18]) ), the second join condition should be this:

df1.mandant = df2.mandant

  1. If df1.mandant is not 7 or 18, the second join condition should be this using df2: df2.mandant == '00'
df_merged = df1.merge(df2, left_on=['fk_ust_sch'],
                                        right_on=['ust_sch'],
                                        how='inner', suffixes=('_df1', '_df2'))

CodePudding user response:

You can replace values of df2.mandant to 00 if not match condition .isin([7,18]) an merge by both columns:

df_merged = (df1.merge(df2.assign(mandant = df2.mandant.where(df2.mandant.isin([7,18]), '00')),
                       left_on=['fk_ust_sch','mandant'],
                       right_on=['ust_sch','mandant'],
                       how='inner', suffixes=('_df1', '_df2')))

CodePudding user response:

Since you cannot specify join condition during .merge(), what you could do is to filter the merge result with your required conditions, as follows:

Case 1: If your mandant fields are of integer type:

# Reuse your codes for merge
df_merged = df1.merge(df2, left_on='fk_ust_sch',
                           right_on='ust_sch',
                           how='inner', suffixes=('_df1', '_df2'))

# Filter
m1 = df_merged.mandant_df1.isin([7,18]) & (df_merged.mandant_df1 == df_merged.mandant_df2)
m2 = ~df_merged.mandant_df1.isin([7,18]) & (df_merged.mandant_df2 == 0)  ## use 0 instead of '00'

df_merged_filtered = df_merged.loc[m1 | m2]

Case 2: If your mandant fields are of string type:

# Reuse your codes for merge
df_merged = df1.merge(df2, left_on='fk_ust_sch',
                           right_on='ust_sch',
                           how='inner', suffixes=('_df1', '_df2'))

# Filter
### Put 7, 18 in quotes as '7', '18'
m1 = df_merged.mandant_df1.isin(['7','18']) & (df_merged.mandant_df1 == df_merged.mandant_df2)
m2 = ~df_merged.mandant_df1.isin(['7','18']) & (df_merged.mandant_df2 == '00')

df_merged_filtered = df_merged.loc[m1 | m2]

Demo

Enriched your sample data in order to include all required test cases:

(Demo using case 1 with your mandant fields as integer type)

Input Data

df1

   mandant  fk_lart_nr  fk_ust_sch
0        7          55          43
1        0          55          43
2       10          55          43


df2

   mandant  ust_sch          ust_bez
0        0       59  Gutschrift 19 %
1        6       43  Gutschrift 21 %
2        7       43  Gutschrift 30 %
3        0       43  Gutschrift 50 %
4       18        3     Rechnung 0 %

Result:

print(df_merged_filtered)

   mandant_df1  fk_lart_nr  fk_ust_sch  mandant_df2  ust_sch          ust_bez
1            7          55          43            7       43  Gutschrift 30 %
5            0          55          43            0       43  Gutschrift 50 %
8           10          55          43            0       43  Gutschrift 50 %
  • Related