I want to do an inner join (merge) with df1
:
mandant fk_lart_nr fk_ust_sch
7 55 43
00 55 43
and df2
:
mandant ust_sch ust_bez
0 59 Gutschrift 19 %
18 3 Rechnung 0 %
I want to use two join conditions. The first one is always fixed.
first join condition is fixed df1.fk_ust_sch = df2.ust_sch
However, the second join condition should be used depending on what is the value of df1.mandant.
- if df1.mandant is either 7 or 18 ( .isin([7,18]) ), the second join condition should be this:
df1.mandant = df2.mandant
- If df1.mandant is not 7 or 18, the second join condition should be this using df2:
df2.mandant == '00'
df_merged = df1.merge(df2, left_on=['fk_ust_sch'],
right_on=['ust_sch'],
how='inner', suffixes=('_df1', '_df2'))
CodePudding user response:
You can replace values of df2.mandant
to 00
if not match condition .isin([7,18])
an merge by both columns:
df_merged = (df1.merge(df2.assign(mandant = df2.mandant.where(df2.mandant.isin([7,18]), '00')),
left_on=['fk_ust_sch','mandant'],
right_on=['ust_sch','mandant'],
how='inner', suffixes=('_df1', '_df2')))
CodePudding user response:
Since you cannot specify join condition during .merge()
, what you could do is to filter the merge result with your required conditions, as follows:
Case 1: If your mandant
fields are of integer type:
# Reuse your codes for merge
df_merged = df1.merge(df2, left_on='fk_ust_sch',
right_on='ust_sch',
how='inner', suffixes=('_df1', '_df2'))
# Filter
m1 = df_merged.mandant_df1.isin([7,18]) & (df_merged.mandant_df1 == df_merged.mandant_df2)
m2 = ~df_merged.mandant_df1.isin([7,18]) & (df_merged.mandant_df2 == 0) ## use 0 instead of '00'
df_merged_filtered = df_merged.loc[m1 | m2]
Case 2: If your mandant
fields are of string type:
# Reuse your codes for merge
df_merged = df1.merge(df2, left_on='fk_ust_sch',
right_on='ust_sch',
how='inner', suffixes=('_df1', '_df2'))
# Filter
### Put 7, 18 in quotes as '7', '18'
m1 = df_merged.mandant_df1.isin(['7','18']) & (df_merged.mandant_df1 == df_merged.mandant_df2)
m2 = ~df_merged.mandant_df1.isin(['7','18']) & (df_merged.mandant_df2 == '00')
df_merged_filtered = df_merged.loc[m1 | m2]
Demo
Enriched your sample data in order to include all required test cases:
(Demo using case 1 with your mandant
fields as integer type)
Input Data
df1
mandant fk_lart_nr fk_ust_sch
0 7 55 43
1 0 55 43
2 10 55 43
df2
mandant ust_sch ust_bez
0 0 59 Gutschrift 19 %
1 6 43 Gutschrift 21 %
2 7 43 Gutschrift 30 %
3 0 43 Gutschrift 50 %
4 18 3 Rechnung 0 %
Result:
print(df_merged_filtered)
mandant_df1 fk_lart_nr fk_ust_sch mandant_df2 ust_sch ust_bez
1 7 55 43 7 43 Gutschrift 30 %
5 0 55 43 0 43 Gutschrift 50 %
8 10 55 43 0 43 Gutschrift 50 %