Home > Enterprise >  Join two tables in pandas with some conditions in columns
Join two tables in pandas with some conditions in columns

Time:09-22

Num Algo Distance Result
525 M 25 Good
524 M 28 Good
523 M 30 Good
522 M 75 Good
Num Algo Distance Result
525 T 25 Good
524 T 28 Bad
520 T 98 Good
df_1 = pd.DataFrame({'Num' : [525, 524, 523, 522], 'Algo' : [M, M, M, M], 'Distance' : [25, 28, 30, 75], 'Result' : ['Good', 'Good', 'Good', 'Good']})

df_2 = pd.DataFrame({'Num' : [525, 524, 520], 'Algo' : [T, T, T], 'Distance' : [25, 28, 98], 'Result' : ['Good', 'Bad', 'Good']})

I have two dataframes as above and I want to join/merge them in the way as below (I tried different joins in pandas but it does not work as I want to):

Num Algo Distance Result
525 M, T 25 Good
524 M, T 28 Good
523 M 30 Good
522 M 75 Good
520 T 98 Good

Also, I want to give priority to the df_1['Result'] while joining as it can be seen I used 'Good' for 'Num' = 524.

CodePudding user response:

IIUC, you can use pandas.DataFrame.groupby after concat:

df = pd.concat([df_1, df_2])
new_df = df.groupby(["Num", "Distance"], 
                    as_index=False, 
                    sort=False).agg({"Algo" : ", ".join, 
                                     "Result" : "first"})

Output:

   Num  Distance  Algo Result
0  525        25  M, T   Good
1  524        28  M, T   Good
2  523        30     M   Good
3  522        75     M   Good
4  520        98     T   Good

CodePudding user response:

Try with merge:

output = df_1.merge(df_2, on=["Num", "Distance"], how="outer")

#concat Algo columns from both dfs to a string
output["Algo"] = output["Algo_x"].fillna("").str.cat(output["Algo_y"].fillna(""), sep=", ").str.strip().str.strip(",")

#combine Result column using df_2 data only when df_1 is NaN
output["Result"] = output[["Result_x", "Result_y"]].ffill(axis=0)["Result_x"]
output = output[["Num", "Algo", "Distance", "Result"]]

>>> output
   Num  Algo  Distance Result
0  525  M, T        25   Good
1  524  M, T        28   Good
2  523     M        30   Good
3  522     M        75   Good
4  520     T        98   Good
  • Related