Home > Enterprise >  Join/ Merge 2 Comma Separated Text Files
Join/ Merge 2 Comma Separated Text Files

Time:07-17

I would really appreciate assistance in developing Python/ Pandas code to left-join 2 separate CSV files. I'm very new to Python so unclear how to begin. I have used Excel plugins to achieve same but it would normally hang or take hours to complete due to the huge amount of records being processed.

Scenario: joining on the first column.

CSV1

AN  DNM OCD TRI
1   343 5656    90
2   344 5657    91
3   345 5658    92
4   346 5659    93

CSV2

AN2 STATE   PLAN
4   3   19
3   2   35
7   3   19
8   3   19

Result inclusive of a match status if possible:

AN  DNM OCD TRI STATE   PLAN    Join Status
1   343 5656    90          No_match
2   344 5657    91          No_match
3   345 5658    92  2   35  Match
4   346 5659    93  3   19  Match

All help appreciated.

CodePudding user response:

You can use .merge with indicator= parameter:

out = df1.merge(
    df2, left_on="AN", right_on="AN2", indicator="Join Status", how="left"
)
out = out.drop(columns="AN2")
out["Join Status"] = out["Join Status"].map(
    {"left_only": "No_match", "both": "Match"}
)
print(out)

Prints:

   AN  DNM   OCD  TRI  STATE  PLAN Join Status
0   1  343  5656   90    NaN   NaN    No_match
1   2  344  5657   91    NaN   NaN    No_match
2   3  345  5658   92    2.0  35.0       Match
3   4  346  5659   93    3.0  19.0       Match

CodePudding user response:

Let's assume you have df1 and df2 and you want to merge the two dataframe

df = df1.merge(df2, how='left', left_on='AN', right_on='AN2')

I hope this will help you

  • Related