Home > front end >  Need to Map two dataframe in Pandas -Python
Need to Map two dataframe in Pandas -Python

Time:05-10

I have two dataframes, and I need to map different columns in both the dataframe. sharing both dataframe.

Dataframe:-

   IN.Type     STRA     STRB   STRC
   CRBD        2487     XR     XL0054
   DFRS        3754     MY     XL0684
   CRBD        7356     DF     XL8911
   DFRS        4487     DF     XL58999
   DFRS        7785     MY     XL76568
   CRBD        8235     GL     XL0635
   DFRS        2468     PQ     XL4569
   DFRS        9735     GR     XL7589
   CRBD        6486     TY     XL5566 
   DFRS        1023     PQ     XL27952

Dataframe2 :-

        REF1     REF2 
        2468     PQ
        3754     MY
        1023     PQ
        9735     NA
        2487     XR
        7785     MY
        7356     DF
        8235     NA
        6486     TY
        4487     NA

What Needed :-

  1. Firstly need to search Dataframe2["REF2"] in Dataframe["STRB"] and if available then pick the corresponding data available in Dataframe["IN.Type"] and Dataframe["STRC"] for output dataframe.
  2. Second if Dataframe2["REF2"] == "NA" then need to search Dataframe2["REF1"] in Dataframe["STRA"] and if available then pick the corresponding data available in Dataframe["IN.Type"] and Dataframe["STRC"] for output dataframe.
  3. Third if Dataframe2["REF2"] not in Dataframe["STRB"] then directly put "NONE" in output dataframe "Type" column.

Output dataframe (Sample) :-

        Type     RTFN
        CRBD     XL0054
        DFRS     XL76568
        NONE     XL0635
  1. Data fetched from Dataframe["IN.Type"] will go in Output["Type"].
  2. Data fetched from Dataframe["STRC"] will go in Output["RFTN"].

CodePudding user response:

Use Series.isin for filter both DataFrames and set values in DataFrame.loc:

m1 =  Dataframe["STRB"].isin(Dataframe2["REF2"])
m2 = Dataframe["STRA"].isin(Dataframe2.loc[Dataframe2["REF2"].isna(), "REF1"])
#if NA s string
#m2 = Dataframe["STRA"].isin(Dataframe2.loc[Dataframe2["REF2"].eq('NA'), "REF1"])
df = pd.concat([Dataframe.loc[(m1 | m2), ['IN.Type','STRC']].drop_duplicates('IN.Type'), 
                Dataframe.loc[~m1, ['IN.Type','STRC']].assign(**{'IN.Type':None})])
df.columns=['Type','RTFN']
print (df)
   Type    RTFN
0  CRBD  XL0054
1  DFRS  XL0684
5  None  XL0635
7  None  XL7589
  • Related