Home > Net >  Concat multiple dataframes by keeping matches from a specific column
Concat multiple dataframes by keeping matches from a specific column

Time:01-26

    df1       df2         df3         df4         df5
    Stock TF  Stock   TF  Stock   TF  Stock   TF  Stock   TF
    AKENR  D  ARMDA   3D  AGESA    W  AKBNK   2W  ANELE    M
    AKSEN  D  BMSTL   3D  AKGRT    W  AKENR   2W  ANGEN    M
    ANSGR  D  BRMEN   3D  AVGYO    W  AKSUE   2W  BANVT    M
    AVGYO  D  BTCIM   3D  AZTEK    W  ALBRK   2W  BIOEN    M
     AYEN  D   FADE   3D  BNTAS    W  BIMAS   2W  DENGE    M
    BANVT  D  ISFIN   3D  CEOEM    W  BRSAN   2W  DIRIT    M
    CONSE  D   KGYO   3D  DNISI    W   CASA   2W  DOGUB    M
    CRFSA  D  KLGYO   3D  DOKTA    W  CRFSA   2W  DYOBY    M
    DITAS  D  OZRDN   3D   FADE    W  DITAS   2W  IHGZT    M
     DOAS  D              IHLGM    W   ERCB   2W  ISGYO    M
     EKIZ  D              KARTN    W  FONET   2W   KENT    M
    ELITE  D               KGYO    W  FROTO   2W  KLNMA    M
    ETILR  D              KLGYO    W  GARAN   2W  KNFRT    M
    GENTS  D              KNFRT    W  HUNER   2W  KUVVA    M
    IHAAS  D              KZBGY    W  IHEVA   2W  MARKA    M
    INVEO  D              MERIT    W  INVEO   2W  NUGYO    M
    ISKPL  D              METRO    W  KERVN   2W  OYLUM    M
     KENT  D              MRGYO    W  KIMMR   2W  OZRDN    M
    KERVN  D              PCILT    W  KLRHO   2W  PRKME    M
    KFEIN  D              PENTA    W  KMPUR   2W  PRZMA    M
     KGYO  D              SUMAS    W  KONTR   2W  SELGD    M
    KIMMR  D                          KRONT   2W  SONME    M
    KRONT  D                           ORGE   2W  TSPOR    M
    KUVVA  D                          OZRDN   2W 
    MEDTR  D                          SNKRN   2W 
    MERKO  D                          TMPOL   2W
    MIPAZ  D                          VAKBN   2W
    NUHCM  D                          YKBNK   2W
    ORCAY  D
    OZRDN  D 
    PAGYO  D
    PAPIL  D
    PARSN  D
    PSDTC  D
    RAYSG  D
    SANKO  D
    SELGD  D
    SNGYO  D
     SOKM  D
    SUMAS  D
    SUWEN  D
    TCELL  D
    TETMT  D
    TEZOL  D
    TUCLK  D
     USAK  D
    UZERB  D
    YEOTK  D
    YYAPI  D
    ZOREN  D

I have 5 different dataframes with same column numbers(2) and names(Stock and TF). TFs are different for each df. I want to compare these dfs with eachother and merge them into a single df by exracting matching Stocks and TFs. For instance AKENR is both under df1 and df4.

I have tried to chain merge them:

dfx = df1.merge(df2, left_on='Stock', right_on='Stock', how='outer').merge(df3, on='Stock',how='left').merge(df4, on='Stock',how='left').merge(df5, on='Stock',how='left')

Also:

from functools import reduce
dfs = [df1, df2, df3, df4, df5]
dfx = reduce(lambda  left,right: pd.merge(left,right,on=['Stock'],how='inner'), dfs)

I'm expecting something like this:

enter image description here

Thanks.

CodePudding user response:

IIUC, you can try:

from functools import reduce

list_of_dfs = [df1, df2, df3, df4, df5]

df_out = reduce(lambda x, y: x.merge(y, on='Stock', how='outer'), list_of_dfs).set_index('Stock')
df_out[df_out.notna().sum(axis=1) > 1].sort_index().fillna('')

Output:

      TF_x TF_y TF_x TF_y TF
Stock                       
AKENR    D             2W   
AVGYO    D         W        
BANVT    D                 M
CRFSA    D             2W   
DITAS    D             2W   
FADE         3D    W        
INVEO    D        2W        
KENT     D              M   
KERVN    D        2W        
KGYO     D    W             
KGYO     D   3D             
KIMMR    D        2W        
KNFRT         W         M   
KRONT    D   2W             
KUVVA    D              M   
OZRDN    D   3D         M   
OZRDN    D   2W         M   
SELGD    D              M   
SUMAS    D    W     
  • Related