Home > database >  Compare two dataframes and return common values
Compare two dataframes and return common values

Time:03-14

I have 2 data frame , Need to fetch system_type column based on "Name" column of both dataframe .

I have 500000 rows of df1 as format

Name  Timestamp       usage  
AXCS   2022-01-01     5  
BGXD   2022-02-01     70  
HFSD   2022-03-01     45  
AEVC   2022-01-01     25  
BHRF   2022-02-01     12

and 550000 rows of df2 as

Name  System_Type  
HFSD      Dev  
BHRF      Test  
BGXD      Prod  
AEVC      Prod  
AXCS      Test

I used the following coding

pd.merge(df1, df2, on="Name")

It is taking much time to process , Is there another way/method to process it . Please advise

CodePudding user response:

You can do like this :

import pandas as pd

df1 = pd.DataFrame({
    'System_Name':['AXCS','BGXD','HFSD','AEVC', 'BHRF'],
    'Timestamp':['2022-01-01','2022-02-01','2022-03-01','2022-01-01', '2022-01-01'],
    'usuage ':[5,70,45,25,12],
    })

df2 = pd.DataFrame({
    'System_Name':['HFSD','BHRF','BGXD','AEVC', 'AXCS'],
    'System_Type':['Dev','Test','Prod','Prod', 'Test'],
    })

# Get all diferent values
df3 = pd.merge(df1, df2, how='outer', indicator='Exist')
df3 = df3.loc[df3['Exist'] == 'both']
# If you like to filter by a System_Name
df3  = pd.merge(df1, df2, on="System_Name", how='outer', indicator='Exist')
df3  = df3.loc[df3['Exist'] == 'both']
print(df3)

#output :

  System_Name   Timestamp  usuage  System_Type Exist
0        AXCS  2022-01-01        5        Test  both
1        BGXD  2022-02-01       70        Prod  both
2        HFSD  2022-03-01       45         Dev  both
3        AEVC  2022-01-01       25        Prod  both
4        BHRF  2022-01-01       12        Test  both

CodePudding user response:

You can use df2 as a dict mapping:

df1['System_Type'] = df1['Name'].map(df2.set_index('Name')['System_Type'])
print(df1)

# Output
   Name   Timestamp  usage System_Type
0  AXCS  2022-01-01      5        Test
1  BGXD  2022-02-01     70        Prod
2  HFSD  2022-03-01     45         Dev
3  AEVC  2022-01-01     25        Prod
4  BHRF  2022-02-01     12        Test
  • Related