Home > OS >  Python pandas merge/join duplicating data for new/old data input
Python pandas merge/join duplicating data for new/old data input

Time:07-12

Sorry about the title but didn't know the best way to put this :).

So on the first file, I have Old and New data. And I also have another file that has only the old data (it's the IPs configured on the network).

but I can have for the same NUMBER on the old data, multiple equipment's that change the NE_NAME, so I need to find by the NEXT_HOP_IP.

the problem is I only have the NEXT_HOP_IP match for the NE_NAME only for the old data but I need both to have the same NE_NAME.

I am working with pandas and I am trying to understand if using join or merge I can achieve something like this or if I need to use another solution?

Example:

The file with new/old data

NUMBER  route_id        route_type  NEXT_HOP_IP          
14738   14738_site_1    new         10.43.148.221    
14738   14738_site_1    old         10.43.148.217    
79470   79470_site_6    new         10.43.148.101    
79470   79470_site_6    old         10.43.148.241

The file containing old data (duplicate NE_NAME but unique NEXT_HOP_IP)

NE_NAME      NEXT_HOP_IP  
ABC14738_A   10.43.148.217
ABC14738_A   ...
ABC79470_B   10.43.148.241
ABC79470_B   ...
ABC14738_B   xx.xx.xx.xx  
ABC79470_C   xx.xx.xx.xx  

Final result

NUMBER  route_id    route_type  NEXTHOP         Name          
14738   14738_site_1    new     10.43.148.221   ABC14738_A 
14738   14738_site_1    old     10.43.148.217   ABC14738_A 
79470   79470_site_6    new     10.43.148.101   ABC79470_B
79470   79470_site_6    old     10.43.148.241   ABC79470_B

Thanks in advance for any help

CodePudding user response:

Left merge the dataframes then group the merged frame by NUMBER and transform NE_NAME with first to select the first non null value per group

df1['NAME'] = df1.merge(df2, how='left').groupby('NUMBER')['NE_NAME'].transform('first')

   NUMBER      route_id route_type    NEXT_HOP_IP        NAME
0   14738  14738_site_1        new  10.43.148.221  ABC14738_A
1   14738  14738_site_1        old  10.43.148.217  ABC14738_A
2   79470  79470_site_6        new  10.43.148.101  ABC79470_B
3   79470  79470_site_6        old  10.43.148.241  ABC79470_B
  • Related