Home > Enterprise >  How to match based on an range in pandas
How to match based on an range in pandas

Time:11-10

I have the following table for example

              Name   Language   SalaryBegin    SalaryEnd
              Tom      C =         20            30
              Jeff     C--         21            32
              Hideo    JAVA        22             29
              TomoHiro RATCHET     19              20

              Name     Language       SalaryBegin SalaryEnd
              Tom      python        26            27
              jeff      python       22            23
               Hideo     JAVA     23              26

My question is how can you use pandas to match salary begin and salary end based on an interval. For example for Hideo (23,26) is inside (22,29) and also match by language.

I did t

table4=pd.merge(table3,table1,how="inner",on=["Language","SalaryBegin","SalaryEnd"])
#this is does not work

CodePudding user response:

Use df.merge with df.query:

In [284]: df1
Out[284]: 
       Name Language  SalaryBegin  SalaryEnd
0       Tom      C =           20         30
1      Jeff      C--           21         32
2     Hideo     JAVA           22         29
3  TomoHiro  RATCHET           19         20

In [286]: df2
Out[286]: 
    Name Language  SalaryBegin  SalaryEnd
0    Tom   python           26         27
1   jeff   python           22         23
2  Hideo     JAVA           23         26

In [291]: result = df1.merge(df2, on=['Name', 'Language']).query('SalaryBegin_y >= SalaryBegin_x & SalaryEnd_y <= SalaryEnd_x')[['Name', 'Language']]

In [292]: result
Out[292]: 
    Name Language
0  Hideo     JAVA
  • Related