Home > Enterprise >  Pandas DataFrame Cross-Reference/Selective Join/Overlay?
Pandas DataFrame Cross-Reference/Selective Join/Overlay?

Time:02-18

What is the best/fastest way to do a selective lookup/cross-reference/overlay/partial join between two Pandas DataFrames? I'm not sure of the right terminology to use....

Given:

  1. A primary table filled with numerical values and some arbitrary lookup/reference strings, indexed by date/month
  2. A secondary table, with a consistent index to the first. Columns of this dataframe match with the arbitrary lookup/reference strings in the primary table which can change over time, and serve as the lookup column to replace values in the primary table.

The indices of the two tables don't necessarily need to be exactly the same set (or even contiguous), but if a lookup is present in Table 1 it obviously needs a corresponding row in Table 2.

i.e. I'd like to find every lookup/placeholder in Table 1 and replace with the correct lookup value in Table 2 from the correct month.

Example below:

Table 1

dte value1 value2
2022-01 1 2
2022-02 3 LOOKUP1
2022-03 LOOKUP3 4
2022-04 5 6

Table 2

dte LOOKUP1 LOOKUP2 LOOKUP3
2021-12 101 105 109
2022-02 102 106 110
2022-03 103 107 111
2022-10 104 108 112

Result

dte value1 value2
2022-01 1 2
2022-02 3 102
2022-03 111 4
2022-04 5 6

I need to do a pile of these in my use case, so was hoping for something at least reasonably fast. I'm not an expert with Pandas, wondering if there is a smart way to do this. I could loop through every row/column in Table 1, look for values that match column names in Table 2 and then do a lookup in Table 2 replacing value in Table 1. There seems to often be a smart way to do this kind of thing in Pandas where it automatically paralellizes for you. Any help would be appreciated!

CodePudding user response:

One easy way is using replace , not sure about the speed

out = tb1.set_index('dte').T.replace(tb2.set_index('dte').T).T
Out[172]: 
        value1 value2
dte                  
2022-01      1      2
2022-02      3    102
2022-03    111      4
2022-04      5      6
  • Related