Home > Software design >  How to do a lookup based on table's row value in Python?
How to do a lookup based on table's row value in Python?

Time:07-16

Say I have two tables: Table A:

   state  value
0     A     100

Table B:

   state  100  200
0      A    1    4
1      B    2    5
2      C    3    6

I want to create a new field for Table A called "Factor" that returns the respective value from Table B:

   state  value  factor
0     A     100       1

How would I do this in Python/Pandas?

In Excel, I would do: INDEX('Table B'!B2:C4, MATCH('Table A'!A2, 'Table B'!A:A, 0), MATCH('Table A'!B2, 'Table B'!B1:C1, 0))

CodePudding user response:

You should try to use loc for this. Please try the following example:

Assuming Table A (dfA) and Table B (dfB) are both pandas dataframes:

>>> A = {
    'state' : ['A'],
    'value' : [100],
}

>>> B = {
    'state' : ['A','B','C'],
    '100' : [1,2,3],
    '200' : [4,5,6]
}

>>> dfA = pd.DataFrame(A)
>>> dfB = pd.DataFrame(B)

which gives you your tables as pandas dataframes

>>> dfA
state   value
0   A   100

>>> dfB
state   100 200
0   A   1   4
1   B   2   5
2   C   3   6

Then we search a specific column in the dataframe for a specific value and return the value of a different column from that same row.

extracted_value = list(dfB.loc[dfB['state'] == 'A', '100'])[0]

We create a column and set that value for that column dfA['factor'] = extracted_value

and we now have that extracted value in the appropriate column

>>> dfA
state   value   factor
0   A   100      1

Now I'm sure you want to do this in a loop to address a particular list of values and I hope this helps you start. Additionally, using dfA['factor'] = extracted_value is going to set every value in that column to extracted_value and you might want to have different values per row. To accomplish that you're going to want to use .loc again to add values based on the row index.

CodePudding user response:

  1. Pivot dfA from wide to long format using melt. Ensure the column names get converted to a numeric type.
  2. merge the longform data to dfA
melted = dfB.melt(id_vars=['state'], var_name='value', value_name='factor')
melted['value'] = melted['value'].astype(int)
dfA = dfA.merge(melted, on=['state', 'value'])

Result:

  state  value  factor
0     A    100       1

This maybe feels like overkill for this example, but could be helpful for larger lookups.

  • Related