Home > Blockchain >  Look up data from other table with apply
Look up data from other table with apply

Time:10-14

Let's say I have a table that looks like this:

table = pd.DataFrame(
          columns=["Name", "Size", "Color"], 
          data=[['A', 1, 'Red'], ['B', 2, 'Green'], ['C', 3, 'Blue']]
        )

And a lookup table that looks like this:

lookup = pd.DataFrame(
           columns=["Color", "Source", "Lead Days"], 
           data=[["Red", "Europe", 2], 
                 ["Green", "Europe", 3], 
                 ["Blue", "US", 1], 
                 ["Yellow", "Europe", 2]]
         )

How might I add columns "Source" and "Lead Days" to table by looking up "Color" from lookup?

A story sometimes helps.

table has all the items I need to order.

lookup has where I order them from and how long it takes.

I want to transform table so that it can show the "Source" and "Lead Days" for each item I need to order.

Here's what the final table should look like:

Note: while I'm sure there's a way to do this with merge, or top level table operations. In the spirit of Minimally Sufficient Pandas, and to avoid the huge kluge that is pandas' over-provisioning of operations, I'd prefer to do it with apply. Apply is nice because it's easy to consistently reach for apply in all situations.

Here's my current approach, but it results in the error ValueError: Columns must be same length as key

To me, this makes little sense, since I'm returning a list of length 2 and putting it into two columns. But I'm sure pandas has its reasons for being anti-intuitive here.

lookup_columns = ["Source", "Lead Days"]
table[lookup_columns] = table.apply(
    lambda row: 
          lookup.query('`Color` == "{color}"'.format(color=row["Color"])).loc[:, lookup_columns].values[0]
      , axis = 1)

CodePudding user response:

Use result_type="expand":

lookup_columns = ["Source", "Lead Days"]
table[lookup_columns] = table.apply(
    lambda row:
    lookup.query('`Color` == "{color}"'.format(color=row["Color"])).loc[:, lookup_columns].values[0]
    , axis=1, result_type="expand")

print(table)

  Name  Size  Color  Source  Lead Days
0    A     1    Red  Europe          2
1    B     2  Green  Europe          3
2    C     3   Blue      US          1

From the documentation (emphasis mine):

result_type {‘expand’, ‘reduce’, ‘broadcast’, None}, default None
These only act when axis=1 (columns):
‘expand’ : list-like results will be turned into columns.

CodePudding user response:

With apply, you can do:

>>> pd.concat([table, table['Color'].apply(lambda x: lookup.loc[lookup['Color'] == x, ['Source', 'Lead Days']].squeeze())], axis=1)

  Name  Size  Color  Source  Lead Days
0    A     1    Red  Europe          2
1    B     2  Green  Europe          3
2    C     3   Blue      US          1

Old answer

Use pd.merge:

>>> pd.merge(table, lookup, how='left', on='Color')

  Name  Size  Color  Source  Lead Days
0    A     1    Red  Europe          2
1    B     2  Green  Europe          3
2    C     3   Blue      US          1

CodePudding user response:

I wouldnt' use apply. It is computationally expensive. For your case, join will suffice.

table.set_index('Color').join(lookup.set_index('Color')).reset_index()



    Color Name  Size  Source  Lead Days
0    Red    A     1  Europe          2
1  Green    B     2  Europe          3
2   Blue    C     3      US          1
  • Related