Home > Enterprise >  pandas select dataframe rows according to pair condition
pandas select dataframe rows according to pair condition

Time:08-03

I have a dataframe like following:

df = pd.DataFrame({
    'contract':[11, 11, 11, 12, 12, 13, 13, 13, 13], 
    'num_date':[1, 2, 3, 1, 2, 1, 2, 3, 4],
    'val': [100, 110, 120, 50, 150, 95, 105, 115, 125]
})

Pairs (contracts, num_date) are unique

Also I have the list of tuples:

filt = [(11, 1),
        (11, 1),
        (12, 3),
        (12, 2),
        (13, 3),
        (13, 1)]

Each tuple in list is a pair (contract, num_date) for dataframe above.

I need choose rows from dataframe according to this list as a condition, and if the tuple is repeated several times then the row corresponding to it also must be repeated the same number of times.

For example for dataframe and list of tuples above solution is

contract num_date val
11 1 100
11 1 100
12 2 150
13 3 115
13 1 95

Will be better if the solution will be time-efficient

CodePudding user response:

First you need to convert the tuples into a dataframe:

df2 = pd.DataFrame(filt)
df2.columns = ['contract', 'num_date']

Then merge between the two and drop nulls:

df2 = df2.merge(right=df, on = ['contract', 'num_date'], how='left')
df2.dropna(subset=['val'], inplace=True)
  • Related