Home > Software engineering >  Function to filter dataframe with multiple if / and conditions
Function to filter dataframe with multiple if / and conditions

Time:12-13

Build a function that looks at a dataframe and a variable "infolist" and returns a set of clients which are in both the infolist and dataframe, and meet at least one statement:

  1. Any client in infolist with membership fee larger/equal to the current "Offer-Price"

  2. Any client in infolist who "Feedback" is 0.2 (or higher)

Example dataframe:

df = pd.DataFrame({'Client': ['A','B','C'], 'Offer-Price':[90,6591,8000], 'Feedback': [0.1, 0.2,0.1]})

   Client     Offer-Price   Feedback
0      A              90         0.1
1      B            6591         0.2
2      C            8000         0.1

'infolist'/tuple input:

infolist = [('A', 100.0), ('C', 3900.0)]

Expected outcome:

Client C

Code tried so far: Not sure why this isn't working after tons of research.

def testfunction(df, infolist):
    df = np.where((df['Client']== infolist[x[0]]) & (df['Feedback']>= 0.2) | (infolist[x[1]] >= df['Offer-Price']))
    return df

CodePudding user response:

Create a dataframe from infolist then merge it to df and filter out your data:

out = pd.DataFrame(infolist, columns=['Client', 'Offer-Price']).merge(df, on='Client')
mask = (out['Offer-Price_x'] < out['Offer-Price_y']) | (out['Feedback'] >= 0.2)
out = out.loc[mask, 'Client']
print(out)

# Output:
1    C
Name: Client, dtype: object

Update

Ideally I need this to be inside a function, so that if the tuple values change, I can just rerun

def func(df, infolist):
    out = pd.DataFrame(infolist, columns=['Client', 'Offer-Price']).merge(df, on='Client')
    mask = (out['Offer-Price_x'] < out['Offer-Price_y']) | (out['Feedback'] >= 0.2)
    return out.loc[mask, 'Client'].tolist()

clients = func(df, infolist)
print(clients)

# Output:
['C']
  • Related