Let's say we have a dataframe df:
  column1  column2  column3
0      A1        1        2
1      A2        2        3
2      A3        3        4
I'd like to write a function that applies one or more filters. However, I don't know in advance how many columns I'll be filtering on: depending on the dataset, it could be two columns or just one. For instance, sometimes I want to keep the rows where column2 is greater than or equal to 2, i.e., df[df.column2 >= 2]. At other times, I want to apply two filters:
keep the rows where column2 is greater than or equal to 2, i.e., df[df.column2 >= 2]
AND
keep the rows where column3 is greater than or equal to 4, i.e., df[df.column3 >= 4].
How does one capture all this? I think the col parameter should have an asterisk, and there should be a num parameter that specifies the number to filter by. However, I don't know how to pass an inequality as a parameter.
def select_filters(*col, num):
    return df.col
CodePudding user response:
There are two ways to approach this kind of problem. The first is to pass a list as a parameter, like this:
def select_filters(col, num):
    for i in range(num):
        # Do whatever your filtering is here
        pass
    return col[num]

select_filters([df.column1, df.column2, ..], 2)
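To make the list approach concrete for the dataframe in the question, here is a minimal sketch. It assumes every filter is a "greater than or equal to" comparison, and it takes the dataframe as an explicit argument instead of relying on a global df. The parameter names cols and nums are hypothetical.

```python
import pandas as pd

def select_filters(df, cols, nums):
    """Keep rows where each column in cols is >= the matching value in nums."""
    # Start with an all-True mask and narrow it down one condition at a time.
    mask = pd.Series(True, index=df.index)
    for col, num in zip(cols, nums):
        mask &= df[col] >= num
    return df[mask]

# The example dataframe from the question.
df = pd.DataFrame({
    "column1": ["A1", "A2", "A3"],
    "column2": [1, 2, 3],
    "column3": [2, 3, 4],
})

one = select_filters(df, ["column2"], [2])                   # a single filter
two = select_filters(df, ["column2", "column3"], [2, 4])     # two filters, ANDed
```

Because the conditions are combined with &, passing one column or several goes through the same code path.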
Another option for a variable number of parameters would look like this:
def select_filters(num, *col):
    # Do whatever your filtering is here
    pass
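However, col should come AFTER num if you want to pass num positionally. With *col first, as in your version, num becomes a keyword-only argument and has to be passed as num=.... I am unsure about your exact filtering, but this should (hopefully) be enough to solve your issue, if I understood it correctly.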
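As for passing the inequality itself as a parameter, one possibility is to pass a comparison function from Python's standard operator module (operator.ge is >=, operator.lt is <, and so on). The sketch below is only an illustration under that assumption: the (column, comparison, value) tuple layout and the name select_filters_by_op are hypothetical, not something from the question.

```python
import operator
import pandas as pd

def select_filters_by_op(df, *conditions):
    """Apply any number of (column, comparison, value) filters, combined with AND."""
    mask = pd.Series(True, index=df.index)
    for col, op, value in conditions:
        # op is a function such as operator.ge, so op(a, b) means a >= b.
        mask &= op(df[col], value)
    return df[mask]

# The example dataframe from the question.
df = pd.DataFrame({
    "column1": ["A1", "A2", "A3"],
    "column2": [1, 2, 3],
    "column3": [2, 3, 4],
})

result = select_filters_by_op(
    df,
    ("column2", operator.ge, 2),
    ("column3", operator.ge, 4),
)
```

With no conditions at all, the mask stays all True and the function returns the dataframe unfiltered.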