Filter pandas dataframe inside of function and return filtered dataframe

Time:10-21

I have a class holding online and offline data, and I want a method that filters the data down to a given set of existing columns. The following code works:

import pandas as pd

class Exp:
    def __init__(self):
        d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]}
        d2 = {'col1': [7, 8], 'col2': [9, 10], 'col3': [11, 12]}

        self.online = pd.DataFrame(d)
        self.offline = pd.DataFrame(d2)

    def filter_cols(self, typ, cols):
        # filter() returns a new DataFrame, so the result is assigned back
        if typ == "on":
            self.online = self.online.filter(items=cols)
        if typ == "off":
            self.offline = self.offline.filter(items=cols)


print("before")
a = Exp()
print(a.online)
a.filter_cols("on", cols = ['col1', 'col2'])
print("after")
print(a.online)

Output:

before
   col1  col2  col3
0     1     3     5
1     2     4     6
after
   col1  col2
0     1     3
1     2     4

But I want to make my function more generic, so that I don't have to spell out the if/else branches inside it. Basically, I want something like the following to work:

class Exp:
.
.
.

    def filter_cols(self, data, cols):

        data.filter(items= cols)
        


print("before")
a = Exp()
print(a.online)
a.filter_cols(a.online, cols = ['col1', 'col2'])
print("after")
print(a.online)

But a.online is unchanged after the call:

before
   col1  col2  col3
0     1     3     5
1     2     4     6
after
   col1  col2  col3
0     1     3     5
1     2     4     6

Answer:

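Use setattr and getattr to look up and reassign the attribute by name, so that typ = "on" maps to self.online and typ = "off" maps to self.offline: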

    def filter_cols(self, typ, cols):
        setattr(self, typ + 'line', getattr(self, typ + 'line').filter(items=cols))
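
As background on why the second attempt in the question leaves a.online untouched: DataFrame.filter does not modify the frame it is called on, it returns a new DataFrame, and the question's filter_cols discards that return value. A minimal sketch of this behaviour (the names df and filtered are illustrative and not from the question):

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})

# filter() builds and returns a new DataFrame; df itself is left as it was.
filtered = df.filter(items=['col1', 'col2'])

print(list(df.columns))        # the original still has all three columns
print(list(filtered.columns))  # only the requested columns
print(filtered is df)          # the two are distinct objects
```

This is why the setattr call above stores the result back on the instance instead of only calling filter.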

Full code:

import pandas as pd

class Exp:
    def __init__(self):
        d = {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]}
        d2 = {'col1': [7, 8], 'col2': [9, 10], 'col3': [11, 12]}

        self.online = pd.DataFrame(d)
        self.offline = pd.DataFrame(d2)

    def filter_cols(self, typ, cols):
        # typ is "on" or "off"; typ + 'line' gives the attribute name
        setattr(self, typ + 'line', getattr(self, typ + 'line').filter(items=cols))


print("before")
a = Exp()
print(a.online)
a.filter_cols("on", cols=['col1', 'col2'])
print("after")
print(a.online)

Output:

before
   col1  col2  col3
0     1     3     5
1     2     4     6
after
   col1  col2
0     1     3
1     2     4
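
If you would rather keep the signature from the question, where the DataFrame itself is passed in, one possible alternative is to return the filtered frame and let the caller reassign it, as the title of the question suggests. The sketch below also shows a hypothetical filter_cols_inplace helper (not part of the original code) that mutates the passed frame by dropping the unwanted columns with drop(..., inplace=True). Treat both as illustrations under those assumptions, not as the one correct approach:

```python
import pandas as pd


class Exp:
    def __init__(self):
        self.online = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})
        self.offline = pd.DataFrame({'col1': [7, 8], 'col2': [9, 10], 'col3': [11, 12]})

    def filter_cols(self, data, cols):
        # Return the filtered copy; the caller decides where to store it.
        return data.filter(items=cols)

    def filter_cols_inplace(self, data, cols):
        # Hypothetical variant: drop every column not listed in cols,
        # modifying the DataFrame object that was passed in.
        unwanted = [c for c in data.columns if c not in cols]
        data.drop(columns=unwanted, inplace=True)


a = Exp()

# Variant 1: reassign the returned frame.
a.online = a.filter_cols(a.online, cols=['col1', 'col2'])

# Variant 2: mutate the frame in place, no reassignment needed.
a.filter_cols_inplace(a.offline, cols=['col1', 'col3'])

print(a.online)
print(a.offline)
```

The returning variant keeps the method free of side effects, while the in-place variant matches the call style a.filter_cols(a.online, ...) from the question without any reassignment.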