Using IF, ELSE conditions to iterate through a pandas dataframe, then obtaining and using data from

Time:10-06

Here is my dataframe:

import pandas as pd

df = pd.DataFrame({'First Name': ['George', 'Alex', 'Leo'],
                   'Surname': ['Davis', 'Mulan', 'Carlitos'],
                   'Age': [10, 15, 20],
                   'Size': [30, 40, 50]})

Output:

  First Name   Surname  Age  Size
0     George     Davis   10    30
1       Alex     Mulan   15    40
2        Leo  Carlitos   20    50

And here is a function:

def myfunc(firstname, surname):
    print(firstname + ' ' + surname)

Now, I would like to iterate through the dataframe and check for the following conditions:

  • IF df['Age'] > 11 AND df['Size'] < 51

If there is a match (rows 2 and 3), I would like to call 'myfunc' and pass in the data from the matching rows as its arguments.

'myfunc()' would be called as:

myfunc(df['First Name'], df['Surname'])

So in this example the output after running the code would be:

'Alex Mulan'
'Leo Carlitos'

(The IF, AND condition was true in the second and third rows.)

Please explain how I could achieve this goal and provide a working code snippet.

I would prefer a solution where no additional column is created. (If the solution remains practical. Otherwise a new column can be created and added to the dataframe if needed.)

CodePudding user response:

Use .loc to filter your dataframe and apply your function. Use a lambda function as a proxy to call your function with the right signature.

def myfunc(firstname, surname):
    return firstname + ' ' + surname

out = df.loc[df['Age'].gt(11) & df['Size'].lt(51), ['First Name', 'Surname']] \
        .apply(lambda x: myfunc(*x), axis=1)

Output:

>>> out
1      Alex Mulan
2    Leo Carlitos
dtype: object

>>> type(out)
pandas.core.series.Series
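
If the function only needs to be called once per matching row, a plain loop over the filtered columns may be easier to read than apply. The following is a minimal sketch under the question's setup, not the answerer's method; call_for_matches and mask are hypothetical names introduced here for illustration, and myfunc is assumed to return the string as in this answer.

```python
import pandas as pd

df = pd.DataFrame({'First Name': ['George', 'Alex', 'Leo'],
                   'Surname': ['Davis', 'Mulan', 'Carlitos'],
                   'Age': [10, 15, 20],
                   'Size': [30, 40, 50]})

def myfunc(firstname, surname):
    return firstname + ' ' + surname

def call_for_matches(frame, func):
    # Hypothetical helper: build the boolean mask once, then call func per matching row.
    mask = (frame['Age'] > 11) & (frame['Size'] < 51)
    matched = frame.loc[mask, ['First Name', 'Surname']]
    # zip walks the two columns in parallel, so no extra column is created.
    return [func(first, last)
            for first, last in zip(matched['First Name'], matched['Surname'])]

results = call_for_matches(df, myfunc)
print(results)
```

The original dataframe is left unchanged, which matches the request to avoid adding a column.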

CodePudding user response:

Try with agg:

df.loc[(df.Age>11) & (df.Size<51),['First Name','Surname']].agg(' '.join,1)
Out[124]: 
1      Alex Mulan
2    Leo Carlitos
dtype: object

CodePudding user response:

First select the required records using indexing, then concatenate the names:

selected = df[(df['Age'] > 11) & (df['Size'] < 51)]
print(selected['First Name'] + " " + selected['Surname'])

Edit: to pass each row to a generic function and ensure the right columns are passed, you can write a helper like this:

def apply(df, func, kwargs):
    # kwargs maps dataframe column names to the keyword argument names of func
    return df[list(kwargs)].rename(columns=kwargs) \
        .apply(lambda row: func(**row), axis=1)

print(apply(df=selected,
    func=myfunc,
    kwargs={"First Name": "firstname", "Surname": "surname"}))

There is often a more efficient solution that passes whole columns to a function instead of applying it row by row.
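
As one possible illustration of the column-wise approach, the sketch below passes two whole Series to a function that joins them with Series.str.cat. The function name full_name is hypothetical and introduced only for this example; it assumes both columns hold strings.

```python
import pandas as pd

df = pd.DataFrame({'First Name': ['George', 'Alex', 'Leo'],
                   'Surname': ['Davis', 'Mulan', 'Carlitos'],
                   'Age': [10, 15, 20],
                   'Size': [30, 40, 50]})

# Filter first, as in the answer above.
selected = df[(df['Age'] > 11) & (df['Size'] < 51)]

def full_name(first, last):
    # Hypothetical column-wise function: first and last are whole Series,
    # so the concatenation runs once rather than once per row.
    return first.str.cat(last, sep=' ')

names = full_name(selected['First Name'], selected['Surname'])
print(names.tolist())
```

The result is a Series that keeps the index of the filtered rows, so it can still be matched back to the original dataframe if needed.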
