How can I use conditional logic to set two variables in a Python function and then use the function

If I have a function that sets a variable based on some columns in a dataframe, I can use conditional syntax to create the variable val. I can then create a new column by applying this function to the dataframe df with the pandas apply function.

My question is: can I use the conditional statement in the function to set two variables (val, val2) and then create two new columns, 'fruit' and 'number', in pandas using a single apply call?

I could easily write two functions, one setting val and the other setting val2, and call apply separately to create each column. But I was wondering whether I can do it with one function that sets both variables and one apply call, so I don't have to duplicate the code.

Please let me know if this doesn't make sense.

Cheers

import pandas as pd

list_of_classes =[
['cit','lemon'],
['cit','lemon'],
['cit','lemon'],
['cit','lemon'], 
['watermelon','lemon'],
['watermelon','lemon'],
['lime','water'],    
['lime','water']
]

df = pd.DataFrame(list_of_classes,columns = ['class','subclass'])

print(df)

def create_variables(row):
    if row['class'] == 'lime':  # the data stores this value in lowercase
        val = 'Lime'
        val2 = 1
        
    elif row['subclass'] == 'lemon':
        val = 'lemon'
        val2 = 1.1
        
    elif row['subclass'] == 'orange':
        val = 'orange'
        val2 = 1.2
        
    elif row['class'] == 'Apple':
        val = 'Apple'
        val2 = 2
               
    else:
        val = 'exception'
        val2 = 'exception'
        
    return val, val2

# create new columns 'fruit' and 'number' in dataframe df

df[['fruit', 'number']] = df.apply(create_variables, axis=1)

The output I'm after with the two new columns would look something like below:

CodePudding user response:

Refer to the result_type parameter of the apply function. Specifically, you want to use "expand" here:

‘expand’ : list-like results will be turned into columns.
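A minimal sketch of that, using a trimmed-down version of the function and data from the question. It assumes the column is named 'subclass' (as in the DataFrame constructor) and that the values are lowercase; the shortened data and branches are only for illustration:

```python
import pandas as pd

# Shortened illustrative data with the same column names as the question.
df = pd.DataFrame(
    [["cit", "lemon"], ["lime", "water"], ["cit", "water"]],
    columns=["class", "subclass"],
)

def create_variables(row):
    # Each branch sets both values; they are returned together as a tuple.
    if row["class"] == "lime":
        val, val2 = "lime", 1
    elif row["subclass"] == "lemon":
        val, val2 = "lemon", 1.1
    else:
        val, val2 = "exception", "exception"
    return val, val2

# result_type="expand" turns each returned tuple into separate columns,
# so the result can be assigned to two new column names at once.
df[["fruit", "number"]] = df.apply(create_variables, axis=1, result_type="expand")
print(df)
```

Without result_type="expand", apply returns a single Series of tuples, which is why the two-column assignment in the question does not work as written.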

Edit: The documented behavior of the default, result_type=None, also implies that if you modify your function to return a pd.Series, apply will expand it into columns, e.g.:

return pd.Series([val,val2])
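Put together, the pd.Series variant might look like the sketch below. The data is shortened for illustration, and giving the Series an index of 'fruit' and 'number' is my own addition so that the resulting columns are already named:

```python
import pandas as pd

# Shortened illustrative data with the same column names as the question.
df = pd.DataFrame(
    [["cit", "lemon"], ["lime", "water"]],
    columns=["class", "subclass"],
)

def create_variables(row):
    if row["class"] == "lime":
        val, val2 = "lime", 1
    elif row["subclass"] == "lemon":
        val, val2 = "lemon", 1.1
    else:
        val, val2 = "exception", "exception"
    # Returning a Series makes apply build a DataFrame, one column per entry.
    return pd.Series([val, val2], index=["fruit", "number"])

# No result_type needed: the default already expands a returned Series.
df[["fruit", "number"]] = df.apply(create_variables, axis=1)
print(df)
```

Building a Series per row is usually slower than returning a tuple with result_type="expand", so for large dataframes the tuple version may be the better choice.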