If I have a function that sets a variable based on some columns in a dataframe, I can use conditional logic to create the variable val. I can then create a new column by applying this function to the dataframe df with the apply function in pandas.
My question is: can I use the conditional statement in the function to set two variables (val, val2) and then create two new columns, 'fruit' and 'number', using a single apply call?
I could easily write two functions, setting val in one and val2 in the other, and then use two separate apply calls to create the two columns. But I was wondering whether I can do it with one function that sets both variables and one apply call, so I don't have to duplicate the code.
Please let me know if this doesn't make sense.
Cheers
import pandas as pd

list_of_classes = [
    ['cit', 'lemon'],
    ['cit', 'lemon'],
    ['cit', 'lemon'],
    ['cit', 'lemon'],
    ['watermelon', 'lemon'],
    ['watermelon', 'lemon'],
    ['lime', 'water'],
    ['lime', 'water'],
]
df = pd.DataFrame(list_of_classes, columns=['class', 'subclass'])
print(df)

def create_variables(row):
    # Pick a fruit name (val) and a number (val2) from the row's class/subclass.
    if row['class'] == 'Lime':
        val = 'Lime'
        val2 = 1
    elif row['subclass'] == 'lemon':
        val = 'lemon'
        val2 = 1.1
    elif row['subclass'] == 'orange':
        val = 'orange'
        val2 = 1.2
    elif row['class'] == 'Apple':
        val = 'Apple'
        val2 = 2
    else:
        val = 'exception'
        val2 = 'exception'
    return val, val2

# Create new columns 'fruit' and 'number' in dataframe 'df'
df[['fruit', 'number']] = df.apply(create_variables, axis=1)
The output I'm after with the two new columns would look something like below:
CodePudding user response:
Refer to the result_type parameter of the apply function. Specifically, you want to use "expand" here:
‘expand’ : list-like results will be turned into columns.
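As a rough sketch of what that looks like in practice, the snippet below passes result_type="expand" to apply so that each returned tuple is split across the two new columns. The dataframe and the function are cut-down stand-ins for the ones in the question (and the 'lime' check is lowercase here so that one row matches), so treat the details as assumptions rather than the exact original code.

```python
import pandas as pd

# Small stand-in for the question's dataframe (assumed sample data).
df = pd.DataFrame(
    [["cit", "lemon"], ["watermelon", "lemon"], ["lime", "water"]],
    columns=["class", "subclass"],
)

def create_variables(row):
    # Simplified version of the question's function: returns a (val, val2) tuple.
    if row["class"] == "lime":
        return "lime", 1
    elif row["subclass"] == "lemon":
        return "lemon", 1.1
    return "exception", "exception"

# result_type="expand" turns each returned tuple into separate columns,
# which are then assigned positionally to 'fruit' and 'number'.
df[["fruit", "number"]] = df.apply(create_variables, axis=1, result_type="expand")
print(df)
```

One function and one apply call set both columns, which is what the question asks for.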
Edit: Also, the description of the behavior of the default None implies that if you modify your function to return a pd.Series then it will expand it into columns, e.g.:
return pd.Series([val,val2])
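For completeness, here is a minimal sketch of the pd.Series variant with the apply call left at its defaults. Again the dataframe and the shortened function are assumed stand-ins for the question's code, not the original.

```python
import pandas as pd

# Small stand-in for the question's dataframe (assumed sample data).
df = pd.DataFrame(
    [["cit", "lemon"], ["lime", "water"]],
    columns=["class", "subclass"],
)

def create_variables(row):
    # Shortened version of the question's conditional logic.
    if row["subclass"] == "lemon":
        val, val2 = "lemon", 1.1
    else:
        val, val2 = "exception", "exception"
    # Returning a Series lets apply build one column per element.
    return pd.Series([val, val2])

# No result_type needed: apply returns a two-column DataFrame here.
df[["fruit", "number"]] = df.apply(create_variables, axis=1)
print(df)
```

Building a Series for every row adds some overhead, so on large dataframes the result_type="expand" form may be the cheaper of the two.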