I've got a function which returns two values like:
def myfunc(x):
    return a, b
I want to assign a and b to two columns in my dataset. Currently I use code like:
df.loc[:,'col1'] = df['col0'].apply(lambda x: myfunc(x)[0])
df.loc[:,'col2'] = df['col0'].apply(lambda x: myfunc(x)[1])
This is clearly inefficient because it calls myfunc twice for every row, but I don't know how to assign values to two columns in a single statement in this situation. Can anyone help, please?
CodePudding user response:
You can convert the function output to a Series:
import pandas as pd

def myfunc(x):
    # dummy example
    return x + 1, x**2

df = pd.DataFrame({'col': [1, 2, 3, 4]})
df[['col2', 'col3']] = df['col'].apply(lambda x: pd.Series(myfunc(x)))
NB: this kind of row-wise operation is slow; if possible, find a way to vectorize your function instead.
output:
   col  col2  col3
0    1     2     1
1    2     3     4
2    3     4     9
3    4     5    16
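Regarding the note about vectorizing: for the dummy function used here, the same two columns can be computed with plain column arithmetic, with no per-row Python call at all. This is only a sketch for this particular example; whether a real myfunc can be rewritten this way depends on what it does.

```python
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3, 4]})

# Column-wise arithmetic replaces the per-row call to myfunc
# (valid only because the dummy myfunc is x + 1 and x**2)
df['col2'] = df['col'] + 1
df['col3'] = df['col'] ** 2
```

This produces the same table as the output shown above.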
Alternatively, the following variant of the apply approach might be slightly more efficient, as the DataFrame constructor is called only once (vs. one Series per row above):
df[['col2', 'col3']] = pd.DataFrame(zip(*df['col'].apply(myfunc))).T
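One caveat with the zip/transpose version: the intermediate DataFrame gets a default 0..n-1 index, and assigning a DataFrame aligns on index labels, so it is only safe when df itself has a default index. A minimal sketch of a variant that reuses df's own index is below; the string index ['a', 'b', 'c', 'd'] is an assumption made up for illustration and is not part of the question.

```python
import pandas as pd

def myfunc(x):
    # dummy example
    return x + 1, x**2

# Hypothetical non-default index, used only to show the alignment issue
df = pd.DataFrame({'col': [1, 2, 3, 4]}, index=['a', 'b', 'c', 'd'])

# apply() yields a Series of tuples; tolist() turns it into a list of rows,
# and passing index=df.index keeps the new frame aligned with df
result = pd.DataFrame(df['col'].apply(myfunc).tolist(),
                      index=df.index, columns=['col2', 'col3'])
df[['col2', 'col3']] = result
```

myfunc is still called once per row here, so this is no substitute for vectorizing; it only avoids calling the function twice and building one Series per row.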