I want to add the result of an apply() call
as new columns. In my current code I have to give the column names in two places. Is there a better way to do this?
#!/usr/bin/env python3
import pandas as pd
df = pd.DataFrame(range(3))
def foo(row):
    return pd.Series([1, 2], index=['A', 'B'])  # first naming
# second naming
df.loc[:, ['A', 'B']] = df.apply(foo, axis=1)
print(df)
# 0 A B
# 0 0 1 2
# 1 1 1 2
# 2 2 1 2
To me it makes sense to specify the names of the new columns via the loc[]
call.
If I did not specify the names via index=['A', 'B']
in the Series() call,
the result would look like this:
0 A B
0 0 NaN NaN
1 1 NaN NaN
2 2 NaN NaN
I understand technically why this happens. But for the sake of readable and maintainable code I would like to avoid the index=['A', 'B']
and find a better way.
CodePudding user response:
As you explicitly want to bypass index alignment, you could assign the underlying NumPy array:
df = pd.DataFrame(range(3))
def foo(row):
    return pd.Series([1, 2])  # no naming
df.loc[:, ['A', 'B']] = df.apply(foo, axis=1).to_numpy()
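A related variant, as a sketch rather than a definitive recipe: if foo returns a plain tuple and apply() is called with result_type='expand', the names only appear once. This assumes assignment through df[[...]] instead of loc[], which, as far as I know, pairs the new columns by position rather than aligning on labels.

```python
import pandas as pd

df = pd.DataFrame(range(3))


def foo(row):
    # Plain tuple: no column labels inside the function.
    return (1, 2)


# result_type='expand' turns each returned tuple into columns 0 and 1.
# Assigning via df[[...]] (an assumption: not loc[]) maps them to 'A' and 'B'
# by position, so the names are given only here.
df[['A', 'B']] = df.apply(foo, axis=1, result_type='expand')
print(df)
```

This should give the same frame as the output shown in the question.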
CodePudding user response:
Just for fun: if you want to keep the naming inside the function (that is, the original foo with index=['A', 'B']), you can try this:
df.join(df.apply(foo, axis=1))
and the result will be
0 A B
0 0 1 2
1 1 1 2
2 2 1 2
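If the names should live outside the function but you still want to use join(), one possible sketch is to relabel the apply() result with set_axis() first. The variable names new_cols and result are only illustrative, and foo here is the unnamed version.

```python
import pandas as pd

df = pd.DataFrame(range(3))


def foo(row):
    # Unnamed Series: apply() will label the resulting columns 0 and 1.
    return pd.Series([1, 2])


# Relabel the columns in one place, outside the function.
new_cols = df.apply(foo, axis=1).set_axis(['A', 'B'], axis=1)

# join() returns a new DataFrame; df itself is left unchanged.
result = df.join(new_cols)
print(result)
```

Without the relabelling step, the join would likely fail here, because both frames would contain a column labelled 0.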