Add new columns via apply but specify column names only once

Time:02-17

I want to add the result of an apply() as new columns. In my current code I have to give the names in two places. Is there a better way to do this?

#!/usr/bin/env python3
import pandas as pd

df = pd.DataFrame(range(3))

def foo(row):
    return pd.Series([1, 2], index=['A', 'B'])  # first naming

# second naming
df.loc[:, ['A', 'B']] = df.apply(foo, axis=1)

print(df)
#    0  A  B
# 0  0  1  2
# 1  1  1  2
# 2  2  1  2

To me it makes sense to specify the names of the new columns via the loc[] call.

If I did not specify the names via index=['A', 'B'] in the Series(), the result would look like this:

   0   A   B
0  0 NaN NaN
1  1 NaN NaN
2  2 NaN NaN

Technically I understand why this happens: the assignment aligns on labels, and the unnamed result has no columns called A or B. But for the sake of easily readable and maintainable code, I would like to avoid the index=['A', 'B'] and find a better way.

CodePudding user response:

As you explicitly want to avoid index alignment, you could use the underlying NumPy array:

df = pd.DataFrame(range(3))

def foo(row):
    return pd.Series([1, 2])  # no naming

df.loc[:, ['A', 'B']] = df.apply(foo, axis=1).to_numpy()
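
A variation on the same idea, as a sketch rather than a definitive recipe: if foo only needs to return plain values, result_type='expand' lets apply() build the columns itself, and the names are then written exactly once, at the assignment. Returning a list from foo and assigning through df[[...]] are assumptions of this sketch, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(range(3))

def foo(row):
    # Return plain values; no labels are attached here.
    return [1, 2]

# result_type='expand' turns each returned list into columns 0 and 1.
expanded = df.apply(foo, axis=1, result_type='expand')

# The only place where the names appear. to_numpy() drops the labels,
# so the values are assigned by position instead of being aligned.
df[['A', 'B']] = expanded.to_numpy()

print(df)
```

The order of the names in the assignment has to match the order of the values returned by foo, since nothing is matched by label any more.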

CodePudding user response:

Just for fun: if you want to keep the naming inside the function (foo as in the question, with index=['A', 'B']), you can try this:

df = df.join(df.apply(foo, axis=1))  # join() returns a new DataFrame

and the result will be

   0  A  B
0  0  1  2
1  1  1  2
2  2  1  2
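
An equivalent way to write this, shown here only as a hedged sketch, is pd.concat along the columns. As with join(), the names live only inside foo; the variable name new_cols is just for illustration.

```python
import pandas as pd

df = pd.DataFrame(range(3))

def foo(row):
    # The only place where the column names are given.
    return pd.Series([1, 2], index=['A', 'B'])

# apply() builds a DataFrame whose columns are the Series index labels.
new_cols = df.apply(foo, axis=1)

# Glue the new columns onto the original frame; rows are matched on the index.
df = pd.concat([df, new_cols], axis=1)

print(df)
```

Both join() and concat() match rows on the index, so they rely on the result of apply() keeping the index of df, which it does here.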