Conditional assignment to multiple columns in pandas

Time:01-19

Using pandas 1.4.2

I have a DataFrame with 5 columns: A, B, C, D, E.

I need to assign the values from columns D and E to columns A and B in the rows where column C is True. I want to do this in one line using the .loc method.

example

A B C D E
1 4 True 7 10
2 5 False 8 11
3 6 True 9 12

expected result

A B C D E
7 10 True 7 10
2 5 False 8 11
9 12 True 9 12

import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3],
     'B': [4, 5, 6],
     'C': [True, False, True],
     'D': [7, 8, 9],
     'E': [10, 11, 12]}
)

df.loc[df['C'], ['A', 'B']] = df[['D', 'E']]

actual result

A B C D E
nan nan True 7 10
2 5 False 8 11
nan nan True 9 12

workaround I found

df.loc[df['C'], ['A', 'B']] = (df.D[df.C], df.E[df.C])

It seems pandas does not pick up the right values when they are passed as a DataFrame, but it does when they are packed as a tuple of Series. Am I getting the syntax wrong, or is this a bug in pandas?

CodePudding user response:

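This is not a bug. When the right-hand side is a DataFrame, pandas aligns it with the target by index and column labels, and columns D and E have no match in A and B, so the assigned values become NaN. Use boolean indexing on both sides, and bypass the alignment by converting the right-hand side to a NumPy array with to_numpy: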

m = df['C']
df.loc[m, ['A', 'B']] = df.loc[m, ['D', 'E']].to_numpy()

Or change the column names with set_axis:

df.loc[df['C'], ['A', 'B']] = df[['D', 'E']].set_axis(['A', 'B'], axis=1)

Output:

   A   B      C  D   E
0  7  10   True  7  10
1  2   5  False  8  11
2  9  12   True  9  12
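
For anyone who wants to check this end to end, here is a minimal self-contained sketch. It assumes the same toy DataFrame as in the question. The names aligned, fix_numpy and fix_rename are only illustrative, and the reindex call is a rough stand-in for what label alignment does to the right-hand side, not the exact internal code path:

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [True, False, True],
    'D': [7, 8, 9],
    'E': [10, 11, 12],
})
m = df['C']  # boolean mask selecting the rows to overwrite

# Rough illustration of the alignment problem: asking a D/E frame for
# columns A/B finds no matching labels, so every cell comes back NaN.
aligned = df[['D', 'E']].reindex(columns=['A', 'B'])
print(aligned)

# Fix 1: drop the labels so the values are assigned by position.
fix_numpy = df.copy()
fix_numpy.loc[m, ['A', 'B']] = fix_numpy.loc[m, ['D', 'E']].to_numpy()

# Fix 2: relabel D/E as A/B so that alignment finds matching columns.
fix_rename = df.copy()
fix_rename.loc[m, ['A', 'B']] = df[['D', 'E']].set_axis(['A', 'B'], axis=1)

print(fix_numpy)
print(fix_rename)
```

The to_numpy version needs the mask on both sides so the shapes match, while the set_axis version can pass the full frame because alignment on the row index selects the matching rows.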