How to change values in one column based on whether conditions in columns A and B are met in Pandas

Time:09-09

I have a data frame. I want to change values in column C to null values based on whether conditions in columns A and B are met. To do this, I think I need to iterate over the rows of the dataframe, but I can't figure out how:

df = {'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]}


I tried something like this:

for row in df.iterrows()
  if df['A'] > 2 and df['B'] == 3:
      df['C'] == np.nan

but I just keep getting errors. Could someone please show me how to do this?

CodePudding user response:

Yours is not a DataFrame, it's a dictionary. This is a DataFrame:

import pandas as pd
df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})

Vectorized pandas/NumPy operations are usually faster than regular Python loops:

import numpy as np
df.loc[(df['A'].values > 2) & (df['B'].values == 3), 'C'] = np.nan
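To see what the boolean-mask assignment does, here is a minimal sketch that builds the mask in a separate step. The name `mask` and the explicit cast to float are my additions, not part of the answer; the cast just makes the dtype change visible, since an integer column cannot hold NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})

# Element-wise conditions; & combines them row by row (the parentheses are required).
mask = (df['A'] > 2) & (df['B'] == 3)
print(mask.tolist())  # [False, False, False, True]

# NaN is a float, so make column C float before assigning (illustrative choice).
df['C'] = df['C'].astype(float)

# Set C to NaN only in the rows where the mask is True.
df.loc[mask, 'C'] = np.nan
print(df['C'].tolist())  # [0.0, 0.0, 5.0, nan]
```

Only the last row has A > 2 and B == 3, so only that row's C value is replaced.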

Or if you insist on your way of coding, the code (besides converting df to a real DataFrame) can be updated to:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})
for i, row in df.iterrows():
    if row.loc['A'] > 2 and row.loc['B'] == 3:
        df.loc[i, 'C'] = np.nan

or

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})
for i in df.index:
    if df.loc[i, 'A'] > 2 and df.loc[i, 'B'] == 3:
        df.loc[i, 'C'] = np.nan
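As for why the attempt in the question keeps failing: apart from the missing colon after the `for` line and `==` (comparison) being used where `=` (assignment) was meant, the `if` compares whole columns rather than single values. A small sketch of that last problem, where the `raised` flag is only there for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})

try:
    # `and` needs a single True/False, but each side here is a whole Series,
    # so pandas refuses to reduce it to one boolean.
    if df['A'] > 2 and df['B'] == 3:
        pass
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```

Inside a row-by-row loop the comparisons work on scalars, which is why the corrected loops above use `and`; on whole columns, use `&` with parentheses instead.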

CodePudding user response:

You can try

df.loc[(df["A"].values > 2) & (df["B"].values==3), "C"] = None

Using pandas and numpy is way easier for you :D
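An alternative that neither answer mentions, sketched here only as an option, is `Series.mask`, which replaces values with NaN wherever a condition is True. The name `cond` is my own.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 1, 4], 'B': [9, 2, 5, 3], 'C': [0, 0, 5, 3]})

# True for rows where both conditions hold.
cond = (df['A'] > 2) & (df['B'] == 3)

# mask() returns a new Series with NaN where cond is True; assign it back to C.
df['C'] = df['C'].mask(cond)
print(df['C'].isna().tolist())  # [False, False, False, True]
```

This returns a new column instead of modifying in place, which can be convenient when chaining operations.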
