Change values in a row of pandas df based on a condition

Time:03-22

I have a df like the one below, and I need to cap the values in one of its rows: any value in that row that is > 20 should be changed to 20.

      col1 col2  col3  col4  col5  col6
row1  str1  str  15.3  25.6   3.5   4.5
row2  str2  str  25.8  55.4   4.5    55
row3  str3  str  35.3  45.8   7.5    65
row4  str4  str  45.2  55.7   6.5   7.5

For columns, I tried the following, which worked:

df.col3 = np.where(df.col3 > 20, 20, df.col3)

For rows, however, I tried several approaches with loc and mask, but none of them worked. Taking row3 as an example, the expected output is:

      col1 col2  col3  col4  col5  col6
row1  str1  str  15.3  25.6   3.5   4.5
row2  str2  str  25.8  55.4   4.5    55
row3  str3  str  20.0  20.0   7.5    20
row4  str4  str  45.2  55.7   6.5   7.5

CodePudding user response:

IIUC, this should work. You can use to_numeric to pull out the numeric values (with errors='coerce' the strings become NaN), clip to cap them at 20, and fillna to put the string values back:

df.loc['row3'] = pd.to_numeric(df.loc['row3'], errors='coerce').clip(upper=20).fillna(df.loc['row3'])

Output:

      col1 col2  col3  col4  col5  col6
row1  str1  str  15.3  25.6   3.5   4.5
row2  str2  str  25.8  55.4   4.5  55.0
row3  str3  str  20.0  20.0   7.5  20.0
row4  str4  str  45.2  55.7   6.5   7.5
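
For anyone who wants to run this end to end, here is a minimal sketch of the same steps split out one per line. The DataFrame construction is my own reconstruction of the sample in the question (the post does not show how df was built), so the float dtypes are an assumption.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame(
    {
        "col1": ["str1", "str2", "str3", "str4"],
        "col2": ["str", "str", "str", "str"],
        "col3": [15.3, 25.8, 35.3, 45.2],
        "col4": [25.6, 55.4, 45.8, 55.7],
        "col5": [3.5, 4.5, 7.5, 6.5],
        "col6": [4.5, 55.0, 65.0, 7.5],
    },
    index=["row1", "row2", "row3", "row4"],
)

row = df.loc["row3"]                            # one row: mixed strings and floats
numeric = pd.to_numeric(row, errors="coerce")   # strings turn into NaN
capped = numeric.clip(upper=20)                 # values above 20 become 20
df.loc["row3"] = capped.fillna(row)             # NaN slots get the strings back

print(df)
```

Only row3 is touched: its numeric cells above 20 are capped, while 7.5 and the two string cells stay as they were.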

CodePudding user response:

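You can select the numeric columns with select_dtypes, cap them with clip, keep only the rows you want, and then use combine_first to fill in everything else from the original df: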

rows = ['row3']  # add more row labels to this list as needed
df_out = df.select_dtypes('number').clip(upper=20).loc[rows].combine_first(df)

print(df_out)

      col1 col2  col3  col4  col5  col6
row1  str1  str  15.3  25.6   3.5   4.5
row2  str2  str  25.8  55.4   4.5  55.0
row3  str3  str  20.0  20.0   7.5  20.0
row4  str4  str  45.2  55.7   6.5   7.5
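
A self-contained sketch of this approach follows, again with the sample frame rebuilt by hand (an assumption, since the question does not show the construction). The second part is a variant of my own, not taken from either answer: it restricts the assignment to the numeric columns so the strings never need to be restored.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame(
    {
        "col1": ["str1", "str2", "str3", "str4"],
        "col2": ["str", "str", "str", "str"],
        "col3": [15.3, 25.8, 35.3, 45.2],
        "col4": [25.6, 55.4, 45.8, 55.7],
        "col5": [3.5, 4.5, 7.5, 6.5],
        "col6": [4.5, 55.0, 65.0, 7.5],
    },
    index=["row1", "row2", "row3", "row4"],
)

rows = ["row3"]  # row labels to cap

# combine_first route: build the capped rows, then fill the rest from df.
# This returns a new frame and leaves df unchanged.
capped_rows = df.select_dtypes("number").clip(upper=20).loc[rows]
df_out = capped_rows.combine_first(df)

# Hypothetical in-place variant: assign back into the numeric columns only.
num_cols = df.select_dtypes("number").columns
df.loc[rows, num_cols] = df.loc[rows, num_cols].clip(upper=20)

print(df_out)
print(df)
```

The combine_first route is handy when the original frame should stay untouched; the in-place variant avoids building an intermediate frame.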