Home > Net >  Remove values from specific columns with a condition in Python
Remove values from specific columns with a condition in Python

Time:07-29

I would like to remove values in the value1, value2, value3 and value4 columns if the 'on status' column contains the string 'new'

Data

id  date        location    on status   value1  value2  value3  value 4
CC  1/1/2022    ny          new         12      1       0       1
CC  4/1/2022    ny          new         1       1       8       9
CC  7/1/2022    ny          new         1       1       1       0
CC  10/1/2022   ny          new         1       2       2       1
CC  1/1/2023    ny          ok          1       2       2       1

Desired

id  date        location    on status   value1  value2  value3  value4
CC  1/1/2022    ny          new         
CC  4/1/2022    ny          new         
CC  7/1/2022    ny          new         
CC  10/1/2022   ny          new         
CC  1/1/2023    ny          ok          1       2      2         1
    

This only works on the first 2 columns, but it actually adds two additional columns (value3 and value4 and deletes the data from all rows not just the conditional 'new' Any suggestion is appreciated

Doing

df.loc[(df['on status'] == 'new'),  ['value1', 'value2','value3', 'value4']]= ''

CodePudding user response:

sample input

df = pd.DataFrame({
    
    "on_status" : ["new", "new", "new", "new", "ok"],
    "value1" : [x for x in range(5)],
    "value2" : [x for x in range(5)],
    "value3" : [x for x in range(5)],
    "value4" : [x for x in range(5)],
})

it does work for me using loc

df.loc[(df.on_status == "new"), ["value1", "value2", "value3", "value4"]] = ''

sample output

on_status   value1  value2  value3  value4
0   new             
1   new             
2   new             
3   new             
4   ok  4   4   4   4

CodePudding user response:

df.loc[(df['status'].str.strip() == 'new'),  ['value1', 'value2','value3', 'value4']]= ''

try this as mentioned Naveed, if there are whitespaces around then it will check by stripping white spaces.

CodePudding user response:

for i in range(len(df.index)):
    if df.loc[i,'on status'] == 'new':
        df.loc[i,'value1': 'value 4'] = ' '
df    

CodePudding user response:

here is one way to do it

df.loc[(df['on status'] == 'new'), ['value1', 'value2','value3', 'value4']] = ''

OR

df.loc[(df['on status'].str.strip() == 'new'), ['value1', 'value2','value3', 'value4']] = ''

df
    id  date    location    on status   value1  value2  value3  value4
0   CC  1/1/2022    ny      new                 
1   CC  4/1/2022    ny      new                 
2   CC  7/1/2022    ny      new                 
3   CC  10/1/2022   ny      new                 
4   CC  1/1/2023    ny       ok             1      2       2       1
  • Related