I would like to remove values in the value1, value2, value3 and value4 columns if the 'on status' column contains the string 'new'
Data
id date location on status value1 value2 value3 value 4
CC 1/1/2022 ny new 12 1 0 1
CC 4/1/2022 ny new 1 1 8 9
CC 7/1/2022 ny new 1 1 1 0
CC 10/1/2022 ny new 1 2 2 1
CC 1/1/2023 ny ok 1 2 2 1
Desired
id date location on status value1 value2 value3 value4
CC 1/1/2022 ny new
CC 4/1/2022 ny new
CC 7/1/2022 ny new
CC 10/1/2022 ny new
CC 1/1/2023 ny ok 1 2 2 1
This only works on the first 2 columns, but it actually adds two additional columns (value3 and value4 and deletes the data from all rows not just the conditional 'new' Any suggestion is appreciated
Doing
df.loc[(df['on status'] == 'new'), ['value1', 'value2','value3', 'value4']]= ''
CodePudding user response:
sample input
df = pd.DataFrame({
"on_status" : ["new", "new", "new", "new", "ok"],
"value1" : [x for x in range(5)],
"value2" : [x for x in range(5)],
"value3" : [x for x in range(5)],
"value4" : [x for x in range(5)],
})
it does work for me using loc
df.loc[(df.on_status == "new"), ["value1", "value2", "value3", "value4"]] = ''
sample output
on_status value1 value2 value3 value4
0 new
1 new
2 new
3 new
4 ok 4 4 4 4
CodePudding user response:
df.loc[(df['status'].str.strip() == 'new'), ['value1', 'value2','value3', 'value4']]= ''
try this as mentioned Naveed, if there are whitespaces around then it will check by stripping white spaces.
CodePudding user response:
for i in range(len(df.index)):
if df.loc[i,'on status'] == 'new':
df.loc[i,'value1': 'value 4'] = ' '
df
CodePudding user response:
here is one way to do it
df.loc[(df['on status'] == 'new'), ['value1', 'value2','value3', 'value4']] = ''
OR
df.loc[(df['on status'].str.strip() == 'new'), ['value1', 'value2','value3', 'value4']] = ''
df
id date location on status value1 value2 value3 value4
0 CC 1/1/2022 ny new
1 CC 4/1/2022 ny new
2 CC 7/1/2022 ny new
3 CC 10/1/2022 ny new
4 CC 1/1/2023 ny ok 1 2 2 1