I have a DataFrame df:
name rank
A captain, general, soldier
B general, foo, major
C foo
D captain, major
E foo, foo, foo
I want to check whether any cell in the column rank contains foo and, if it does, replace the whole cell with foo.
Expected output:
name rank
A captain, general, soldier
B foo
C foo
D captain, major
E foo
How can I do this?
CodePudding user response:
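# Assign the result back instead of using inplace=True on df['rank'];
# under Copy-on-Write the in-place call may not update df.
df['rank'] = df['rank'].replace('.*foo.*', 'foo', regex=True)
# OR
df['rank'] = df['rank'].mask(df['rank'].str.contains('foo'), 'foo')
# OR
df.loc[df['rank'].str.contains('foo'), 'rank'] = 'foo'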
Output:
name rank
0 A captain, general, soldier
1 B foo
2 C foo
3 D captain, major
4 E foo
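For anyone who wants to reproduce this, here is a minimal, self-contained sketch. It rebuilds the frame from the question by hand (the construction is an assumption, since the question only shows the printed table) and applies the boolean-mask variant; the variable name contains_foo is only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E"],
    "rank": [
        "captain, general, soldier",
        "general, foo, major",
        "foo",
        "captain, major",
        "foo, foo, foo",
    ],
})

# Boolean mask: True where the cell contains the substring "foo".
contains_foo = df["rank"].str.contains("foo")

# Overwrite only the matching rows, leaving the others untouched.
df.loc[contains_foo, "rank"] = "foo"

print(df)
```

Rows B, C and E end up as a plain foo, while A and D keep their original text.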
CodePudding user response:
You can apply a lambda function to the column:
df["rank"] = df["rank"].apply(lambda x: "foo" if "foo" in x.split(", ") else x)
Splitting on the separator checks for whole entries rather than substrings. For example, the word "foobar" wouldn't trigger the replacement in its row.
Edit: Thanks to BeRT2me for suggesting to split by ', '.
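To make the difference concrete, here is a small sketch comparing a plain substring check with the split-based check on a made-up series (the "captain, foobar" row is hypothetical, not from the question). It also shows a word-boundary regex as one possible vectorized alternative; that variant is a suggestion, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: the second row contains "foobar" but not the entry "foo".
ranks = pd.Series(["general, foo, major", "captain, foobar", "foo"])

# Substring check: also matches "foobar".
substring_match = ranks.str.contains("foo")

# Split-based check: matches only when "foo" is one of the comma-separated entries.
token_match = ranks.apply(lambda cell: "foo" in cell.split(", "))

# Word-boundary regex: matches "foo" only as a standalone word.
word_match = ranks.str.contains(r"\bfoo\b", regex=True)

print(pd.DataFrame({
    "rank": ranks,
    "substring": substring_match,
    "token": token_match,
    "word": word_match,
}))
```

Note that the word-boundary version would still match something like "foo-bar", because the hyphen counts as a boundary, so the split-based check is the stricter of the two.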
CodePudding user response:
mask = df['rank'].str.contains('foo')
df.loc[mask, 'rank'] = 'foo'
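One caveat if the real column has missing values: depending on the pandas version and column dtype, str.contains can return a missing value for those cells, and indexing with such a mask raises an error. A minimal sketch that sidesteps this by passing na=False, using a made-up frame with one missing rank:

```python
import pandas as pd

# Hypothetical frame with a missing rank in the second row.
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "rank": ["general, foo, major", None, "captain, major"],
})

# na=False treats missing cells as "does not contain foo".
mask = df["rank"].str.contains("foo", na=False)

# Only row A is overwritten; the missing value in row B is left as it is.
df.loc[mask, "rank"] = "foo"

print(df)
```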
CodePudding user response:
# Note: this overwrites every row of the column if any single cell contains 'foo'.
if df['rank'].str.contains('foo').any():
    df['rank'] = 'foo'