How do you remove text in parentheses (parentheses included) from a df column? For example:
index | description |
---|---|
0 | Beef (Cow)) |
1 | Pork (Pig) |
2 | Hot Dog (Pig) |
3 | Chicken (Chicken) |
4 | Fish Sticks (Fish)) |
Should be:
index | product |
---|---|
0 | Beef |
1 | Pork |
2 | Hot Dog |
3 | Chicken |
4 | Fish Sticks |
CodePudding user response:
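Use str.replace with a regular expression, like so:
df["description"] = df["description"].str.replace(r'\s+\([^()]*\)', '', regex=True)
regex=True is needed on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default.
\s+ matches one or more whitespace characters before the parentheses
\( matches ( literally
[^()]* matches any character that is not ( or ); the * makes it repeat
\) matches ) literally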
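To see what this pattern does on the sample rows, here is a minimal sketch using Python's standard re module, which follows the same regex syntax that pandas uses. It assumes the pattern reads \s+\([^()]*\); the names PATTERN, samples and cleaned are illustrative only. Note that the rows with a doubled closing parenthesis keep a stray ")" because the pattern removes exactly one closing parenthesis.

```python
import re

# Assumed pattern: whitespace, then "(", then any non-parenthesis characters, then ")".
PATTERN = re.compile(r"\s+\([^()]*\)")

# The description values from the question, including the rows ending in "))".
samples = [
    "Beef (Cow))",
    "Pork (Pig)",
    "Hot Dog (Pig)",
    "Chicken (Chicken)",
    "Fish Sticks (Fish))",
]

# Remove every match from each string.
cleaned = [PATTERN.sub("", s) for s in samples]

for before, after in zip(samples, cleaned):
    print(f"{before!r} -> {after!r}")
```

If the doubled parentheses in the data are real and not typos, the closing part of the pattern would need to accept more than one ")".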
CodePudding user response:
One way, using pandas.Series.str.replace:
df["description"] = df["description"].str.replace(r"\(+.+?\)+", "", regex=True)
print(df)
Output:
index description
0 0 Beef
1 1 Pork
2 2 Hot Dog
3 3 Chicken
4 4 Fish Sticks
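For a self-contained check, the sketch below rebuilds the sample column and applies this approach, assuming the pattern is \(+.+?\)+ (one or more of each parenthesis, with a lazy match in between). The .str.strip() call is an addition not present in the answer; it removes the space that is left in front of the deleted parentheses. The variable result is illustrative only.

```python
import pandas as pd

# Sample data from the question, including the rows ending in "))".
df = pd.DataFrame(
    {
        "description": [
            "Beef (Cow))",
            "Pork (Pig)",
            "Hot Dog (Pig)",
            "Chicken (Chicken)",
            "Fish Sticks (Fish))",
        ]
    }
)

# \(+ and \)+ accept one or more parentheses, so "(Cow))" is removed whole.
# .str.strip() (an addition here) drops the trailing space left behind.
df["description"] = (
    df["description"]
    .str.replace(r"\(+.+?\)+", "", regex=True)
    .str.strip()
)

result = df["description"].tolist()
print(result)
```

Without the strip step, each cleaned value would still end in a single space, which is easy to miss in printed output.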