I am trying to remove any occurrence of 'Doctor', 'Honorable', and 'Professor' from a variable in a dataframe. Here is an example of the dataframe:
Name |
---|
professor Rick Smith |
Mark M. Tarleton |
Doctor Charles M. Alexander |
Professor doctor Todd Mckenzie |
Carl L. Darla |
Honorable Billy Darlington |
Each row may contain several, one, or none of 'Doctor', 'Honorable', and 'Professor'. The terms may also appear in upper or lower case.
Any help would be much appreciated!
CodePudding user response:
Use a regex with str.replace:
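# Raw string so \s reaches the regex engine unchanged; \s* also removes the whitespace after each title
regex = r'(?:Doctor|Honorable|Professor)\s*'
# case=False makes the match case-insensitive ('professor', 'Doctor', ...)
df['Name'] = df['Name'].str.replace(regex, '', regex=True, case=False)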
Output:
Name
0 Rick Smith
1 Mark M. Tarleton
2 Charles M. Alexander
3 Todd Mckenzie
4 Carl L. Darla
5 Billy Darlington
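For a version that runs on its own, here is a minimal sketch that rebuilds the example dataframe from the question and applies the same idea. Two details are additions of mine rather than part of the answer above: the `\b` word boundaries, which assume you do not want to touch names that merely start with a title (a surname like "Doctorow", say), and the trailing `.str.strip()`, which assumes stray whitespace should be dropped. The `titles` list is only an illustrative name.

```python
import re

import pandas as pd

# Example data copied from the question
df = pd.DataFrame({"Name": [
    "professor Rick Smith",
    "Mark M. Tarleton",
    "Doctor Charles M. Alexander",
    "Professor doctor Todd Mckenzie",
    "Carl L. Darla",
    "Honorable Billy Darlington",
]})

# Hypothetical helper list; extend it with any other titles you need
titles = ["Doctor", "Honorable", "Professor"]

# \b restricts matches to whole words; \s* removes the whitespace after a title
pattern = r"\b(?:" + "|".join(map(re.escape, titles)) + r")\b\s*"

df["Name"] = (
    df["Name"]
    .str.replace(pattern, "", regex=True, case=False)  # case-insensitive removal
    .str.strip()                                       # drop leftover edge whitespace
)

print(df)
```

Building the pattern from a list keeps the titles in one place, and `re.escape` guards against titles that contain regex metacharacters such as a period in "Dr.".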