In a DataFrame "df" I have a column called "Company". It contains company names that end with "- CP". The problem is that the spaces are not always in the same place, and in some entries the dash "-" is missing. I want to remove the "- CP" suffix from all entries.
Input
Company |
---|
Intest Apple - CP |
Intest Apple -CP |
Intest Apple-CP |
Intest Apple - CP |
Intest Apple CP |
Howard P Delta - CP |
Output
Company |
---|
Intest Apple |
Intest Apple |
Intest Apple |
Intest Apple |
Intest Apple |
Howard P Delta |
This is the code that I have, but when I run it nothing changes:
df['Company'] = df['Company'].str.replace("-CP'","")
df['Company'] = df['Company'].str.replace("- CP'","")
df['Company'] = df['Company'].str.replace(" - CP'","")
CodePudding user response:
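# The patterns in the question end with a stray apostrophe ("-CP'"), so they never match.
# Note: literal replaces can leave a trailing space and do not cover the dash-less "Intest Apple CP".
df['Company'] = df['Company'].str.replace("-CP", "", regex=False)
df['Company'] = df['Company'].str.replace("- CP", "", regex=False)
df['Company'] = df['Company'].str.replace(" - CP", "", regex=False)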
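To illustrate why the code in the question changes nothing, here is a minimal sketch on a small, made-up Series (the names s, unchanged and fixed are only for illustration). It assumes pandas is installed and passes regex=False so the pattern is treated as plain text.

```python
import pandas as pd

# Hypothetical two-row sample, not the asker's full data.
s = pd.Series(["Intest Apple-CP", "Intest Apple - CP"])

# The pattern from the question ends with an apostrophe, which is not
# present in the data, so nothing is replaced.
unchanged = s.str.replace("-CP'", "", regex=False)

# Without the apostrophe, the literal text "-CP" is removed, but only
# where it appears exactly like that (no space after the dash).
fixed = s.str.replace("-CP", "", regex=False)

print(unchanged.tolist())
print(fixed.tolist())
```

The second row is left untouched by the corrected literal pattern, which is why each spacing variant needs its own replace call with this approach.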
CodePudding user response:
You could use str.replace with a regular expression that covers the case where the dash can be missing (-?) and all variations of spaces around the CP string.
company = df.Company.str.replace(r'\s*-?\s*CP\s*$', '', regex=True)
Output from company
Out[5]:
0 Intest Apple
1 Intest Apple
2 Intest Apple
3 Intest Apple
4 Intest Apple
5 Howard P Delta
Name: Company, dtype: object
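For reference, the regular expression approach can be sketched as a self-contained script. The DataFrame below is rebuilt from the sample in the question (the double spaces in the fourth row are an assumption about what the original data might look like), and strict_pattern is a hypothetical, stricter variant that is not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({"Company": [
    "Intest Apple - CP",
    "Intest Apple -CP",
    "Intest Apple-CP",
    "Intest Apple  -  CP",   # assumed variant with extra spaces
    "Intest Apple CP",
    "Howard P Delta - CP",
]})

# \s*   optional whitespace before the dash
# -?    optional dash
# \s*   optional whitespace after the dash
# CP    the literal suffix
# \s*$  optional trailing whitespace, anchored at the end of the string
pattern = r"\s*-?\s*CP\s*$"
cleaned = df["Company"].str.replace(pattern, "", regex=True)

# Hypothetical stricter variant: require a dash or at least one space
# before "CP", so a name that merely ends in the letters "CP" is kept.
strict_pattern = r"(?:\s*-\s*|\s+)CP\s*$"
edge = pd.Series(["ABCP", "Intest Apple CP"])
loose_edge = edge.str.replace(pattern, "", regex=True)
strict_edge = edge.str.replace(strict_pattern, "", regex=True)

print(cleaned.tolist())
print(loose_edge.tolist())
print(strict_edge.tolist())
```

The pattern from the answer also strips "CP" from a name such as "ABCP". If names like that can occur in the column, a stricter pattern along the lines of strict_pattern may be the safer choice.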