Python DataFrame: Remove/Replace part of a string for all values in a column

Time:02-12

In a DataFrame "df" I have a column called "Company". It holds company names that end with "- CP". The problem is that the spaces are not always in the same place, and in some entries the dash "-" is missing. I want to remove the "- CP" suffix from all entries.

Input

Company
Intest Apple - CP
Intest Apple -CP
Intest Apple-CP
Intest Apple - CP
Intest Apple CP
Howard P Delta - CP

Output

Company
Intest Apple
Intest Apple
Intest Apple
Intest Apple
Intest Apple
Howard P Delta

This is the code that I have, but when I run it nothing changes

df['Company'] = df['Company'].str.replace("-CP'","") 
df['Company'] = df['Company'].str.replace("- CP'","") 
df['Company'] = df['Company'].str.replace(" - CP'","") 

CodePudding user response:

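# Match the longest variant first, treating each pattern as a literal string
df['Company'] = df['Company'].str.replace(" - CP", "", regex=False)
df['Company'] = df['Company'].str.replace("- CP", "", regex=False)
df['Company'] = df['Company'].str.replace("-CP", "", regex=False)
# Drop leftover whitespace; entries with no dash (e.g. "Intest Apple CP") are not handled here
df['Company'] = df['Company'].str.strip()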
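
For context on why the code in the question appears to do nothing: each pattern there ends with a stray apostrophe (for example "-CP'"), and that character never occurs in the data, so nothing matches. A minimal sketch of the difference, using a one-row Series made up for illustration:

```python
import pandas as pd

# Hypothetical one-row sample, taken from the question's input
s = pd.Series(["Intest Apple-CP"])

# Pattern from the question: the trailing apostrophe is part of the search string
unchanged = s.str.replace("-CP'", "", regex=False)

# Same pattern without the apostrophe
fixed = s.str.replace("-CP", "", regex=False)

print(unchanged.tolist())
print(fixed.tolist())
```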

CodePudding user response:

You could use str.replace with a regular expression that covers the case where the dash is missing (-?) and all variations of spacing around the CP string.

company = df.Company.str.replace(r'\s*-?\s*CP\s*$', '', regex=True)

Output from company

Out[5]:
0      Intest Apple
1      Intest Apple
2      Intest Apple
3      Intest Apple
4      Intest Apple
5    Howard P Delta
Name: Company, dtype: object
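
To try this end to end, here is a self-contained sketch that rebuilds the question's input and applies the same regular expression. The `strict_pattern` variant is my own assumption, not part of the answer above: it additionally requires a dash or at least one space before CP, so a name that merely ends in the letters "CP" would be left alone.

```python
import pandas as pd

# Sample data copied from the question's input table
df = pd.DataFrame({
    "Company": [
        "Intest Apple - CP",
        "Intest Apple -CP",
        "Intest Apple-CP",
        "Intest Apple - CP",
        "Intest Apple CP",
        "Howard P Delta - CP",
    ]
})

# \s*   optional whitespace before the dash
# -?    optional dash
# \s*   optional whitespace between the dash and CP
# CP    the literal suffix
# \s*$  optional trailing whitespace, anchored at the end of the string
pattern = r"\s*-?\s*CP\s*$"
cleaned = df["Company"].str.replace(pattern, "", regex=True)

# Assumed stricter variant: CP must be preceded by a dash or by whitespace
strict_pattern = r"(?:\s*-\s*|\s+)CP\s*$"
cleaned_strict = df["Company"].str.replace(strict_pattern, "", regex=True)

print(cleaned.tolist())
print(cleaned_strict.tolist())
```

Assign `cleaned` back to `df["Company"]` if you want to overwrite the column in place.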