How to partially replace a string in a column pandas Dataframe python

Time:12-14

I have a dataframe like this:

  Name          Comment
1 Ama           Delay with coming home client to provide place he visited
2 Kofi          Enquiry on apples client to provide refund 
3 Kwame         Equiry on tables client advised with enquiry
4 Theo          Emotional challenge client to provide details
5 Isaiah        Eating on empty stomach client to confirm the issues

But I need it to look like this:

  Name          Comment
1 Ama           Delay with coming home
2 Kofi          Enquiry on apples 
3 Kwame         Enquiry on tables 
4 Theo          Emotional challenge 
5 Isaiah        Eating on empty stomach 

CodePudding user response:

It looks like you want to delete everything from "client" onward. Use a regex with str.replace:

df['Comment'] = df['Comment'].str.replace(r'\s*\bclient\b.*', '',
                                          case=False, regex=True)

Output:

     Name                  Comment
1     Ama   Delay with coming home
2    Kofi        Enquiry on apples
3   Kwame         Equiry on tables
4    Theo      Emotional challenge
5  Isaiah  Eating on empty stomach


Regex:

\s*       # match 0 or more whitespace characters
\b        # match word boundary
client    # match "client"
\b        # match word boundary
.*        # match anything until the end
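
As a self-contained check of the pattern above, here is a minimal sketch on a small made-up frame. The capitalised "Client" in the second row and the whole third row are assumptions added for illustration, to show the effect of case=False and of the word boundaries.

```python
import pandas as pd

# Small illustrative frame; rows 2 and 3 are made up for this sketch.
df = pd.DataFrame({
    "Name": ["Ama", "Kofi", "Yaw"],
    "Comment": [
        "Delay with coming home client to provide place he visited",
        "Enquiry on apples Client to provide refund ",
        "Follow up with clients next week",
    ],
})

# Optional whitespace, the whole word "client", then the rest of the string.
pattern = r"\s*\bclient\b.*"
df["Comment"] = df["Comment"].str.replace(pattern, "", case=False, regex=True)

print(df["Comment"].tolist())
# ['Delay with coming home', 'Enquiry on apples', 'Follow up with clients next week']
```

The third row is left alone because the trailing \b does not match between "client" and the "s" of "clients".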

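Excluding strings that start with "Client":

Use the same regex, but add a look-behind, (?<=\S), which requires a non-whitespace character just before the match. A comment that begins with "Client" has nothing before the word, so it is left untouched: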

df['Comment'] = df['Comment'].str.replace(r'(?<=\S)\s*\bclient.*', '',
                                          case=False, regex=True)

Example:

     Name                   Comment
1     Ama    Delay with coming home
2    Kofi         Enquiry on apples
3   Kwame          Equiry on tables
4    Theo       Emotional challenge
5  Isaiah   Eating on empty stomach
6  Alfred  Client starting sentence
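
To see what the look-behind changes, the following sketch runs both patterns on the same two comments. The variable names (comments, plain, guarded) are made up for illustration.

```python
import pandas as pd

comments = pd.Series([
    "Emotional challenge client to provide details",
    "Client starting sentence",
])

# Without the look-behind, a leading "Client" matches and the whole string is removed.
plain = comments.str.replace(r"\s*\bclient\b.*", "", case=False, regex=True)

# With the look-behind, a match must be preceded by a non-whitespace character.
guarded = comments.str.replace(r"(?<=\S)\s*\bclient.*", "", case=False, regex=True)

print(plain.tolist())    # ['Emotional challenge', '']
print(guarded.tolist())  # ['Emotional challenge', 'Client starting sentence']
```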


CodePudding user response:

Example

import pandas as pd

data = [['Ama', 'Delay with coming home client to provide place he visited'],
        ['Kofi', 'Enquiry on apples client to provide refund '], 
        ['Kwame', 'Equiry on tables client advised with enquiry'], 
        ['Theo', 'Emotional challenge client to provide details'], 
        ['Isaiah', 'Eating on empty stomach client to confirm the issues'], 
        ['Amy', 'client is smart']]
df = pd.DataFrame(data, columns=['Name', 'Comment'])

df

    Name    Comment
0   Ama     Delay with coming home client to provide place...
1   Kofi    Enquiry on apples client to provide refund
2   Kwame   Equiry on tables client advised with enquiry
3   Theo    Emotional challenge client to provide details
4   Isaiah  Eating on empty stomach client to confirm the ...
5   Amy     client is smart

Code

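Split each comment on ' client' and keep the first part. A comment that starts with "client" has no space before the word, so it is left unchanged: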

df['Comment'] = df['Comment'].str.split(' client').str[0]

df

    Name    Comment
0   Ama     Delay with coming home
1   Kofi    Enquiry on apples
2   Kwame   Equiry on tables
3   Theo    Emotional challenge
4   Isaiah  Eating on empty stomach
5   Amy     client is smart
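
One thing to watch with the split approach is that it is case-sensitive, so a capitalised "Client" in the middle of a comment is not split off. The sketch below shows that, along with one possible case-insensitive variant built on the standard-library re.split. The sample comments and the helper name before_client are made up for illustration.

```python
import re

import pandas as pd

comments = pd.Series([
    "Enquiry on apples client to provide refund ",
    "Emotional challenge Client to provide details",
    "client is smart",
])

# Plain split: only the lowercase " client" is recognised.
by_split = comments.str.split(" client").str[0]


def before_client(text: str) -> str:
    """Return the text before the first whitespace-preceded word 'client', ignoring case."""
    return re.split(r"\s+client\b", text, maxsplit=1, flags=re.IGNORECASE)[0]


by_regex = comments.map(before_client)

print(by_split.tolist())
# ['Enquiry on apples', 'Emotional challenge Client to provide details', 'client is smart']
print(by_regex.tolist())
# ['Enquiry on apples', 'Emotional challenge', 'client is smart']
```

Because the pattern requires whitespace before "client", a comment that starts with the word is still returned unchanged.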