Pandas: how to select rows based on one column's value and then change another column's value

I have a DataFrame that you can reproduce by running this:

import numpy as np
import pandas as pd
from io import StringIO
df = """
  contract      EndDate      option
  A00118        99999999      AC
  A00118        19831231      SLA
  A00118        99999999      TPA
  A00118        99999999      F
  A00118        99999999      FD
"""

df = pd.read_csv(StringIO(df.strip()), sep=r'\s+',
                  dtype={"RB": int, "BeginDate": int, "EndDate": int,'ValIssueDate':int,'Valindex0':int})

df

The output is:

  contract   EndDate option
0   A00118  99999999     AC
1   A00118  19831231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD

Now I want to apply some logic to each row without using the .apply function, because it is very slow.

The logic is: if the option equals SLA, then the EndDate should become the last 4 digits of its current value.

I tried something like this:

df.loc[df['option']=='SLA']['EndDate']=[4:]

But I receive a syntax error.

The correct output should be:

  contract   EndDate option
0   A00118  99999999     AC
1   A00118      1231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD

CodePudding user response:

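Use .loc with a boolean mask for the rows and the name of the column of interest, both in a single indexing call. Your attempt fails for two reasons: a bare [4:] is not a valid Python expression, and chained indexing such as df.loc[mask]['EndDate'] = ... assigns to a temporary copy rather than to df itself. The rest is string manipulation and type casting.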

>>> df 
  contract   EndDate option
0   A00118  99999999     AC
1   A00118  19831231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD
>>> where = df['option'] == 'SLA', 'EndDate'
>>> df.loc[where] = df.loc[where].astype(str).str[-4:].astype(int)
>>> df 
  contract   EndDate option
0   A00118  99999999     AC
1   A00118      1231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD
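
As a possible alternative to the string round trip above, the same update can be sketched with integer arithmetic. This is a minimal sketch, not the approach from the answer: it assumes EndDate holds non-negative integers, in which case `value % 10_000` gives the last four digits. The names `raw` and `mask` are chosen here for illustration only.

```python
from io import StringIO

import pandas as pd

# Sample data from the question (the name `raw` is illustrative).
raw = """
contract      EndDate      option
A00118        99999999      AC
A00118        19831231      SLA
A00118        99999999      TPA
A00118        99999999      F
A00118        99999999      FD
"""

df = pd.read_csv(StringIO(raw.strip()), sep=r"\s+", dtype={"EndDate": int})

# Boolean mask selecting the rows whose option is SLA.
mask = df["option"] == "SLA"

# Vectorized update on the masked rows only.
# Assumption: EndDate is a non-negative integer, so the remainder
# after dividing by 10_000 is its last four digits.
df.loc[mask, "EndDate"] = df.loc[mask, "EndDate"] % 10_000

print(df)
```

As with the astype(int) cast in the answer, leading zeros are not preserved: a value such as 19830105 would become 105, not 0105. If the zero-padded form matters, keeping the column as strings is likely the safer choice.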