Pandas: how to select rows based on one column's value and then change another column's value

Time:02-26

I have a DataFrame that you can reproduce by running this:

import numpy as np
import pandas as pd
from io import StringIO
df = """
  contract      EndDate      option
  A00118        99999999      AC
  A00118        19831231      SLA
  A00118        99999999      TPA
  A00118        99999999      F
  A00118        99999999      FD
"""

df = pd.read_csv(StringIO(df.strip()), sep=r'\s+',
                  dtype={"RB": int, "BeginDate": int, "EndDate": int,'ValIssueDate':int,'Valindex0':int})

df

output is:

contract    EndDate option
0   A00118  99999999    AC
1   A00118  19831231    SLA
2   A00118  99999999    TPA
3   A00118  99999999    F
4   A00118  99999999    FD

Now I want to apply some logic to each row without using .apply, because it is very slow.

The logic is: if option equals SLA, then EndDate should become the last 4 digits of its current value.

I tried something like this:

df.loc[df['option']=='SLA']['EndDate']=[4:]

But I get a syntax error.

The correct output should be:

contract    EndDate option
0   A00118  99999999    AC
1   A00118  1231        SLA
2   A00118  99999999    TPA
3   A00118  99999999    F
4   A00118  99999999    FD

Answer:

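Use a single .loc call with the boolean mask for the rows and the column name of interest. A bare [4:] is not a valid expression, which is why your attempt raises a syntax error, and chained indexing such as df.loc[mask]['EndDate'] = ... would assign to a temporary copy anyway. The rest is string manipulation and type casting.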

>>> df 
  contract   EndDate option
0   A00118  99999999     AC
1   A00118  19831231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD
>>> where = df['option'] == 'SLA', 'EndDate'
>>> df.loc[where] = df.loc[where].astype(str).str[-4:].astype(int)
>>> df 
  contract   EndDate option
0   A00118  99999999     AC
1   A00118      1231    SLA
2   A00118  99999999    TPA
3   A00118  99999999      F
4   A00118  99999999     FD
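
If EndDate stays an integer column, as in the question, the same result can probably be reached without the round trip through strings, because the last four digits of a non-negative integer are its remainder modulo 10000. A minimal sketch under that assumption (the names raw and mask are only for illustration and are not from the original post):

```python
from io import StringIO

import pandas as pd

# Same sample data as in the question.
raw = """
contract      EndDate      option
A00118        99999999      AC
A00118        19831231      SLA
A00118        99999999      TPA
A00118        99999999      F
A00118        99999999      FD
"""

# Whitespace-separated columns; EndDate is read as an integer.
df = pd.read_csv(StringIO(raw.strip()), sep=r"\s+", dtype={"EndDate": int})

# Boolean mask selecting the rows to change.
mask = df["option"] == "SLA"

# Assumption: EndDate values are non-negative integers, so the
# remainder modulo 10000 equals the last four digits.
df.loc[mask, "EndDate"] = df.loc[mask, "EndDate"] % 10000

print(df)
```

This keeps the column as integers throughout. If the values are stored as strings, or if leading zeros in the last four digits must be preserved, the .str[-4:] slice without the final astype(int) is the better fit.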