I have a DataFrame, which you can reproduce by running this:
import numpy as np
import pandas as pd
from io import StringIO
df = """
contract EndDate option
A00118 99999999 AC
A00118 19831231 SLA
A00118 99999999 TPA
A00118 99999999 F
A00118 99999999 FD
"""
df = pd.read_csv(StringIO(df.strip()), sep=r'\s+',
dtype={"EndDate": int})
df
The output is:
contract EndDate option
0 A00118 99999999 AC
1 A00118 19831231 SLA
2 A00118 99999999 TPA
3 A00118 99999999 F
4 A00118 99999999 FD
Now I want to apply some logic to each row without using the .apply function, because it is very slow.
The logic is: if option equals SLA, then EndDate should become the last 4 digits of its current value.
I tried something like this:
df.loc[df['option']=='SLA']['EndDate']=[4:]
But I receive a syntax error.
The correct output should be:
contract EndDate option
0 A00118 99999999 AC
1 A00118 1231 SLA
2 A00118 99999999 TPA
3 A00118 99999999 F
4 A00118 99999999 FD
Answer:
Use .loc with the boolean mask for the rows and the column name of interest. The rest is string manipulation and type casting.
>>> df
contract EndDate option
0 A00118 99999999 AC
1 A00118 19831231 SLA
2 A00118 99999999 TPA
3 A00118 99999999 F
4 A00118 99999999 FD
>>> where = df['option'] == 'SLA', 'EndDate'
>>> df.loc[where] = df.loc[where].astype(str).str[-4:].astype(int)
>>> df
contract EndDate option
0 A00118 99999999 AC
1 A00118 1231 SLA
2 A00118 99999999 TPA
3 A00118 99999999 F
4 A00118 99999999 FD
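Since EndDate is an integer column here, the same result can probably be reached without the string round-trip. The sketch below is an alternative to the approach above, not a replacement for it: it takes the value modulo 10000, which keeps the last four decimal digits. The name mask is just illustrative, and the frame is rebuilt inline so the snippet runs on its own. Note that the original attempt, df.loc[...]['EndDate'] = ..., is chained indexing, so even with valid syntax the assignment may land on a temporary copy; passing the row mask and the column label to a single .loc call avoids that.

```python
import pandas as pd

# Rebuild the sample data from the question so the snippet is self-contained.
df = pd.DataFrame({
    "contract": ["A00118"] * 5,
    "EndDate": [99999999, 19831231, 99999999, 99999999, 99999999],
    "option": ["AC", "SLA", "TPA", "F", "FD"],
})

# Boolean mask selecting only the SLA rows.
mask = df["option"] == "SLA"

# Modulo 10000 keeps the last four decimal digits of an integer,
# so no cast to str and back is needed.
df.loc[mask, "EndDate"] = df.loc[mask, "EndDate"] % 10000

print(df)
```

Both this and the string-slicing version drop leading zeros once the result is an integer (for example, 19830131 would become 131). If the four-character form matters, keeping the column as strings may be the safer choice.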