If column years is >= 10, the user's personal details should be replaced with their id (pandas)

Time:11-02

I am new to pandas.

Here I am iterating through each row and checking each user's exit date. If the number of years since the exit date is >= 10, the user's personal details should be replaced with their id.

I am stuck, please help.

for edate in pd.to_datetime(df1['EXIT_DATE']):

    rdelt = relativedelta(datetime.today(),edate)

    df1['years'] = rdelt.years

    # The line below modifies every row in the DataFrame.
    #df1.loc[flag,['first_name','middel_name','email']] = df1['user_id'] 

CodePudding user response:

EDIT:

Added link to an answer from @Arvind Kumar Avinash explaining "Filtering on dataframe"

Building on @Emi OB's comment, with some explanation:

You can create a flag/mask by using the usual comparison operators (<, >, <=, >=), e.g.

age = pd.Series([20,23,22,19,30])
age>22 # Series([False,True,False,False,True])

You can then use that mask to operate on all the positions where it is True. For example, to replace every age greater than 22 (i.e. every index where the mask is True) with the value 22, simply do:

age = pd.Series([20,23,22,19,30])
mask = age>22 # Series([False,True,False,False,True])
age.loc[mask] = 22
age # pd.Series([20,22,22,19,22])

The exact same logic can be used on DataFrames.
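As a minimal sketch of the same idea on a DataFrame: the sample data below is made up for illustration, the column names follow the question, and the ids are assumed to be strings so that they fit into the text columns without a dtype change.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "first_name": ["Ann", "Bob", "Cid"],
    "email": ["ann@example.com", "bob@example.com", "cid@example.com"],
    "years": [12, 3, 10],
})

# Boolean mask: True for the rows where years >= 10 (rows 0 and 2 here).
mask = df["years"] >= 10

for col in ["first_name", "email"]:
    # Overwrite only the masked rows, taking the id from the same row.
    df.loc[mask, col] = df.loc[mask, "user_id"]
```

Rows where the mask is False are left untouched, so only the users with years >= 10 lose their personal details.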

CodePudding user response:

You can try the code below to avoid the loop:

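# Ensure EXIT_DATE dtype is datetime64
df1['EXIT_DATE'] = pd.to_datetime(df1['EXIT_DATE'])

# Calendar-year difference between today and the exit date
df1['years'] = pd.Timestamp.today().year - df1['EXIT_DATE'].dt.year
# Where years >= 10, replace the personal details with the row's user_id
df1.loc[df1['years'] >= 10, ['first_name','middle_name','email']] = df1['user_id']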
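Note that subtracting the calendar years counts a year as soon as the calendar rolls over, whereas relativedelta in the question counts only completed years. If completed years are what you need, one possible vectorized sketch is shown below. The helper name full_years_since and the fixed reference date are assumptions for illustration, and the series is assumed to contain no missing dates.

```python
import pandas as pd

def full_years_since(dates: pd.Series, ref: pd.Timestamp) -> pd.Series:
    """Completed years between each date and ref (hypothetical helper)."""
    years = ref.year - dates.dt.year
    # Subtract one where this year's anniversary has not been reached yet.
    not_reached = (dates.dt.month > ref.month) | (
        (dates.dt.month == ref.month) & (dates.dt.day > ref.day)
    )
    return years - not_reached.astype(int)

# Made-up exit dates and a fixed reference date, for a reproducible example.
exit_dates = pd.to_datetime(pd.Series(["2010-06-15", "2010-12-31", "2015-01-01"]))
years = full_years_since(exit_dates, pd.Timestamp("2020-06-15"))
```

In real use you would pass pd.Timestamp.today() as the reference date and build the mask from the result with years >= 10.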