I am new to pandas.
I am iterating through each row and checking each user's exit date: if the exit date is 10 or more years in the past, the user's personal details should be replaced with their user id.
I am stuck, please help.
for edate in pd.to_datetime(df1['EXIT_DATE']):
    rdelt = relativedelta(datetime.today(), edate)
    df1['years'] = rdelt.years
    # it's modifying each row in the DataFrame.
    #df1.loc[flag,['first_name','middel_name','email']] = df1['user_id']
CodePudding user response:
EDIT:
Added a link to an answer from @Arvind Kumar Avinash explaining "Filtering on dataframe".
Expanding on @Emi OB's comment with an explanation:
You can create a flag/mask by using the usual comparison operators (<, >, <=, >=), e.g.
age = pd.Series([20,23,22,19,30])
age>22 # Series([False,True,False,False,True])
You can then use that mask to operate on every position where it is True. For example, to replace every age greater than 22 (i.e. every index where the mask is True) with the value 22, simply do:
age = pd.Series([20,23,22,19,30])
mask = age>22 # Series([False,True,False,False,True])
age.loc[mask] = 22
age # pd.Series([20,22,22,19,22])
The exact same logic can be used on DataFrames.
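As a rough sketch of how the mask idea carries over to a DataFrame, the example below uses made-up data. The column names mirror the question, while the `years` values, names, emails, and user ids are assumptions invented for illustration only.

```python
import pandas as pd

# Hypothetical data; column names follow the question, values are invented.
df1 = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "first_name": ["Ann", "Bob", "Cid"],
    "email": ["ann@example.com", "bob@example.com", "cid@example.com"],
    "years": [12, 3, 10],
})

# Boolean mask: True for rows whose exit was 10 or more years ago.
mask = df1["years"] >= 10

# Overwrite each personal column with the user id, only where the mask is True.
for col in ["first_name", "email"]:
    df1.loc[mask, col] = df1.loc[mask, "user_id"]

print(df1)
```

Rows where the mask is False (here the second row) are left untouched.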
CodePudding user response:
You can try the code below to avoid the loop:
# Ensure EXIT_DATE dtype is a datetime64
df1['EXIT_DATE'] = pd.to_datetime(df1['EXIT_DATE'])
df1['years'] = pd.Timestamp.today().year - df1['EXIT_DATE'].dt.year
df1.loc[df1['years'] >= 10, ['first_name','middle_name','email']] = df1['user_id']
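Note that subtracting calendar years treats an exit on 31 December as a full year old on the following 1 January, which is not the same as the `relativedelta` years used in the question. If full elapsed years are what is wanted, one possible sketch is to compare against a cutoff date instead. The function name `anonymize_old_exits` and its `today` parameter are hypothetical, introduced here only so the example is reproducible, and the sample data is invented.

```python
import pandas as pd

def anonymize_old_exits(df, today, years=10,
                        cols=("first_name", "middle_name", "email")):
    """Return a copy in which rows that exited `years` or more full years
    before `today` have their personal columns replaced by user_id."""
    out = df.copy()
    exit_date = pd.to_datetime(out["EXIT_DATE"])
    # An exit on or before this date is at least `years` full years old.
    cutoff = pd.Timestamp(today) - pd.DateOffset(years=years)
    mask = exit_date <= cutoff
    for col in cols:
        out.loc[mask, col] = out.loc[mask, "user_id"]
    return out

# Hypothetical data; column names follow the question, values are invented.
df1 = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "first_name": ["Ann", "Bob", "Cid"],
    "middle_name": ["A", "B", "C"],
    "email": ["ann@example.com", "bob@example.com", "cid@example.com"],
    "EXIT_DATE": ["2010-06-01", "2014-12-31", "2020-01-15"],
})

# A fixed "today" keeps the result stable; in practice pd.Timestamp.today()
# could be passed instead.
result = anonymize_old_exits(df1, today="2024-06-01")
print(result)
```

With this fixed date, only the first user is anonymized. The second user exited in calendar year 2014, so the year subtraction would give 10, but fewer than 10 full years have elapsed by 2024-06-01.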