How can I make my pandas DataFrame loop more efficient when trying to append to a list

Time:12-24

This loop currently takes 15 minutes to run. Are there any tools or ideas on how I can improve the speed of this? There are about 75K rows to iterate through.

I did try writing the values directly into the dataframe instead of using the list, but the performance was the same.

I'm basically trying to turn every date in the dataframe into the first of its month, i.e. 2021-01-19 would become 2021-01-01.

fiscal_list = []
for index, row in df.iterrows():
    clean_date = df['Original Due Date'].str[:8].iloc[1] + "01"
    fiscal_list.append(clean_date) 
df['fiscal_list'] = fiscal_list

CodePudding user response:

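Use pd.offsets.MonthBegin() on a datetime column. Subtracting MonthBegin(1) rolls a date back to the previous month start, so one day is added first to keep dates that already fall on the 1st in their own month:

df["fiscal_list"] = pd.to_datetime(df["Original Due Date"]) + pd.Timedelta(days=1) - pd.offsets.MonthBegin(1)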
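
To see how MonthBegin subtraction behaves, including on dates that already fall on the 1st, here is a small sketch. The sample rows are made up for illustration, and it assumes the column holds ISO-formatted date strings as in the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Original Due Date": ["2021-01-19", "2021-01-01", "2021-02-28"]})
dates = pd.to_datetime(df["Original Due Date"])

# Plain subtraction rolls back to the previous month start, so a date that is
# already on the 1st ends up in the month before.
naive = dates - pd.offsets.MonthBegin(1)

# Adding one day first keeps such dates in their own month.
df["fiscal_list"] = dates + pd.Timedelta(days=1) - pd.offsets.MonthBegin(1)

print(naive)
print(df)
```

Both expressions operate on the whole column at once, so no Python-level loop over the rows is needed.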

CodePudding user response:

This should be faster:

df['date'] = pd.to_datetime(df['date'])
df['date'] -= pd.to_timedelta(df['date'].dt.day - 1, unit='D')
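
Two other loop-free variants may also be worth trying. This is only a sketch: the sample rows are made up, and it assumes the column holds ISO-formatted strings (YYYY-MM-DD) as in the question. The first variant stays with strings, and the second swaps the offset arithmetic for monthly periods.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Original Due Date": ["2021-01-19", "2021-03-01", "2020-12-31"]})

# String variant: keep "YYYY-MM-" and append "01" for the whole column at once.
df["fiscal_str"] = df["Original Due Date"].str[:8] + "01"

# Datetime variant: truncate to a monthly period, then take the period start.
df["fiscal_date"] = (
    pd.to_datetime(df["Original Due Date"]).dt.to_period("M").dt.to_timestamp()
)

print(df)
```

The string variant is closest to the loop in the question, while the datetime variant leaves a proper datetime column for later date arithmetic.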