Python script to run and generate a CSV file every 10 minutes

I have a Python script like this:

import csv
import pandas as pd
import numpy as np
import time
from pandas import Series,DataFrame


df = pd.read_csv('C:/Users/Desktop/case_study_1.csv', low_memory=False)

df.head()

# convert interaction_time to datetime format
df.interaction_time = pd.to_datetime(df.interaction_time)

# remove rows with a null merchant column
df_remove_null = df.dropna(subset=['merchant'])

# count added, confirmed txn
df_cnt = df_remove_null.groupby([pd.Grouper(key='interaction_time', freq='H'), df_remove_null.fullVisitorid, df_remove_null.action_type]).size().reset_index(name='count')
df_final_cnt = df_cnt.groupby(['interaction_time', 'action_type'])['fullVisitorid'].size().reset_index(name='count')

# export csv file
df_final_cnt.to_csv(r'C:\Users\Desktop\filename12.csv', index=False, columns=["interaction_time", "action_type", "count"])

As you can see, the code outputs a CSV file, which I save to a local directory. All I want is to run the code automatically every 10 minutes and generate a new CSV file, so that every 10 minutes the new file overwrites the old one.

I don't have much knowledge about automation, so any kind of help will be greatly appreciated.

I tried a for loop with range(100), but it fails with the error: IndentationError: expected an indented block

Thanks.

CodePudding user response：

Wrapping your code in this loop will run it every ten minutes, as long as the script keeps running:

while True:
    # ... your code here ...
    time.sleep(600)

The IndentationError is a formatting problem: Python expects the lines inside a loop to be indented consistently. You will need to find where the indentation is wrong; I recommend looking into a formatting or linting tool for this.
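To make the indentation point concrete, here is a minimal sketch of the for-loop attempt with a correctly indented body. `generate_csv` is a hypothetical stand-in for the pandas code in the question, and the tiny interval is only there so the example finishes quickly:

```python
import time

def generate_csv():
    # Hypothetical stand-in for the pandas code from the question.
    return "csv written"

def run_repeatedly(runs, interval_seconds):
    results = []
    for _ in range(runs):
        # Everything that belongs to the loop is indented one level
        # under the `for` line. Leaving the body at column 0 raises
        # "IndentationError: expected an indented block".
        results.append(generate_csv())
        time.sleep(interval_seconds)
    return results

# For the real task this would be runs=100, interval_seconds=600.
results = run_repeatedly(runs=3, interval_seconds=0.01)
```

Putting the work in a function also keeps the loop body short, which makes indentation mistakes easier to spot.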

CodePudding user response：

You can put all of the work in a function and call that function every 10 minutes using a module like sched.

import sched, time
import pandas as pd

sd = sched.scheduler(time.time, time.sleep)

def your_func(sc):
    df = pd.read_csv('C:/Users/Desktop/case_study_1.csv', low_memory=False)

    df.head()

    # convert interaction_time to datetime format
    df.interaction_time = pd.to_datetime(df.interaction_time)

    # remove rows with a null merchant column
    df_remove_null = df.dropna(subset=['merchant'])

    # count added, confirmed txn
    df_cnt = df_remove_null.groupby([pd.Grouper(key='interaction_time', freq='H'), df_remove_null.fullVisitorid, df_remove_null.action_type]).size().reset_index(name='count')
    df_final_cnt = df_cnt.groupby(['interaction_time', 'action_type'])['fullVisitorid'].size().reset_index(name='count')

    # export csv file
    df_final_cnt.to_csv(r'C:\Users\Desktop\filename12.csv', index=False, columns=["interaction_time", "action_type", "count"])

    # schedule the next run 600 seconds after this one finishes
    sd.enter(600, 1, your_func, (sc,))

# schedule the first run, then start the scheduler (blocks while events remain)
sd.enter(600, 1, your_func, (sd,))
sd.run()

This leaves a 10-minute gap between the end of one run and the start of the next, so if your code takes 2 minutes to execute, it effectively runs every 12 minutes. The first run also starts only after the initial 10-minute delay.
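If that drift matters, one option is to schedule each run relative to its planned start time instead of the time the previous run finished. This is only a sketch of that idea, not part of the answer's code: `run_at_fixed_rate` and `slow_job` are hypothetical names, and the short intervals are just for demonstration (use 600 seconds for the real task).

```python
import time

def run_at_fixed_rate(job, interval_seconds, runs):
    """Call job() `runs` times, aiming for one start every interval_seconds."""
    next_start = time.monotonic()
    start_times = []
    for _ in range(runs):
        start_times.append(time.monotonic())
        job()
        # Plan the next start from the previous planned start, not from
        # when the job finished, so the job's duration does not add up.
        next_start += interval_seconds
        delay = next_start - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return start_times

def slow_job():
    # Hypothetical stand-in for the CSV generation.
    time.sleep(0.05)

starts = run_at_fixed_rate(slow_job, interval_seconds=0.2, runs=3)
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
```

Here each gap between starts stays close to the interval even though the job itself takes time. If a run takes longer than the interval, the next run simply starts right away.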

CodePudding user response：

The simplest solution, I think, is:

import time
while True:
    # your script
    time.sleep(600)  # 600 seconds = 10 minutes


This is an infinite loop; you can add a condition that calls break when you want it to stop.
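As a rough sketch of such a stop condition, the loop below breaks once a caller-supplied check returns True. `run_until`, `job` and `should_stop` are hypothetical names used for illustration only, and the short interval is just so the example finishes quickly:

```python
import time

def run_until(job, should_stop, interval_seconds):
    """Run job() repeatedly until should_stop(run_count) returns True."""
    runs = 0
    while True:
        job()
        runs += 1
        if should_stop(runs):
            # Leave the otherwise infinite loop.
            break
        time.sleep(interval_seconds)
    return runs

# Stop after three runs; for the real task the interval would be 600.
completed = run_until(
    job=lambda: None,
    should_stop=lambda count: count >= 3,
    interval_seconds=0.01,
)
```

The condition could just as well check the clock or the existence of a file; the count is only the simplest thing to show.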

CodePudding user response：

If you are not limited to a pure Python implementation, a simple solution is to use Windows Task Scheduler to execute the script every 10 minutes.

Please refer to the following topic: Run a task every x-minutes with Windows Task Scheduler