I have a script that runs API calls against a Flask app. I want to create a pandas DataFrame with the status code and the elapsed time of each request, which I can then write to a CSV file. My problem is that only one entry ends up in the CSV file and I don't know why. The headers should be "statuscode" and "elapsed time". When I print the statuscode and elapsedtime variables, every response is printed, not just one.
With this CSV file I want to create a graph to visualize the response times.
I tried to write the "write_df" function but ended up using the variables from the request directly in the "send_api_request" function.
import requests
import datetime
import concurrent.futures
import csv
import pandas as pd
HOST = 'http://127.0.0.1:5000'
API_PATH = '/'
ENDPOINT = HOST + API_PATH
MAX_THREADS = 8
CONCURRENT_THREADS = 10
csv_path = "flasktests.csv"
try:
    file = open(csv_path, 'w', newline='')
    writer = csv.writer(file)
except:
    print("error opening or writing to the CSV file!")
def send_api_request():
    try:
        #print ('Sending API request: ', ENDPOINT)
        r = requests.get(ENDPOINT)
        if r.status_code == 200:
            #print('Received: ', r.status_code, r.elapsed)
            responses = {"statuscode":[r.status_code], "elapsed time": [r.elapsed]}
            statuscode = r.status_code
            elapsedtime = r.elapsed
            print(statuscode, elapsedtime)
            df = pd.DataFrame([statuscode,elapsedtime], columns=["statuscode","elapsed time"])
            df.to_csv(csv_path, index=False)
        elif r.status_code == 417:
            print('Received error code:', r.status_code, r.json())
    except Exception as e:
        print("error",str(e))
def write_df(statuscode, elapsedtime):
    print(statuscode,elapsedtime)
    df = pd.DataFrame({"statuscode":[statuscode], "elapsed time": [elapsedtime]})
    df.to_csv(csv_path, index=False)
    print(df)
with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_THREADS) as executor:
    futures = [executor.submit(send_api_request) for x in range (CONCURRENT_THREADS)]
    executor.shutdown(wait=True)
Any ideas what I am doing wrong here? Thank you!
CodePudding user response:
df = pd.DataFrame([statuscode,elapsedtime], columns=["statuscode","elapsed time"])
df.to_csv(csv_path, index=False)
Here, you are writing a CSV with only one entry, because your DataFrame df only has one row.
And every time this function is executed, you overwrite the existing CSV. This is why the output file only contains one line (two if you count the header, which should be there).
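To see the overwrite in isolation, here is a minimal sketch. The temporary file and the sample status codes are made up for the demo and are not taken from your script. Three one-row DataFrames are written to the same path, and only the last one is left in the file:

```python
import os
import tempfile

import pandas as pd

# Throwaway file so the demo does not touch flasktests.csv
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Made-up sample values standing in for three API responses
for code, seconds in [(200, 0.01), (404, 0.02), (503, 0.03)]:
    one_row = pd.DataFrame({"statuscode": [code], "elapsed time": [seconds]})
    # to_csv opens the file in mode "w" by default, replacing earlier content
    one_row.to_csv(path, index=False)

result = pd.read_csv(path)
print(result)  # only the row written last is still in the file
```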
I see you are trying to use multithreading, which is probably not needed. If you do want to keep using it, you will need to make sure that two threads never write to the file at the same time, which adds a lot of overhead. Instead, I would suggest you do something like this:
def send_api_request():
    try:
        #print ('Sending API request: ', ENDPOINT)
        r = requests.get(ENDPOINT)
        if r.status_code == 200:
            #print('Received: ', r.status_code, r.elapsed)
            return r.status_code, r.elapsed
        elif r.status_code == 417:
            print('Received error code:', r.status_code, r.json())
        # any non-200 status is reported without an elapsed time
        return r.status_code, None
    except Exception as e:
        print("error",str(e))
        # always return a pair so the caller can unpack the result
        return None, None
# number of times you want to call the API
nb_api_calls = 10
codes = []
elapsed_times = []
for i in range(nb_api_calls):
    code, elapsed_time = send_api_request()
    if elapsed_time is None:
        # An error happened, choose what to do with that information
        # Here I am just skipping it
        pass
    else:
        codes.append(code)
        elapsed_times.append(elapsed_time)
# build the DataFrame once, after all calls, and write the file a single time
df = pd.DataFrame({"statuscode": codes,"elapsed time": elapsed_times})
df.to_csv(csv_path, index=False)
Note that, with the current configuration, the "statuscode" column will only contain the value 200, repeated once per successful call.
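If you do want to keep the thread pool, one possible approach is to have each worker return its result and build the DataFrame once in the main thread, so no two threads ever touch the file. This is only a sketch under some assumptions: collect_results, main, the timeout value and the conversion of elapsed to seconds via total_seconds() are my own choices and not part of your script. Unlike the loop above, it keeps every status code that comes back with a response.

```python
import concurrent.futures

import pandas as pd

ENDPOINT = "http://127.0.0.1:5000/"  # HOST + API_PATH from the question
CSV_PATH = "flasktests.csv"


def send_api_request():
    """Return (status_code, elapsed timedelta), or (None, None) if the request fails."""
    import requests  # local import: collect_results below does not need it

    try:
        r = requests.get(ENDPOINT, timeout=10)  # timeout value is an assumption
    except requests.RequestException as e:
        print("error", e)
        return None, None
    return r.status_code, r.elapsed


def collect_results(fetch, n_calls, max_workers=8):
    """Call fetch() n_calls times on a thread pool and return one DataFrame."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(fetch) for _ in range(n_calls)]
        # result() blocks until the corresponding call has finished
        results = [f.result() for f in futures]
    rows = [
        # store elapsed as float seconds, which is easier to plot later
        {"statuscode": code, "elapsed time": elapsed.total_seconds()}
        for code, elapsed in results
        if code is not None and elapsed is not None  # skip failed calls
    ]
    return pd.DataFrame(rows, columns=["statuscode", "elapsed time"])


def main():
    df = collect_results(send_api_request, n_calls=10)
    df.to_csv(CSV_PATH, index=False)  # written once, from the main thread
    print(df)
```

Calling main() with the Flask app running should produce one CSV row per completed request. Because collect_results takes the request function as an argument, you can also pass a fake function to try it without a server.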