I have a Flask API that should receive a request, do some work with Selenium, and then return something to the user.
VERSION 1
import os
import flask
from multiprocessing import Value
from selenium import webdriver

app = flask.Flask(__name__)
app.config["DEBUG"] = False

@app.route('/api/url', methods=['GET', 'POST'])
def site():
    # do some stuff ...
    # every request writes to the same path
    with open(os.path.normpath(os.getcwd() + "\\file.mp3"), 'wb') as f:
        f.write(<something>)
    # ... do some other stuff and return a variable

def home():
    return "<h1>Distant Reading Archive</h1><p>This site is a prototype API for distant reading of science fiction novels.</p>"

app.run(host="0.0.0.0", threaded=True)
This code would work, except that I need to handle multiple requests at once: if two requests are processed at the same time, the second request overwrites the file created by the first, and that is a problem.
VERSION 2
To prevent the threads from overwriting the same file, I changed the code to include a counter variable that acts as a request counter (taken from the answer to "Increment counter for every access to a Flask view").
The following code creates a file named file<n>.mp3 for each request, where <n> is the number of the request (the <n>-th request), so each request gets its own file.
import os
import flask
from multiprocessing import Value
from selenium import webdriver

app = flask.Flask(__name__)
app.config["DEBUG"] = False

counter = Value('i', 0)  # shared request counter

@app.route('/api/url', methods=['GET', 'POST'])
def site():
    with counter.get_lock():
        counter.value += 1
        out = counter.value
        # do some stuff ...
        with open(os.path.normpath(os.getcwd() + "\\file{}.mp3".format(out)), 'wb') as f:
            f.write(<something>)
        # ... do some other stuff and return a variable

def home():
    return "<h1>Distant Reading Archive</h1><p>This site is a prototype API for distant reading of science fiction novels.</p>"

app.run(host="0.0.0.0", threaded=True)
PROBLEMS WITH VERSION 2
The problem with version 2 is that multithreading apparently stops working: the code processes only one request at a time.
This seems to be caused by the counter, because in version 1 multithreading did work and the API processed the requests concurrently (although the code still did not behave as I wanted because of the overwriting problem).
QUESTION
How can I count all the requests and still process each of them with its own file, while multiple requests are being processed at once?
CodePudding user response:
The problem is that you have too much code wrapped inside with counter.get_lock(). For as long as one thread holds this lock, every other thread that reaches that line has to wait, so the requests are processed one after another.
Instead you want:
with counter.get_lock():
    counter.value += 1
    out = counter.value
# do the rest of the work outside of the lock
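To see the effect outside of Flask, here is a minimal sketch that uses plain threads in place of requests. The names next_request_id, handle_request and run_demo are hypothetical and not part of the question's code, and the temporary output directory is only there to keep the demo self-contained. The lock is held just long enough to increment and read the counter; the file write happens after it is released.

```python
import os
import tempfile
import threading
from multiprocessing import Value

counter = Value('i', 0)  # shared request counter, as in the question


def next_request_id():
    """Increment the counter under its lock and return the new value."""
    with counter.get_lock():
        counter.value += 1
        return counter.value


def handle_request(out_dir, payload):
    """Hypothetical stand-in for the body of site()."""
    out = next_request_id()  # the lock is already released when this returns
    path = os.path.join(out_dir, "file{}.mp3".format(out))
    with open(path, 'wb') as f:  # slow work runs without holding the lock
        f.write(payload)
    return path


def run_demo(n_requests=8):
    """Simulate n_requests concurrent requests with plain threads."""
    out_dir = tempfile.mkdtemp()
    paths = []

    def worker():
        paths.append(handle_request(out_dir, b"demo"))

    threads = [threading.Thread(target=worker) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(paths)
```

Calling run_demo(8) should produce eight distinct file names, one per simulated request, because each thread gets its own counter value before it starts writing.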
CodePudding user response:
As an alternative to my previous answer: your code above doesn't make clear what the purpose of the file is. Does it need to survive the call to /api/url, or is it only temporary storage? If it is temporary, the tempfile library might be a safer alternative to what you're trying.
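A minimal sketch of the tempfile approach, assuming the file is only needed while the request is being handled. The helper names write_temp_audio and process are hypothetical, and process just reads the file back as a placeholder for the real work. tempfile.NamedTemporaryFile picks a unique name for every call, so no counter or lock is needed.

```python
import os
import tempfile


def write_temp_audio(payload):
    """Write payload to a uniquely named temporary .mp3 and return its path."""
    # delete=False keeps the file on disk after it is closed, so later steps
    # can reopen it by path; the caller is then responsible for removing it.
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as f:
        f.write(payload)
        return f.name


def process(payload):
    """Hypothetical request body: write the file, use it, then clean up."""
    path = write_temp_audio(payload)
    try:
        with open(path, 'rb') as f:
            return len(f.read())  # placeholder for the real processing
    finally:
        os.remove(path)  # remove the temporary file once the request is done
```

If the file has to outlive the request, the counter from the question (or some other unique name) is still needed, since temporary files are meant to be cleaned up.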