I have a list of airports I would like to get flight information for from an API. The API only allows users to search one airport at a time. I tried to create a loop that would iterate over the list of airports, but it was unsuccessful. I have 5,000 airports I need to pass to the API; below is a small sample list for example purposes.
import requests
import pandas as pd

apiKey = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
apiUrl = 'https://xxxxxxxxxxxxxxxxxxx.com/xxxxxxx/'
auth_header = {'x-apikey': apiKey}
airports = [['HLZN'], ['HLLQ'], ['HLLB'], ['HLGT'], ['HLMS'], ['HLLS'], ['HLTQ'], ['HLLT'], ['HLLM']]
payload = {'max_pages': 1}
# Create an empty list to hold responses
json_responses = []
# Iterate through list
for airport in airports:
    response = requests.get(apiUrl + f"airports/{airports}/flights", params=payload,
                            headers=auth_header)
    if response.status_code == 200:
        print(response.json())
    else:
        print("Error executing request")
    results = response.json
    # This presumes the JSON response is a dict; if the response is a list, use extend instead of append
    json_responses.append(response.json())
# Create a DataFrame from a list containing all of the gathered responses.
all_acct_df = pd.DataFrame(json_responses)
The error I get is: "Error Parsing Arguments Parsing_ERROR Invalid argument ID supplied 400"
I tried passing a single airport ID through this looping code and it goes through, but it does not work when iterating over the list.
I'm new to looping and APIs, so any help would be greatly appreciated. Thank you.
CodePudding user response:
If "the API only allows users to search one airport at a time", then you have to call it once per airport; there is no hack around that. To speed things up, though, you may want to run the calls concurrently with a ProcessPoolExecutor, or hand them to a task queue manager like Celery.
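import concurrent.futures
import requests

def get_response(airport):
    # Call the API for a single airport code and return the parsed JSON
    res = requests.get(apiUrl + f"airports/{airport}/flights", params=payload, headers=auth_header)
    return res.json()

if __name__ == "__main__":
    # Create the pool once, outside the loop, and submit one task per airport
    with concurrent.futures.ProcessPoolExecutor(max_workers=50) as executor:
        futures = [executor.submit(get_response, airport[0]) for airport in airports]
        # Worker processes cannot append to a list in the parent, so collect the return values
        responses = [future.result() for future in futures]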
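Because the work here is mostly waiting on HTTP responses rather than using the CPU, a thread pool is a common alternative to a process pool. The following is only a minimal sketch of that variation: `fetch_all` and `fake_fetch` are hypothetical names, and `fake_fetch` is a stand-in for the real API call so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(codes, fetch, max_workers=8):
    """Run fetch(code) for every code on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, codes))


def fake_fetch(code):
    # Hypothetical stand-in for the real HTTP request
    return {"airport": code, "flights": []}


airports = [['HLZN'], ['HLLQ'], ['HLLB']]
codes = [airport[0] for airport in airports]  # unwrap the one-element lists
responses = fetch_all(codes, fake_fetch)
print(responses)
```

With 5,000 airports it is probably worth keeping `max_workers` modest and checking whether the API enforces a rate limit before raising it.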
CodePudding user response:
You are using airports instead of airport[0] in the URL, so the f-string puts the entire list of airports into the request path instead of a single airport code.
This should fix it:
response = requests.get(apiUrl + f"airports/{airport[0]}/flights", params=payload, headers=auth_header)
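For a run over thousands of airports, it may also help to keep going when a single request fails and to remember which codes failed. Here is one possible sketch under that assumption. The names `collect_flights`, `make_fetch` and `fake_fetch` are made up for illustration, and the demonstration at the bottom uses a stand-in function instead of the real API.

```python
def collect_flights(airports, fetch):
    """Call fetch(code) once per airport and return (results, failed_codes)."""
    results, failed = [], []
    for airport in airports:
        code = airport[0]  # each entry is a one-element list such as ['HLZN']
        try:
            results.append(fetch(code))
        except Exception as exc:
            # Record the failure and continue instead of aborting the whole run
            print(f"Request for {code} failed: {exc}")
            failed.append(code)
    return results, failed


def make_fetch(api_url, headers, params):
    """Build a fetch(code) function that calls the API through requests."""
    import requests  # third-party; imported here so collect_flights can be tried without it

    def fetch(code):
        response = requests.get(api_url + f"airports/{code}/flights",
                                params=params, headers=headers, timeout=30)
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
        return response.json()

    return fetch


# Offline demonstration with a stand-in fetch function (no network needed)
def fake_fetch(code):
    if code == "HLLQ":
        raise ValueError("simulated 400 response")
    return {"airport": code}


results, failed = collect_flights([['HLZN'], ['HLLQ'], ['HLLB']], fake_fetch)
print(results)
print(failed)
```

Against the real API, the call would look something like `collect_flights(airports, make_fetch(apiUrl, auth_header, payload))`, and the resulting list can then be passed to `pd.DataFrame` as in the question.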
Explanation
In your original code, the issue is that you are using the airports list instead of the airport loop variable in the requests.get() call. To illustrate the difference:
# Sample list of airports
airports = [['HLZN'], ['HLLQ'], ['HLLB'], ['HLGT'], ['HLMS'], ['HLLS'], ['HLTQ'], ['HLLT'], ['HLLM']]
# Iterate through the list of airports
for airport in airports:
    print(airport)
Output:
['HLZN']
['HLLQ']
['HLLB']
['HLGT']
['HLMS']
['HLLS']
['HLTQ']
['HLLT']
['HLLM']
Why are we using [0] when sending the request? Because the API doesn't accept a list. Each airport in the loop is itself a one-element list such as ['HLZN'], so airport[0] takes its first element and gives us the plain string 'HLZN'.
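To see what each version actually puts into the URL, the paths can be built and printed without sending any request. This is just an illustrative sketch using a shortened sample list; the variable names `wrong_path`, `right_paths` and `codes` are made up for the example.

```python
airports = [['HLZN'], ['HLLQ'], ['HLLB']]

# Interpolating the whole list puts its printed form into the path
wrong_path = f"airports/{airports}/flights"
print(wrong_path)

# Interpolating the first element of each inner list gives a clean airport code
right_paths = [f"airports/{airport[0]}/flights" for airport in airports]
print(right_paths)

# Flattening once up front removes the need for [0] inside the loop
codes = [airport[0] for airport in airports]
print(codes)
```

The first path contains brackets and quotes, which is consistent with the API rejecting it as an invalid ID.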