I want to find out which French university libraries hold three books (one URL per book). I fetch the JSON data for these three URLs, but the results do not all have the same shape: if a book is held by several French libraries, the "library" field is a list, but if it is held by only one library, the field is a single dictionary. This breaks the DataFrame I am trying to build.
This is my list of URLs:
urls
['https://www.sudoc.fr/services/multiwhere/23284545X&format=text/json',
'https://www.sudoc.fr/services/multiwhere/056068646&format=text/json',
'https://www.sudoc.fr/services/multiwhere/244974632&format=text/json']
This is my loop:
data=[]
for u in urls:
    req=requests.get(u)
    wb=req.json()["sudoc"]["query"]["result"]["library"]
    data.append(wb)
data2=pd.DataFrame(data).stack().apply(pd.Series)
data2
This is what I get:
     0          latitude    longitude  rcr        shortname
0 0  NaN        48.5871803  7.7551573  674821001  STRASBOURG-BNU
  1  NaN        48.5789749  7.7651191  674822225  STRASBOURG-Orientales
  2  NaN        48.8492618  2.3433311  751052105  PARIS-BIS, Fonds général
  3  NaN        48.8467139  2.3463854  751052116  PARIS-Bib. Sainte Geneviève
  4  NaN        48.8274879  2.3761096  751132108  PARIS-BULAC
1 0  NaN        48.5871803  7.7551573  674821001  STRASBOURG-BNU
  1  NaN        48.8274879  2.3761096  751052201  PARIS-BULAC-IEI J. Darmesteter
  2  NaN        48.846328   2.351046   751055408  PARIS-Bib. Société asiatique
2 0  rcr        NaN         NaN        NaN        NaN
  1  latitude   NaN         NaN        NaN        NaN
  2  shortname  NaN         NaN        NaN        NaN
  3  longitude  NaN         NaN        NaN        NaN
It doesn't work for the last book because its JSON result is a single dictionary rather than a list like the other two, so pandas ends up with the dictionary's keys instead of its values.
Could you help me with that?
Thanks! :)
CodePudding user response:
I don't know exactly what the DataFrame should look like in the end, but if you wrap the dict in a list, the last book should hopefully be handled just like the other two :)
import requests
import pandas as pd

urls = ['https://www.sudoc.fr/services/multiwhere/23284545X&format=text/json',
        'https://www.sudoc.fr/services/multiwhere/056068646&format=text/json',
        'https://www.sudoc.fr/services/multiwhere/244974632&format=text/json']

data = []
for url in urls:
    response = requests.get(url)
    wb = response.json()["sudoc"]["query"]["result"]["library"]
    # A book held by a single library comes back as a dict, not a list
    if isinstance(wb, dict):
        wb = [wb]
    data.append(wb)

data2 = pd.DataFrame(data).stack().apply(pd.Series)
print(data2)
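If what you actually want is one flat row per (book, library) pair, here is a minimal sketch of the same wrapping idea without the stack/apply step. The helper name as_list, the book_id key and the sample payloads are my own assumptions: the values are copied from the output in the question purely for illustration, and which library holds the third book is made up. In real use the values would come from response.json() as in the loop above.

```python
def as_list(value):
    """Wrap a single dict in a list; return anything else unchanged."""
    return [value] if isinstance(value, dict) else value


# Illustrative stand-ins for the "library" field of two responses
# (hypothetical data, not fetched from the service).
sample = {
    "23284545X": [
        {"rcr": "674821001", "shortname": "STRASBOURG-BNU",
         "latitude": "48.5871803", "longitude": "7.7551573"},
        {"rcr": "751132108", "shortname": "PARIS-BULAC",
         "latitude": "48.8274879", "longitude": "2.3761096"},
    ],
    # Hypothetical single-library case: a dict instead of a list
    "244974632": {"rcr": "674821001", "shortname": "STRASBOURG-BNU",
                  "latitude": "48.5871803", "longitude": "7.7551573"},
}

rows = []
for book_id, library in sample.items():
    # After as_list, both shapes can be iterated the same way
    for entry in as_list(library):
        rows.append({"book_id": book_id, **entry})

for row in rows:
    print(row["book_id"], row["rcr"], row["shortname"])
```

Passing rows to pd.DataFrame then gives one row per library, with book_id as an ordinary column, so you can still tell which book each library belongs to.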