Hope whoever is reading this is well.
I am trying to extract a table of viscosity data for pure (single-component) ionic liquids, along with the conditions each value was measured at, from the NIST ILThermo website. I am using the code below, written by a user called HedgeHog. However, it overwrites itself: instead of showing all the different temperatures and their viscosities, it shows the last temperature and viscosity in every row of the table.
Here is the code:
import requests
import pandas as pd

prop = 'jVUM'
url = f'https://ilthermo.boulder.nist.gov/ILT2/ilsearch?cmp=&ncmp=1&year=&auth=&keyw=&prp={prop}'
ref_data = requests.get(url).json()
# This line makes an HTTP GET request to the API endpoint specified by the url variable.
# The response is converted to a JSON object and stored in the ref_data variable.
data = []
# This line initializes an empty list data that will be used to store the final processed data.
for e in ref_data['res'][:1]:
    # This line starts a for loop that iterates through the elements of the res key of the
    # ref_data JSON object (remove the [:1] slice to iterate over all of them).
    # The variable e holds the value of each iteration.
    d = dict(zip(ref_data['header'], e))
    # This line creates a dictionary d by zipping the header key of the ref_data JSON object with the value of each
    # iteration e using the zip function.
    set_data = requests.get(f"https://ilthermo.boulder.nist.gov/ILT2/ilset?set={d['setid']}").json()
    # This line makes another HTTP GET request to retrieve additional data for each setid. The response is converted
    # to a JSON object and stored in the set_data variable. The setid is passed as a query parameter in the URL.
    header = [item for items in set_data['dhead'] for item in items if item and item != 'Liquid']
    header.append('Liquid')
    # These lines create a header by flattening the dhead key of the set_data JSON object and appending the string 'Liquid' to it.
    for x in [[item for items in sublist for item in items] for sublist in set_data['data']]:
        # This line starts another for loop that iterates through the data in the data key of the set_data JSON object.
        # The variable x holds the value of each iteration.
        d.update(
            dict(
                zip(header, x)
            )
        )
        data.append(d)
        # The update call above updates the d dictionary by zipping the header with the value of each iteration x using the
        # zip function.
pd.DataFrame(data)
And here is the output
     setid  ref                     prp        phases  cmp1    cmp2  cmp3  np   nm1                                      Temperature, K  Pressure, kPa  Viscosity, Pa•s  Liquid
0    dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
1    dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
2    dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
3    dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
4    dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
...  ...    ...                     ...        ...     ...     ...   ...   ...  ...                                      ...             ...            ...              ...
495  dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
496  dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
497  dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
498  dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
499  dQYEM  Safarov et al. (2021b)  Viscosity  Liquid  AAiERH  None  None  500  1-ethyl-3-methylimidazolium dicyanamide  453.193         101.325        0.00744          0.00018
500 rows × 13 columns
The output is not easy to read, but as you can see, the temperature 453.193 K is the final temperature in the table I am extracting data from. It is repeated in every row, so the final viscosity conditions appear 500 times instead of the full range of values.
I think the error is in the x loop, but I can't seem to figure it out. If anyone has any advice that would be great.
Thanks :)
CodePudding user response:
Every row in your observations (the result of your second API request) needs to have its data appended to a list. The way I would do that is with setdefault().
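As a quick toy illustration of that pattern (made-up values, not ILThermo data; the names `ref_row` and `observations` are only stand-ins), setdefault creates an empty list the first time a key is seen and returns the existing list afterwards, so each append accumulates instead of overwriting:

```python
# Toy stand-ins for one reference row and its observations (made-up values).
ref_row = {"setid": "demo"}
observations = [
    {"Temperature, K": 300.0, "Viscosity, Pa•s": 0.05},
    {"Temperature, K": 310.0, "Viscosity, Pa•s": 0.04},
]

for observation in observations:
    for key, value in observation.items():
        # Create the list on first sight of the key, then append to it.
        ref_row.setdefault(key, []).append(value)

print(ref_row)
```

The full script below applies the same idea to each data set returned by the site.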
import requests
import pandas

prop = 'jVUM'
url = f'https://ilthermo.boulder.nist.gov/ILT2/ilsearch?cmp=&ncmp=1&year=&auth=&keyw=&prp={prop}'
ref_data = requests.get(url).json()
ref_headers = ref_data['header']
ref_rows = ref_data['res'][:5]  ## test with 5 rows max
ref_data = [dict(zip(ref_headers, row)) for row in ref_rows]

for ref_row in ref_data:
    set_data = requests.get(f"https://ilthermo.boulder.nist.gov/ILT2/ilset?set={ref_row['setid']}").json()
    set_headers = [
        item
        for rows in set_data['dhead']
        for item in rows
        if item
    ]
    set_rows = [
        [
            cell
            for item in rows
            for cell in item
            if cell
        ]
        for rows in set_data['data'][:5]  ## test with 5 rows max
    ]
    set_data = [zip(set_headers, row) for row in set_rows]
    for observation in set_data:
        for key, value in observation:
            ref_row.setdefault(key, []).append(value)

df = pandas.DataFrame(ref_data)
print(df)
That gives me:
   setid  ref                     prp        ...  Pressure, kPa                                  Viscosity, Pa•s                           Liquid
0  dQYEM  Safarov et al. (2021b)  Viscosity  ...  [101.325, 101.325, 101.325, 101.325, 101.325]  [0.0627, 0.0625, 0.0605, 0.0604, 0.0635]  [0.0014, 0.0014, 0.0014, 0.0014, 0.0014]
1  oyqwN  Safarov et al. (2018c)  Viscosity  ...  [101.325, 101.325, 101.325, 101.325, 101.325]  [8.278, 7.97, 7.82, 7.509, 6.771]         [0.88, 0.83, 0.81, 0.78, 0.69]
2  jOYXL  Safarov et al. (2017a)  Viscosity  ...  [100, 100, 100, 100, 100]                      [0.797, 0.784, 0.745, 0.736, 0.688]       [0.045, 0.044, 0.041, 0.04, 0.037]
3  dRKZY  Sequeira et al. (2020)  Viscosity  ...  [210, 210, 210, 210, 210]                      [0.0432, 0.0433, 0.0433, 0.0434, 0.0434]  [0.0018, 0.0018, 0.0018, 0.0018, 0.0018]
4  rpzLs  Safarov et al. (2022)   Viscosity  ...  [101.325, 101.325, 101.325, 101.325, 101.325]  [0.1108, 0.1076, 0.1111, 0.1087, 0.0999]  [0.009, 0.0084, 0.0091, 0.0086, 0.0071]
I hope that is what you are after.
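If you would rather have one row per measurement than a list in each cell, a minimal sketch of that variant follows. It uses made-up values in place of the live ILThermo responses, and `expand_observations` is just a name I picked for illustration. The key point is that a new dict is built for every observation. The code in the question kept updating and appending the same `d`, so every entry in `data` referred to one dict that ended up holding the last values.

```python
import pandas as pd


def expand_observations(ref_row, header, rows):
    """Return one new dict per observation, each carrying the reference fields."""
    # {**a, **b} builds a fresh dict on every iteration, so earlier rows are
    # not changed by later ones (unlike appending the same dict repeatedly).
    return [{**ref_row, **dict(zip(header, row))} for row in rows]


# Made-up stand-ins for one search result and its flattened data set.
ref_row = {"setid": "demo", "ref": "placeholder reference"}
header = ["Temperature, K", "Pressure, kPa", "Viscosity, Pa•s"]
rows = [
    [293.15, 101.325, 0.060],
    [303.15, 101.325, 0.045],
    [313.15, 101.325, 0.035],
]

df = pd.DataFrame(expand_observations(ref_row, header, rows))
print(df)
```

In the real script, `header` and `rows` would come from the flattened `dhead` and `data` keys of each data set, assuming they line up the way they do in the code above. If you already have the list-valued frame, `DataFrame.explode` with a list of column names (supported from pandas 1.3) should get you to a similar long format.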