My goal is to get the PQRI table (second table of the two listed) from this
But when I make the request I only get the following response:
"<c_start></c_start><c_total></c_total>getPQRIData: No base column '0'\u003cbr\u003e\u000a"
Any idea what I need to change to get the desired output?
CodePudding user response:
You can't send that form data as a dictionary/json. Send it as a string and it should work:
import pandas as pd
import requests
s = requests.Session()
s.get('https://apps.usp.org/app/USPNF/columnsDB.html')
cookies = s.cookies.get_dict()
cookieStr = ''
for k,v in cookies.items():
cookieStr = f'{k}={v};'
url = "https://apps.usp.org/ajax/USPNF/columnsDB.php"
headers = {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9",
"Connection": "keep-alive",
"Content-Length": "201",
"Content-Type": "application/x-www-form-urlencoded",
"Cookie": cookieStr,
"Host": "apps.usp.org",
"Origin": "https://apps.usp.org",
"Referer": "https://apps.usp.org/app/USPNF/columnsDB.html",
"sec-ch-ua": "Not A;Brand ;v=99, Chromium;v=99, Google Chrome;v=99",
"sec-ch-ua-mobile" : "?0",
"sec-ch-ua-platform": "Windows",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin",
"User-Agent": "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.141 Safari/537.36",
"X-Powered-By": "CPAINT v2.1.0 :: http://sf.net/projects/cpaint",
}
final_df = pd.DataFrame()
nextPage = True
page = 0
while nextPage == True:
i = page*10
payload = f'cpaint_function=updatePQRIResults&cpaint_argument[]=Acclaim 120 C18&cpaint_argument[]=1&cpaint_argument[]=0&cpaint_argument[]=0&cpaint_argument[]=2.8&cpaint_argument[]={i}&cpaint_response_type=OBJECT'
response = s.post(url, data=payload, headers=headers).text
df = pd.read_xml(response).iloc[3:-1,3:]
if (df.iloc[0]['psr'] == 0) and (len(df) == 1):
nextPage = False
final_df = final_df.drop_duplicates().reset_index(drop=True)
print('Complete')
else:
final_df = pd.concat([final_df, df], axis=0)
print(f'Page: {page 1}')
page =1
Output:
print(final_df)
psr psf psn ... psvb psvc28 psvc70
0 0.0 0.00 Acclaim 120 C18 ... -0.027 0.086 -0.002
1 1.0 0.24 TSKgel ODS-100Z ... -0.031 -0.064 -0.161
2 2.0 0.67 Inertsil ODS-3 ... -0.023 -0.474 -0.334
3 3.0 0.74 LaChrom C18 ... -0.006 -0.278 -0.120
4 4.0 0.80 Prodigy ODS(3) ... -0.012 -0.195 -0.134
.. ... ... ... ... ... ... ...
753 753.0 29.55 Cosmosil 5PYE ... 0.092 0.521 1.318
754 754.0 30.44 BioBasic Phenyl ... 0.217 0.014 0.390
755 755.0 34.56 Microsorb-MV 100 CN ... -0.029 0.148 0.785
756 756.0 41.62 Inertsil ODS-EP ... 0.050 -0.620 -0.070
757 757.0 41.84 Flare C18 ... 0.966 -0.507 1.178
[758 rows x 12 columns]