I am trying to connect two modules in VBA so that the output of the first (GetUrl) feeds into the second (GetData).
GetUrl looks up the dossier URL online for the substance entered in column A, e.g. Acetone;
alternatively, the CAS number in column B can be used (see image below). Note: it currently only looks up the substance in A1 or B1.
Public Function GetUrl() As String
    Const Url = "https://echa.europa.eu/information-on-chemicals/registered-substances?p_p_id=dissregisteredsubstances_WAR_dissregsubsportlet&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_dissregisteredsubstances_WAR_dissregsubsportlet_javax.portlet.action=dissRegisteredSubstancesAction"
    Dim oHtml As HTMLDocument, oHttp As Object, MyDict As Object
    Dim SubstanceName As Variant, CASNumber As Variant, payload As String, DictKey As Variant
    Set oHtml = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    Set MyDict = CreateObject("Scripting.Dictionary")
    SubstanceName = Cells(1, 1)
    CASNumber = Cells(1, 2)
    MyDict("_dissregisteredsubstances_WAR_dissregsubsportlet_disreg_name") = SubstanceName
    MyDict("_dissregisteredsubstances_WAR_dissregsubsportlet_disreg_cas-number") = CASNumber
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimer") = "true"
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimerCheckbox") = "on"
    payload = vbNullString
    'Build the form body as key=value pairs joined with "&"
    For Each DictKey In MyDict
        payload = IIf(Len(payload) = 0, WorksheetFunction.EncodeURL(DictKey) & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey)), _
            payload & "&" & WorksheetFunction.EncodeURL(DictKey) & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey)))
    Next DictKey
    With oHttp
        .Open "POST", Url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .send (payload)
        oHtml.body.innerHTML = .responseText
    End With
    GetUrl = oHtml.querySelector(".details").getAttribute("href")
    Debug.Print oHtml.querySelector(".substanceNameLink ").innerText
    Debug.Print oHtml.querySelector(".details").getAttribute("href")
End Function
When run, this should return:
Acetone https://echa.europa.eu/registration-dossier/-/registered-dossier/15460
GetData uses the URL from GetUrl to return the "DNEL" values:
Sub GetData()
    'Start ECHA Search via XML HTTP Request
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim Route(1 To 3) As String
    Dim c As Long, Info As Object, Data As Object
    Route(1) = "sGeneralPopulationHazardViaInhalationRoute"
    Route(2) = "sGeneralPopulationHazardViaDermalRoute"
    Route(3) = "sGeneralPopulationHazardViaOralRoute"
    XMLReq.Open "Get", GetUrl & "/7/1", False
    XMLReq.send
    If XMLReq.Status <> 200 Then
        MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
        Exit Sub
    End If
    HTMLDoc.body.innerHTML = XMLReq.responseText
    'Loops through each element
    For c = 1 To UBound(Route, 1)
        Set Info = HTMLDoc.getElementById(Route(c))
        Debug.Print Info.innerText
        Set Info = HTMLDoc.getElementById(Route(c)).NextSibling.NextSibling.NextSibling
        Set Data = Info.getElementsByTagName("dd")(0)
        Debug.Print Data.innerText
        Set Data = Info.getElementsByTagName("dd")(1)
        Debug.Print Data.innerText
        'Cells(r, c + 2) = Data.innerText
    Next c
End Sub
For Acetone in Cells(1, 1), this should return:
Acetone
https://echa.europa.eu/registration-dossier/-/registered-dossier/15460
General Population - Hazard via inhalation route
DNEL (Derived No Effect Level)
200 mg/m³
General Population - Hazard via dermal route
DNEL (Derived No Effect Level)
62 mg/kg bw/day
General Population - Hazard via oral route
DNEL (Derived No Effect Level)
62 mg/kg bw/day
Instead of relying only on cell A1, however, I want the whole code to loop over every row that has a substance in column A or column B. In this case the URL for Acetone is found and the corresponding data is pulled, and then the same happens for Oxydipropanol.
Note: in this image, substances can be looked up online using the substance name in column A, the CAS number in column B, or a combination of both.
In trying to connect the two modules, so far I have only been able to get the GetUrl module to cycle through each substance. I have also tried to combine both into one module but can't figure out how to nest the For loops correctly.
A quick Google search says that you can't nest functions in VBA. This makes me wonder whether what I'm doing is even the right approach. But I've seen similar things achieved in the past, so I'm sure it's possible.
Note: when testing, please use the example substances. Using a random chemical, say Benzene, may result in an error because the tox profile for this substance doesn't exist. I still need to implement error handling, but this can be ignored for now.
I will update this post with any further progress. Thanks.
CodePudding user response:
This worked for me:
Sub PopulateExposures()
    Dim url, rw As Range
    Set rw = Sheets("data").Range("A1:E1") 'first row with inputs
    Do While Application.CountA(rw) > 0
        url = SubstanceUrl(rw.Cells(1).Value, rw.Cells(2).Value) 'get the URL
        rw.Cells(3).Resize(1, 3).Value = ExposureData(url) 'get exposure data (as array) and add to row
        Set rw = rw.Offset(1, 0) 'next substance
    Loop
End Sub
Public Function SubstanceUrl(SubstanceName, CASNumber) As String
    Const url = "https://echa.europa.eu/information-on-chemicals/registered-substances?" & _
        "p_p_id=dissregisteredsubstances_WAR_dissregsubsportlet&p_p_lifecycle=1&" & _
        "p_p_state=normal&p_p_mode=view&" & _
        "_dissregisteredsubstances_WAR_dissregsubsportlet_javax.portlet.action=dissRegisteredSubstancesAction"
    Dim oHTML, oHttp, MyDict, payload, DictKey, sep
    Set oHTML = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    Set MyDict = CreateObject("Scripting.Dictionary")
    MyDict("_dissregisteredsubstances_WAR_dissregsubsportlet_disreg_name") = SubstanceName
    MyDict("_dissregisteredsubstances_WAR_dissregsubsportlet_disreg_cas-number") = CASNumber
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimer") = "true"
    MyDict("_disssimplesearchhomepage_WAR_disssearchportlet_disclaimerCheckbox") = "on"
    payload = ""
    'sep is empty on the first pass, so the payload does not start with "&"
    For Each DictKey In MyDict
        payload = payload & sep & DictKey & "=" & WorksheetFunction.EncodeURL(MyDict(DictKey))
        sep = "&"
    Next DictKey
    With oHttp
        .Open "POST", url, False
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
        .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
        .send payload
        oHTML.body.innerHTML = .responseText
    End With
    SubstanceUrl = oHTML.querySelector(".details").getAttribute("href")
End Function
Function ExposureData(urlToGet)
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As HTMLDocument
    Dim Route(1 To 3) As String, Results(1 To 3) As String, c, Info, Data
    Route(1) = "sGeneralPopulationHazardViaInhalationRoute"
    Route(2) = "sGeneralPopulationHazardViaDermalRoute"
    Route(3) = "sGeneralPopulationHazardViaOralRoute"
    XMLReq.Open "Get", urlToGet & "/7/1", False
    XMLReq.send
    If XMLReq.Status <> 200 Then
        Results(1) = "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Else
        Set HTMLDoc = New HTMLDocument
        HTMLDoc.body.innerHTML = XMLReq.responseText
        For c = 1 To UBound(Route, 1)
            Set Info = HTMLDoc.getElementById(Route(c)).NextSibling.NextSibling.NextSibling
            Results(c) = Info.getElementsByTagName("dd")(1).innerText
        Next c
    End If
    ExposureData = Results
End Function
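The question mentions that a substance without a dossier (Benzene, for example) raises an error. Below is a minimal sketch of one way to guard against that, assuming the failure comes from querySelector returning Nothing when the search finds no match. FirstHref is a hypothetical helper, not part of the code above, and it needs the same Microsoft HTML Object Library reference.

```vba
' Hypothetical helper (an assumption, not part of the original answer).
' Returns the href of the first element matching cssSelector,
' or an empty string when no element matches.
Function FirstHref(doc As HTMLDocument, cssSelector As String) As String
    Dim el As Object
    Set el = doc.querySelector(cssSelector)
    If Not el Is Nothing Then
        ' & "" turns a Null (attribute missing) into an empty string
        FirstHref = el.getAttribute("href") & ""
    End If
End Function
```

With this in place, the last line of SubstanceUrl could become SubstanceUrl = FirstHref(oHTML, ".details"), and PopulateExposures could skip the call to ExposureData for any row where the returned URL is empty.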