I'm attempting to get the href of a link from the following site using Power Query:
https://hpvchemicals.oecd.org/ui/SIDS_Details.aspx?id=fc1ced8a-ce14-45fa-b003-dfeda5e38075
On that page, I want to obtain the href for the 50000.pdf link.
Inspecting the page, this should be: handler.axd?id=fae8d1b1-406b-4287-8a05-f81aa1b16d3f
However, when I attempt this in Power Query, the link appears to be omitted from the returned text:
M Code:
let
Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://hpvchemicals.oecd.org/ui/SIDS_Details.aspx?id=fc1ced8a-ce14-45fa-b003-dfeda5e38075"))})
in
Source
My question is: why does this happen? I don't think it can be solved (if it can, great), but I'm still interested in understanding what's going on here.
CodePudding user response:
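The page loads that part of its content in an iframe, so the link is not in the HTML of the outer page that Web.Contents returns. Request the iframe's URL directly instead. Try this: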
let
Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://hpvchemicals.oecd.org/ui/SidsOrganigrame.aspx?SIDSNo=fc1ced8a-ce14-45fa-b003-dfeda5e38075&id=000c31fa-483a-4e5b-a8bb-c26c3148e464&Key=1c143ab1-b132-4b57-b34d-559b07c845f2&Idx=0"))}),
#"Filtered Rows" = Table.SelectRows(Source, each [Column1] = " <img src=""images/FiletypeIcone/htm.png"" height=""16"" width=""16"" border=""0"" /> <a href=""http://www.chem.unep.ch/irptc/sids/OECDSIDS/sidspub.html"" target=""new"">SIAR published by UNEP</a><br /><img src=""images/FiletypeIcone/pdf.ico"" height=""16"" width=""16"" border=""0"" /> <a href=""handler.axd?id=5525377e-1442-43d0-8c76-f8cacfadf8bb"" target=""new"">FORMALDEHYDE_50000.pdf</a><br />")
in
#"Filtered Rows"
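The exact-match filter above stops working as soon as the whitespace or markup on that line changes. As a rough sketch, the variant below keeps any line that mentions the file name and pulls the href into its own column. It assumes the line still contains "50000.pdf" and that the href is wrapped in double quotes; the step names and the "Href" column name are only illustrative.

```powerquery
let
    // URL of the iframe content, taken from the query above
    Url = "https://hpvchemicals.oecd.org/ui/SidsOrganigrame.aspx?SIDSNo=fc1ced8a-ce14-45fa-b003-dfeda5e38075&id=000c31fa-483a-4e5b-a8bb-c26c3148e464&Key=1c143ab1-b132-4b57-b34d-559b07c845f2&Idx=0",
    Source = Table.FromColumns({Lines.FromBinary(Web.Contents(Url))}),
    // Keep every line that mentions the PDF instead of matching the whole line exactly
    PdfRows = Table.SelectRows(Source, each Text.Contains([Column1], "50000.pdf")),
    // The line can hold several links, so look at the text before the file name,
    // take what follows the last href=" in it, and cut at the closing quote
    WithHref = Table.AddColumn(
        PdfRows,
        "Href",
        each
            let
                BeforeName = Text.BeforeDelimiter([Column1], "50000.pdf"),
                AfterHref = Text.AfterDelimiter(BeforeName, "href=""", {0, RelativePosition.FromEnd})
            in
                Text.BeforeDelimiter(AfterHref, """"),
        type text
    )
in
    WithHref
```

The extracted value is relative (handler.axd?id=...), so it would need to be prefixed with https://hpvchemicals.oecd.org/ui/ before being passed to Web.Contents.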