No login required to obtain this file. The website is "https://rigcount.bakerhughes.com/na-rig-count" and the file is the "North America Rotary Rig Count Pivot Table (Feb 2011 - Current)". Right-click, copy link, gives the current file path, "https://rigcount.bakerhughes.com/static-files/cb205922-552b-4f88-954f-665a2b1c731f". Inspecting the file element shows the Xpath with the file path.
I want to create a power query connection that will update the file path name when I perform a weekly scheduled refresh.
I tried using Google Sheets with the formula:
=IMPORTXML(website,Xpath) =IMPORTXML("https://rigcount.bakerhughes.com","/html/body/div[2]/div/div/div/div[2]/article/div/div[2]/div/div/div[2]/div/div/div/table/tbody/tr[2]/td[2]/div/div/article/div/div/div/div/span[1]/a")
Result was an Error: Could not fetch url: https://rigcount.bakerhughes.com
CodePudding user response:
With cURL and a XPath processor like Xidel (available here for Linux/Mac/Win):
curl -s -L 'https://rigcount.bakerhughes.com/na-rig-count' |
xidel --xpath '//a[text()="North America Rotary Rig Count Pivot Table (Feb 2011 - Current)"]/@href'
/static-files/cb205922-552b-4f88-954f-665a2b1c731f
CodePudding user response:
let
Source = Web.BrowserContents("https://rigcount.bakerhughes.com/na-rig-count"),
#"Open links" = Html.Table(
Source
, {{"link text", "a"}, {"url", "a", each[Attributes][href]?}}
, [RowSelector = "a"]
),
#"Locate Current target URL" = Table.SelectRows(
#"Open links"
, each
Text.Contains([#"link text"], "North America Rotary Rig Count Pivot Table")
and Text.Contains([#"link text"], "Current")
)
in
#"Locate Current target URL"