I use Power Query to scrape a webpage that has multiple tables on it. The tables can move around so one week it may be the second table, the next the 3rd, then back to the 2nd the next week. The table data does have the same text in the first column so I can select it if I can find the entry.
My code is as follows. The issue is the Source{2} can be a different number based on the actual page layout.
let
Source = Web.Page(Web.Contents("https://www.eia.gov/naturalgas/weekly/#tabs-supply-1")),
Data0 = Source{2}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","U.S. natural gas supply - ","",Replacer.ReplaceText,{"Column1"}),
RecRepl = Table.ReplaceValue(#"Replaced Value",null, each _[Column1], Replacer.ReplaceValue,{"Column2"}),
#"Removed Columns" = Table.RemoveColumns(RecRepl,{"Column1"}),
#"Removed Top Rows" = Table.Skip(#"Removed Columns",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true])
in
#"Promoted Headers"
How can I search the table in the Data column to find the row I need?
Thanks!!!
I've tried a variety of steps but have been unable to find the table.
CodePudding user response:
try this
This code looks for the phrase Henry Hub in column zero of all the tables in the Data column, and figures out which one has it. It then filters for true and pulls the contents (of the first match). You can adjust what you are looking for in terms of column to search, text to search for, and take it from there
let Source = Web.Page(Web.Contents("https://www.eia.gov/naturalgas/weekly/#tabs-supply-1")),
#"Added Custom" = Table.AddColumn(Source, "Custom", each List.Contains(Table.Column([Data],Table.ColumnNames([Data]){0}),"Henry Hub")),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each [Custom] = true),
Data0 = #"Filtered Rows"{0}[Data]
in Data0