I am trying to web-scrape some data from https://il.water.usgs.gov/gmaps/precip/. I only want specific cells from the row called "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL": only the cells containing the 1-, 3-, and 12-hour predictions for rain. What should I fix?
import pandas as pd
url = "https://il.water.usgs.gov/gmaps/precip/"
df = pd.read_html(url, flavor="bs4")[0]
print(df.loc[df[0] == "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL"])
CodePudding user response:
The data is retrieved dynamically from another endpoint that returns JSON, so the table is not in the HTML that pd.read_html receives. You could write a function that calls that endpoint and takes the location and the desired hours as arguments:
import requests

def get_precipitation(location: str, hrs: list):
    # The page fills its table from this JSON endpoint
    url = "https://il.water.usgs.gov/gmaps/precip/data/rainfall_outIL_WSr2.json"
    r = requests.get(url).json()
    # Take the first item whose title matches the requested gage
    data = [i for i in r['value']['items'] if i['title'] == location][0]
    # Print only the requested precipitation fields
    for k, v in data.items():
        if k in hrs:
            print(f'{k}={v}')

if __name__ == "__main__":
    location = "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL"
    hrs = ['precip1hrvalue', 'precip3hrvalue', 'precip12hrvalue']
    get_precipitation(location, hrs)
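If you would rather get the values back than print them, the lookup can be separated from the HTTP call. The following is only a sketch: extract_precip is a hypothetical helper name, the payload layout ('value' -> 'items' -> 'title') is taken from the code above, and the sample numbers are invented so the example runs without network access.

```python
from typing import Any, Dict, Iterable


def extract_precip(payload: Dict[str, Any], location: str,
                   keys: Iterable[str]) -> Dict[str, Any]:
    """Return the requested fields for the first item titled `location`."""
    # Assumed layout, based on the answer above: payload['value']['items']
    items = payload.get("value", {}).get("items", [])
    match = next((item for item in items if item.get("title") == location), None)
    if match is None:
        # Raised instead of the IndexError that `[...][0]` would produce
        raise LookupError(f"no gage titled {location!r}")
    # Keys missing from the item come back as None
    return {k: match.get(k) for k in keys}


# Invented sample data standing in for the real JSON response
sample = {
    "value": {
        "items": [
            {"title": "SOME OTHER GAGE", "precip1hrvalue": 0.5},
            {
                "title": "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL",
                "precip1hrvalue": 0.0,
                "precip3hrvalue": 0.1,
                "precip12hrvalue": 0.25,
                "precip24hrvalue": 0.4,
            },
        ]
    }
}

result = extract_precip(
    sample,
    "RAIN GAGE AT PING TOM PARK AT CHICAGO, IL",
    ["precip1hrvalue", "precip3hrvalue", "precip12hrvalue"],
)
print(result)
```

With the live data you would pass requests.get(url).json() as the payload in place of sample. An unknown location then raises a LookupError with a readable message rather than an IndexError.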