How to get a link on a website using Python that updates dynamically?-CodePudding

I am trying to download the most recent zip file from the ERCOT Website (https://www.ercot.com/mp/data-products/compliance-and-disclosure/?id=NP3-965-ER). However, the link of the zip file has a doclookup id that changes everytime. The id is also populated dynamically. I have tried using beautifulsoup to get the link, but since it's being loaded dynamically it is not providing any links. Any feedback or solutions will be appreciated. enter image description here

CodePudding user response：

Using the exposed api:

import json

import pandas as pd
import pendulum
import requests


def get_document_id(type_id: int) -> int:
    url = (
        "https://www.ercot.com/misapp/servlets/IceDocListJsonWS?"
        f"reportTypeId={type_id}&"
        f"_={pendulum.now().format('X')}"
    )

    with requests.Session() as request:
        response = request.get(url, timeout=10)
    if response.status_code != 200:
        print(response.raise_for_status())

    data = json.loads(response.text)

    return pd.json_normalize(data=data["ListDocsByRptTypeRes"], record_path="DocumentList").head(1)["Document.DocID"].squeeze()


id_number = get_document_id(13052)
print(id_number)

869234127