Parsing a scrollable = True HTML element in Python

Time:11-09

I am trying to parse the Plans + Pricing information on the webpage below. For the Overview and Ratings tabs, the data is present directly in the HTML and can be scraped. For the Plans + Pricing tab, however, I cannot get the HTML to render, so I cannot scrape the table.

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/esri.arcgis-m365?tab=PlansAndPrice

When I use BeautifulSoup:

tabelements = soup.find('div', {'class': 'tabContent'})
for eachel in tabelements:
    print(eachel.text)

This prints only "loading..." as the text.

I am not sure how to get the content of this scrollable table.

Answer:

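The table is loaded dynamically: the page fetches it from a separate JSON endpoint after the initial HTML arrives, which is why BeautifulSoup sees only the "loading..." placeholder.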
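
The endpoint used in the code further down embeds the app ID from the listing URL. As a rough sketch, the pricing URL could be derived from the page URL like this. The helper name `pricing_url_from_page`, the `market` parameter, and the idea that the same URL pattern holds for other listings are all assumptions inferred from this one example, not documented behaviour:

```python
from urllib.parse import urlparse

# Assumption: other listings follow the same pattern as the one URL in this
# answer. The "us" path segment is treated as a market code, which is a guess.
PRICING_URL_TEMPLATE = (
    "https://azuremarketplace.microsoft.com/view/appPricing/"
    "{app_id}/{market}?ReviewsMyCommentsFilter=true"
)


def pricing_url_from_page(page_url: str, market: str = "us") -> str:
    """Build the pricing JSON URL from a listing URL (hypothetical helper)."""
    path_parts = [part for part in urlparse(page_url).path.split("/") if part]
    # The app ID is the path segment right after "apps",
    # e.g. .../marketplace/apps/esri.arcgis-m365
    try:
        app_id = path_parts[path_parts.index("apps") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no app ID found in {page_url!r}") from None
    return PRICING_URL_TEMPLATE.format(app_id=app_id, market=market)


page = "https://azuremarketplace.microsoft.com/en-us/marketplace/apps/esri.arcgis-m365?tab=PlansAndPrice"
print(pricing_url_from_page(page))
```

If the site changes its URL scheme, the browser's network tab is the place to look for the current request.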

To get the data, try this:

import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"
}

# JSON endpoint that serves the pricing data for this listing
url = "https://azuremarketplace.microsoft.com/view/appPricing/esri.arcgis-m365/us?ReviewsMyCommentsFilter=true"
skus = requests.get(url, headers=headers).json()["skus"]
for sku in skus:
    print(f'{sku["id"]}\n{sku["title"]}')
    # Use the first listed term price for each SKU
    price = sku["termPrices"][0]
    print(f'{price["value"]} {sku["currencyCode"]}/ {price["unit"]}')
    print("-" * 40)

Output:

annualpurchase-arcgism365-1creator
..Individual Contributor (1 power user)
500 USD/ Year
----------------------------------------
annualpurchase-arcgism365-1creator10viewer
..Small Team (1 power user, 10 readers)
1500 USD/ Year
----------------------------------------
annualpurchase-arcgism365-5creator50viewer
.Medium Team (5 power users, 50 readers)
7500 USD/ Year
----------------------------------------
annualpurchase-arcgism365-10creator100viewer
Large Team (10 power users, 100 readers)
14500 USD/ Year
----------------------------------------
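
The loop above indexes `termPrices[0]` directly, which would fail for a SKU without term prices. A minimal sketch of a more defensive version follows, kept separate from the network call so it can be tried offline. `describe_sku` is a hypothetical helper, and the sample payload is hand-written to mirror only the keys used above (the second SKU is invented purely to show the fallback); the real response may have more fields or a different shape:

```python
def describe_sku(sku: dict) -> str:
    """Return the three-line summary for one SKU (hypothetical helper)."""
    term_prices = sku.get("termPrices") or []
    if term_prices:
        first = term_prices[0]
        price_line = f'{first["value"]} {sku.get("currencyCode", "")}/ {first["unit"]}'
    else:
        # Fallback text chosen for this sketch, not something the site returns.
        price_line = "no term price listed"
    return f'{sku["id"]}\n{sku["title"]}\n{price_line}'


# Hand-written sample, not a captured response.
sample_payload = {
    "skus": [
        {
            "id": "annualpurchase-arcgism365-1creator",
            "title": "..Individual Contributor (1 power user)",
            "currencyCode": "USD",
            "termPrices": [{"value": 500, "unit": "Year"}],
        },
        {"id": "example-sku-without-prices", "title": "Example", "termPrices": []},
    ]
}

for sku in sample_payload.get("skus", []):
    print(describe_sku(sku))
    print("-" * 40)
```

With a live response, `sample_payload` would simply be replaced by the result of `requests.get(url, headers=headers).json()`.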

If you want to rebuild the table, try this:

import requests
from tabulate import tabulate

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36"
}

url = "https://azuremarketplace.microsoft.com/view/appPricing/esri.arcgis-m365/us?ReviewsMyCommentsFilter=true"
skus = requests.get(url, headers=headers).json()["skus"]
table_data = []
for sku in skus:
    price = f'{sku["currencyCode"]}{sku["startingPrice"]["value"]}'
    billing = "one time payment" \
        if sku["startingPrice"]["billingPlan"] is None \
        else sku["startingPrice"]["billingPlan"]
    table_data.append(
        [
            sku["title"],
            # sku["description"], uncomment to see the description
            "HERE GOES THE DESCRIPTION",
            f'{price}/{billing}',
            sku["startingPrice"]["unit"],
            f'${sku["termPrices"][0]["value"]}',
        ]
    )
table = tabulate(
    table_data,
    headers=[
        "Plan", "Description",
        "Price + payment options", "Billing term", "Subtotal"
    ],
)
print(table)

Output:

Plan                                      Description                Price + payment options    Billing term    Subtotal
----------------------------------------  -------------------------  -------------------------  --------------  ----------
..Individual Contributor (1 power user)   HERE GOES THE DESCRIPTION  USD500/one time payment    Year            $500
..Small Team (1 power user, 10 readers)   HERE GOES THE DESCRIPTION  USD1500/one time payment   Year            $1500
.Medium Team (5 power users, 50 readers)  HERE GOES THE DESCRIPTION  USD7500/one time payment   Year            $7500
Large Team (10 power users, 100 readers)  HERE GOES THE DESCRIPTION  USD14500/one time payment  Year            $14500
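
Some plan titles in the output above carry leading dots (presumably a sorting aid, though that is a guess). A small sketch of building the rows in a separate function that also strips those dots is shown below. `build_rows` is a hypothetical helper, the sample SKU is hand-written from the values printed above, and two details differ from the original snippet on purpose: the Description column is left out, and `or` is used for the billing fallback, so an empty string is treated like `None`:

```python
def build_rows(skus: list) -> list:
    """Turn SKU dicts into table rows: plan, price/payment, term, subtotal."""
    rows = []
    for sku in skus:
        starting = sku["startingPrice"]
        # Same fallback label as the snippet above; note that `or` also
        # replaces an empty string, unlike the original `is None` check.
        billing = starting.get("billingPlan") or "one time payment"
        rows.append([
            sku["title"].lstrip("."),  # drop the leading dots
            f'{sku["currencyCode"]}{starting["value"]}/{billing}',
            starting["unit"],
            f'{sku["termPrices"][0]["value"]}',
        ])
    return rows


# Hand-written sample built from the printed output, not a captured response.
sample_skus = [
    {
        "title": ".Medium Team (5 power users, 50 readers)",
        "currencyCode": "USD",
        "startingPrice": {"value": 7500, "billingPlan": None, "unit": "Year"},
        "termPrices": [{"value": 7500, "unit": "Year"}],
    }
]

for row in build_rows(sample_skus):
    print(row)
```

The returned list of lists can be passed to `tabulate` in the same way as `table_data` above, with a matching four-column header list.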