Problems scraping dynamic content with requests and BeautifulSoup

Time:12-12

I am trying to scrape the result of the form on https://www.languagesandnumbers.com/how-to-count-in-german/en/deu/ by filling it out and submitting it with requests and BeautifulSoup. After inspecting the network traffic of the submission, I found that the POST parameters are "numberz" and "lang", so I tried the following:

import requests
from bs4 import BeautifulSoup

with requests.Session() as session:
    response = session.post('https://www.languagesandnumbers.com/how-to-count-in-german/en/deu/', data={
        "numberz": "23",
        "lang": "deu"
    })

    soup = BeautifulSoup(response.content, "lxml")
    print(soup.find(id='words').get_text())

Unfortunately, the result is filled in dynamically and is not part of the returned HTML: after submitting the form I always get the main page back, and the div that should carry the result is empty. Is there another way to scrape the result with requests and BeautifulSoup, without using Selenium?

CodePudding user response:

You do not need BeautifulSoup, just the correct URL, which returns only the number written out in words:

https://www.languagesandnumbers.com/ajax/en/

Because it returns the result in the form ack:::dreiundzwanzig, you have to extract the string:

response.text.split(':')[-1]

Example

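import requests

# POST to the AJAX endpoint instead of the page URL
with requests.Session() as session:
    response = session.post('https://www.languagesandnumbers.com/ajax/en/', data={
        "numberz": "23",
        "lang": "deu"
    })

# The reply looks like "ack:::dreiundzwanzig"; keep the part after the last colon
print(response.text.split(':')[-1])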

Output

dreiundzwanzig
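
If you want the extraction step to fail loudly when the reply does not look as expected, a small helper can do the split. This is only a sketch: the name extract_number_words is made up for illustration, and the "ack:::" layout is assumed from the single sample reply above, so other inputs or error replies from the site may look different.

```python
def extract_number_words(reply: str, separator: str = ":::") -> str:
    """Return the text after the last separator in the endpoint's reply.

    Assumes a reply shaped like "ack:::dreiundzwanzig" (taken from the
    one sample above; the real endpoint may return other shapes).
    """
    # rpartition splits on the LAST occurrence of the separator
    _status, found, words = reply.rpartition(separator)
    if not found:
        # No separator at all: refuse to guess instead of returning junk
        raise ValueError(f"unexpected reply format: {reply!r}")
    return words.strip()


# Works on the sample reply without touching the network
print(extract_number_words("ack:::dreiundzwanzig"))
```

With the request from the example you would call extract_number_words(response.text). Splitting on the whole ":::" marker rather than on a single ":" simply avoids depending on where single colons appear in the reply.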