Using requests to scrape a webpage does not return all the data

I am using the Python requests package to scrape a webpage. This is the code:

import requests
from bs4 import BeautifulSoup

# Configure Settings
url = "https://mangaabyss.com/read/"
comic = "the-god-of-pro-wrestling"

# Run Scraper
page = requests.get(url + comic + "/")

soup = BeautifulSoup(page.content, 'html.parser')

print(soup.prettify())

The URL it requests is "https://mangaabyss.com/read/the-god-of-pro-wrestling/". But in the soup output I only get the first div, with none of the child elements that should be inside it. This is the output I get:

<!DOCTYPE html>
<html lang="en">
 <head>
  <meta charset="utf-8"/>
  <link href="/favicon.ico" rel="icon"/>
  <meta content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,viewport-fit=cover" name="viewport"/>
  <meta content="#250339" name="theme-color"/>
  <title>
   MANGA ABYSS
  </title>
  <script crossorigin="" src="/assets/index.f4dc01fb.js" type="module">
  </script>
  <link href="/assets/index.9b4eb8b4.css" rel="stylesheet"/>
 </head>
 <body>
  <div id="manga-mobile-app">
  </div>
 </body>
</html>

The content I want to scrape is deep inside that div: I am trying to extract the number of chapters. This is the selector for it:

#manga-mobile-app > div > div.comic-info-component > div.page-normal.with-margin > div.comic-deatil-box.tab-content.a-move-in-right > div.comic-episodes > div.episode-header.f-clear > div.f-left > span

Can anyone help me see where I'm going wrong?

CodePudding user response:

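The page is rendered by JavaScript: the chapter data is loaded from a separate URL after the initial HTML arrives, so BeautifulSoup never sees it in the response that requests downloads. You can use the requests module to call that URL directly: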

import json
import requests

slug = "the-god-of-pro-wrestling"
url = "https://mangaabyss.com/circinus/Manga.Abyss.v1/ComicDetail?slug="

data = requests.get(url + slug).json()

# uncomment to print all data:
# print(json.dumps(data, indent=4))

for ch in data["data"]["chapters"]:
    print(
        ch["chapter_name"],
        "https://mangaabyss.com/read/{}/{}".format(slug, ch["chapter_slug"]),
    )

Prints:

...

Chapter 4 https://mangaabyss.com/read/the-god-of-pro-wrestling/chapter-4
Chapter 3 https://mangaabyss.com/read/the-god-of-pro-wrestling/chapter-3
Chapter 2 https://mangaabyss.com/read/the-god-of-pro-wrestling/chapter-2
Chapter 1 https://mangaabyss.com/read/the-god-of-pro-wrestling/chapter-1
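
Since the question asks for the number of chapters, here is a minimal sketch of how it could be read from the same JSON. It assumes the payload keeps the shape used above (a list under data["data"]["chapters"] whose items have "chapter_name" and "chapter_slug"). The helper names count_chapters and chapter_links are hypothetical, and the sample dictionary is made up so the snippet runs without network access.

```python
def count_chapters(payload):
    """Return the number of chapters in a ComicDetail-style payload."""
    # Assumes the layout used above: payload["data"]["chapters"] is a list.
    return len(payload["data"]["chapters"])


def chapter_links(payload, slug, base="https://mangaabyss.com/read"):
    """Return (chapter_name, url) pairs in the order the payload lists them."""
    return [
        (ch["chapter_name"], "{}/{}/{}".format(base, slug, ch["chapter_slug"]))
        for ch in payload["data"]["chapters"]
    ]


# Made-up sample with the same shape as the response, not real site data.
sample = {
    "data": {
        "chapters": [
            {"chapter_name": "Chapter 2", "chapter_slug": "chapter-2"},
            {"chapter_name": "Chapter 1", "chapter_slug": "chapter-1"},
        ]
    }
}

print(count_chapters(sample))
```

With the real response, count_chapters(data) should give the chapter count directly, provided the endpoint still returns the same structure.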