I can't convert a string to a dictionary using json.loads in Python

The goal is to extract an MP4 video link from the MLB website.

url = "https://www.mlb.com/video/jeremy-pena-s-solo-homer?t=most-popular"
content = requests.get(url).text

I have located the target data in one of the page's script tags:

soup = BeautifulSoup(content, "lxml")

all_script_label = soup.find_all(name="script")

target = all_script_label[20].text.split("\n")[1].split("=")[1]

But json.loads does not turn target into a dict; the result is still a string.

json_ob = json.loads(target)
print(type(json_ob))

Which step did I get wrong?

I have also tried ast.literal_eval, but that doesn't work either.

CodePudding user response:

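The value assigned to window.__VIDEO_INIT_STATE__ is JSON-encoded twice, so a single json.loads call only gives back a str. Apply json.loads a second time to get the dict: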

import re
import json
import requests
from bs4 import BeautifulSoup

url = "https://www.mlb.com/video/jeremy-pena-s-solo-homer?t=most-popular"
content = requests.get(url).text
soup = BeautifulSoup(content, "lxml")
all_script_label = soup.find_all(name="script")
target = all_script_label[20].text


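# Capture everything assigned to window.__VIDEO_INIT_STATE__ in the script text
data = re.search(r"window\.__VIDEO_INIT_STATE__ = (.*)", target).group(1)
# The first json.loads returns a str, the second returns the dict
data = json.loads(json.loads(data))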

print(type(data))

Prints:

<class 'dict'>
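
To see why two calls are needed without hitting the network, here is a minimal offline sketch. The script text, the keys "title", "playbacks" and "url", and the helper decode_nested_json are all made up for illustration; the real page's payload may be shaped differently. It builds a doubly encoded value, extracts it with a regex similar to the one above, and decodes until the result is no longer a str.

```python
import json
import re

# Hypothetical stand-in for the script text; these keys are NOT taken from the real page.
inner = {"title": "sample clip", "playbacks": [{"url": "https://example.com/clip.mp4"}]}
# Encoding twice produces a JSON string literal whose contents are themselves JSON.
script_text = "window.__VIDEO_INIT_STATE__ = " + json.dumps(json.dumps(inner)) + ";\n"


def decode_nested_json(text, max_depth=3):
    """Call json.loads repeatedly while the result is still a str.

    Assumes every str layer is valid JSON; a plain, non-JSON string
    would raise json.JSONDecodeError.
    """
    value = text
    for _ in range(max_depth):
        if not isinstance(value, str):
            break
        value = json.loads(value)
    return value


# Capture the assigned value, leaving out the trailing semicolon.
match = re.search(r"window\.__VIDEO_INIT_STATE__ = (.*);", script_text)
raw = match.group(1)

once = json.loads(raw)
print(type(once))  # <class 'str'>: one decode only removes the outer quoting

data = decode_nested_json(raw)
print(type(data))  # <class 'dict'>
print(data["playbacks"][0]["url"])
```

Checking type() after each decode, as above, is a quick way to tell how many layers of encoding a scraped value has.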