I'm tring to remove the extra space and "rebtel.bootstrappedData" in the second alinea but for some reason it won't work.

This is my output

"welcome_offer_cuba.block_1_title":"SaveonrechargetoCuba","welcome_offer_cuba.block_1_cta":"Sendrecharge!","welcome_offer_cuba.block_1_cta_prebook":"Pre-bookRecarga","welcome_offer_cuba.block_1_footprint":"Offervalidfornewusersonly.","welcome_offer_cuba.block_2_key":"","welcome_offer_cuba.block_2_title":"Howtosendarecharge?","welcome_offer_cuba.block_2_content":"<ol><li>Simplyenterthenumberyou’dliketosendrechargeinthefieldabove.</li><li>Clickthe“{{buttonText}}”button.</li><li>CreateaRebtelaccountifyouhaven’talready.</li><li>Done!Yourfriendshouldreceivetherechargeshortly.</li></ol>","welcome_offer_cuba.block_3_title":"DownloadtheRebtelapp!","welcome_offer_cuba.block_3_content":"Sendno-feerechargeandenjoythebestcallingratestoCubainoneplace."},"canonical":{"string":"<linkrel=\"canonical\"href=\"https://www.rebtel.com/en/rates/\"/>"}};
            rebtel.bootstrappedData={"links":{"summary":{"collection":"country_links","ids":[null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null],"params":{"locale":"en"},"meta":{}},"data":[{"title":"A","links":[{"iso2":"AF","route":"afghanistan","name":"Afghanistan","url":"/en/rates/afghanistan/","callingCardsUrl":"/en/calling-cards/afghanistan/","popular":false},{"iso2":"AL","route":"albania","name":"Albania","url":"/en/rates/albania/

And this is the code I used:

import json
import requests
from bs4 import BeautifulSoup


url = "https://www.rebtel.com/en/rates/"

r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

x = range(132621, 132624)
script = soup.find_all("script")[4].text.strip()[38:]

print(script)

What should I add to "script" so it will remove the empty spaces?

CodePudding user response：

Original answer

You can change the definition of your script variable by :

script = soup.find_all("script")[4].text.replace("\t", "")[38:]

It will remove all tabulations on your text and so the alineas.

Edit after conversation in the comments

You can use the following code to extract the data in json :

import json
import requests
from bs4 import BeautifulSoup

url = "https://www.rebtel.com/en/rates/"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
script = list(filter(None, soup.find_all("script")[4].text.replace("\t", "").split("\r\n")))
app_data = json.loads(script[1].replace("rebtel.appData = ", "")[:-1])
bootstrapped_data = json.loads(script[2].replace("rebtel.bootstrappedData = ", ""))

I extracted the lines of the script with split("\r\n") and get the wanted data from there.