I'm trying to scrape a web page with pagination.
Code:
from requests_html import _URL, HTMLSession
from bs4 import BeautifulSoup

for page in range(1, 6):
    s = HTMLSession()
    url = 'https://www.lazada.com.ph/catalog/?q=laptop'
    url += '&page={}'
    r = s.get(url.format(page))
    print(url)
Output:
https://www.lazada.com.ph/catalog/?q=laptop&page={}
https://www.lazada.com.ph/catalog/?q=laptop&page={}
https://www.lazada.com.ph/catalog/?q=laptop&page={}
https://www.lazada.com.ph/catalog/?q=laptop&page={}
https://www.lazada.com.ph/catalog/?q=laptop&page={}
Expectation:
https://www.lazada.com.ph/catalog/?q=laptop&page={1}
https://www.lazada.com.ph/catalog/?q=laptop&page={2}
https://www.lazada.com.ph/catalog/?q=laptop&page={3}
https://www.lazada.com.ph/catalog/?q=laptop&page={4}
https://www.lazada.com.ph/catalog/?q=laptop&page={5}
I'm still new to Python and learning; please help me get my expected result. Thanks in advance.
CodePudding user response:
If your Python version supports f-strings (3.6 or later), you can put the page number straight into the URL:
from requests_html import _URL, HTMLSession
from bs4 import BeautifulSoup

for page in range(1, 6):
    s = HTMLSession()
    # The f-string substitutes the current page number into the URL
    url = f'https://www.lazada.com.ph/catalog/?q=laptop&page={page}'
    r = s.get(url)
    print(url)
CodePudding user response:
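url.format(page) returns a new string and leaves url itself unchanged, so print(url) still shows the unformatted template. Assign the result back to url: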
from requests_html import _URL, HTMLSession
from bs4 import BeautifulSoup

for page in range(1, 6):
    s = HTMLSession()
    url = 'https://www.lazada.com.ph/catalog/?q=laptop'
    url += '&page={}'
    # Store the formatted string so that both the request and print() use it
    url = url.format(page)
    r = s.get(url)
    print(url)
Or, even better, build and format the URL in a single step:
from requests_html import _URL, HTMLSession
from bs4 import BeautifulSoup

for page in range(1, 6):
    s = HTMLSession()
    url = 'https://www.lazada.com.ph/catalog/?q=laptop&page={}'.format(page)
    r = s.get(url)
    print(url)
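As a side note, here is a minimal sketch of building the same URLs with urllib.parse.urlencode from the standard library, which also escapes the search term for you. The helper name build_page_urls and the BASE_URL constant are made up for illustration, and the snippet only prints the URLs without sending any request.

```python
from urllib.parse import urlencode

# Assumed base address, taken from the URL in the question
BASE_URL = 'https://www.lazada.com.ph/catalog/'


def build_page_urls(query, first_page=1, last_page=5):
    """Return one catalog URL per page number (hypothetical helper)."""
    urls = []
    for page in range(first_page, last_page + 1):
        # urlencode escapes each value and joins the pairs with '&'
        params = urlencode({'q': query, 'page': page})
        urls.append(f'{BASE_URL}?{params}')
    return urls


for url in build_page_urls('laptop'):
    print(url)
```

Since HTMLSession is built on top of a requests session, passing the same dictionary as params= to s.get() should work too, but I have not tested that against this site.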