I've written a script that uses the requests module to collect brewery names from this website, but when I run it, it returns nothing. I searched for the names in the page source and looked in dev tools for an undocumented API, but found neither.
import requests
from bs4 import BeautifulSoup

link = "https://www.brewersassociation.org/directories/breweries/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
}

res = requests.get(link, headers=headers)
soup = BeautifulSoup(res.text, "html.parser")
for item in soup.select(".company-content > h3[itemprop='name']"):
    print(item.text)
CodePudding user response:
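The brewery data appears to come from a separate JSON file rather than the page HTML, so you can try requesting that file directly: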
import requests
import pandas as pd

url = 'https://www.brewersassociation.org/wp-content/themes/ba2019/json-store/breweries/breweries.json'

data = requests.get(url).json()
df = pd.DataFrame(data)

# expand the nested BillingAddress dicts into separate columns
df = pd.concat([df, df.pop('BillingAddress').apply(pd.Series, dtype=object)], axis=1)
df.pop('attributes')  # drop the unneeded 'attributes' column

# print sample data (to_markdown needs the tabulate package); total length should be 26802 breweries:
print(df.head().to_markdown(index=False))
Prints:
| Id | Name | Parent | Phone | Website | Brewery_Type__c | Is_Craft_Brewery__c | Voting_Member__c | Membership_Record_Item__c | Membership_Record_Paid_Through_Date__c | Membership_Record_Status__c | Account_Badges__c | city | country | countryCode | geocodeAccuracy | latitude | longitude | postalCode | state | stateCode | street |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0014x000012jyoHAAQ | Brewery in Planning - Monterrey | | (811) 244-8078 | | Brewery In Planning | False | False | | | | | Monterrey | Mexico | MX | Block | 25.6444 | -100.275 | 64850 | | | Tucan 362 |
| 0014x000012jyoJAAQ | Sekinoichi-shuzo Co.,Ltd/Iwai Brewery | | 81-191-21-1144 | www.sekinoichi.co.jp | Brewpub | False | False | | | | | Ichinoseki-city | Japan | JP | Address | 38.9314 | 141.132 | 021-0885 | | | 5-42 Tamuracho |
| 0014x000012jyoKAAQ | Selby (Middleborough) Brewery Ltd | | 01757 702826 | | | False | False | | | | | Selby | United Kingdom | GB | Block | 53.7871 | -1.07141 | YO8 3LL | | | 131 Milgate |
| 0014x000012jyoLAAQ | SENDERO BREWING COMPANY | | | www.senderobrewing.com | Brewery In Planning | False | False | Brewery Membership | 2019-10-31 | Expired | | San Pedro Sula | Honduras | HN | City | 15.5039 | -88.0157 | 21102 | | | Los Alpes, Boulevard McKay |
| 0014x000012jyoMAAQ | Ser Bhum Microbrewery | | | | Micro | False | False | Brewery Membership | 2017-08-31 | Expired | | Thimphu | Bhutan | BT | | nan | nan | | | | Hongtsho Hongtsho |
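
Since the question only asks for the brewery names, pandas is not strictly needed. Below is a minimal sketch that works on the parsed JSON directly. It assumes each record is a dict with a `Name` key and a nested `BillingAddress` dict containing `country`, which is what the columns in the output above suggest. The helper name `brewery_names` and the two trimmed sample records are illustrative only, not part of the site's data format.

```python
def brewery_names(records, country=None):
    """Return brewery names, optionally limited to one billing country.

    Assumes each record is a dict with a 'Name' key and a nested
    'BillingAddress' dict, as the DataFrame columns above suggest.
    """
    names = []
    for rec in records:
        # Tolerate records whose address is missing or None.
        address = rec.get("BillingAddress") or {}
        if country is not None and address.get("country") != country:
            continue
        if rec.get("Name"):
            names.append(rec["Name"])
    return names


# Two records trimmed from the sample output above, for illustration only.
sample = [
    {"Name": "Selby (Middleborough) Brewery Ltd",
     "BillingAddress": {"city": "Selby", "country": "United Kingdom"}},
    {"Name": "Ser Bhum Microbrewery",
     "BillingAddress": {"city": "Thimphu", "country": "Bhutan"}},
]

print(brewery_names(sample))
print(brewery_names(sample, country="Bhutan"))
```

With the real data, you would pass the list returned by `requests.get(url).json()` from the answer above instead of `sample`.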