How to append data to a pandas DataFrame using Beautiful Soup

Time:11-11

import requests
from bs4 import BeautifulSoup
import pandas as pd

baseurl = 'https://locations.atipt.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
}

# Collect the link to every location page listed for the state
r = requests.get('https://locations.atipt.com/al')
soup = BeautifulSoup(r.content, 'html.parser')
tra = soup.find_all('ul', class_='list-unstyled')
productlinks = []
for links in tra:
    for link in links.find_all('a', href=True):
        comp = baseurl + link['href']
        productlinks.append(comp)

# Visit each location page and collect the text of every <p> in its listing card
temp = []
for link in productlinks:
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    tag = soup.find_all('div', class_='listing content-card')
    for pro in tag:
        for tup in pro.find_all('p'):
            temp.append([text for text in tup.stripped_strings])

df = pd.DataFrame(temp)
print(df)

This is the output I get

9256 Parkway E Ste A
Birmingham, 
Alabama 35206

but I don't know how to name the columns in the data frame. I want "9256 Parkway E Ste A" under Address, "Birmingham" under City, and "Alabama 35206" under State. If that is possible, kindly help with this.

CodePudding user response:

import pandas as pd

temp = []
for link in productlinks:
    r = requests.get(link, headers=headers)
    soup = BeautifulSoup(r.content, 'html.parser')
    tag = soup.find_all('div', class_='listing content-card')
    for pro in tag:
        # One string per <p>: two address lines, then "City, ST\xa0ZIP"
        data = [tup.text for tup in pro.find_all('p')]
        address = "".join(data[:2])
        splitdata = data[2].split(",")
        city = splitdata[0]
        # State and postal code are separated by a non-breaking space
        splitsecond = splitdata[-1].split("\xa0")
        state = splitsecond[0]
        postalcode = splitsecond[-1]
        temp.append([address, city, state, postalcode])

df = pd.DataFrame(temp, columns=["Address", "City", "State", "Postalcode"])
df

Output:

                   Address        City State Postalcode
0  634 1st Street NSte 100   Alabaster    AL      35007
1      9256 Parkway ESte A  Birmingham    AL      35206
....
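
The split on "\xa0" above only works when the page separates the state and postal code with a non-breaking space. As a rough sketch, a slightly more tolerant helper could look like the one below. The name split_city_state_zip is hypothetical, not part of the original answer, and it assumes the line has the form "City, State ZIP".

```python
def split_city_state_zip(line):
    """Split a line such as 'City, ST\xa0ZIP' into (city, state, postalcode).

    Hypothetical helper: assumes the city comes before the first comma and
    the postal code is the last whitespace-separated token after it.
    """
    city, _, rest = line.partition(",")
    # str.split() with no argument splits on any whitespace run,
    # which includes the non-breaking space "\xa0".
    parts = rest.split()
    if len(parts) > 1:
        state = " ".join(parts[:-1])
        postalcode = parts[-1]
    else:
        # Only one token (or none) after the comma: treat it as the state.
        state = parts[0] if parts else ""
        postalcode = ""
    return city.strip(), state, postalcode


print(split_city_state_zip("Birmingham, AL\xa035206"))
```

Inside the loop it would replace the manual splits, for example city, state, postalcode = split_city_state_zip(data[2]).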

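If you also want the phone number, add this statement after the postalcode line:

callNumber=pro.find("span",class_="directory-phone").get_text(strip=True).split("\n")[-1].lstrip()

Then append callNumber to the row passed to temp.append, and add a matching name (for example "Phone") to the columns list.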
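
If some listing card has no phone span, pro.find returns None and the statement above raises an AttributeError. A minimal sketch of one way to guard against that follows. The helper name phone_from_text is hypothetical, and the phone number in the example call is a made-up placeholder.

```python
def phone_from_text(raw):
    """Return the last non-empty line of the span's text, or None if there is none.

    Hypothetical helper: `raw` is whatever span.get_text() returned,
    or None when the span was not found on the page.
    """
    if not raw:
        return None
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    return lines[-1] if lines else None


# Inside the loop, assuming `pro` is one listing card:
#   span = pro.find("span", class_="directory-phone")
#   callNumber = phone_from_text(span.get_text() if span else None)
#   temp.append([address, city, state, postalcode, callNumber])

print(phone_from_text("Call\n  (205) 555-0100 "))  # placeholder number
```

Rows without a phone number then carry None in that column, and pandas displays it as a missing value.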
