I want to scrape the Name, Phone Number and Email from the webpage, but it seems that all the details are in a dictionary nested inside another dictionary. Could someone please correct me? I am confused about how to extract these values into their own columns. Here is the code:
import requests
from bs4 import BeautifulSoup
from csv import writer
url ='https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
R = requests.get(url)
soup = BeautifulSoup(R.text, 'html.parser')
print(soup)
with open('school.csv', 'a', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['Name', 'Location', 'Phone Number', 'Email']
    thewriter.writerow(header)
    thewriter.writerow(soup)
CodePudding user response:
You do not need BeautifulSoup here. As mentioned, the URL is an API that returns JSON, so simply request it and write the JSON records to CSV with csv.DictWriter.
Example
import csv
import requests

url = 'https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false'
# The employee records are stored under the 'items' key of the JSON response
data = requests.get(url).json()['items']

with open('my.csv', 'w', newline='') as output_file:
    # Use the keys of the first record as the CSV header
    dict_writer = csv.DictWriter(output_file, data[0].keys())
    dict_writer.writeheader()
    dict_writer.writerows(data)
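If only a few columns are wanted (the question asks for name, phone number and email), csv.DictWriter can be given an explicit field list and told to ignore everything else. This is a minimal sketch, not the API's actual schema: the key names (name, phone, email, location) and the sample records are made up for illustration, so check the real keys first, for example with print(data[0].keys()).

```python
import csv
import io

# Hypothetical records standing in for requests.get(url).json()['items'];
# the real API's key names may differ.
sample_items = [
    {"name": "Jane Doe", "phone": "555-0100", "email": "jane@example.org", "location": "A"},
    {"name": "John Roe", "phone": "555-0101", "email": "john@example.org", "location": "B"},
]


def write_selected(items, fieldnames, output_file):
    """Write only the chosen keys of each dict as CSV columns."""
    # extrasaction='ignore' drops keys that are not listed in fieldnames;
    # restval='' fills in any listed key that a record is missing.
    writer = csv.DictWriter(output_file, fieldnames=fieldnames,
                            extrasaction="ignore", restval="")
    writer.writeheader()
    writer.writerows(items)


# In-memory file for the demo; use open('my.csv', 'w', newline='') for a real file
buffer = io.StringIO()
write_selected(sample_items, ["name", "phone", "email"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With the real response, write_selected(data, [...], output_file) would be called with whatever key names the API actually returns.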
EDIT
As mentioned by Barry the Platipus, there are also a couple of ways to do this with pandas:
import pandas as pd
pd.json_normalize(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']
).to_csv('my.csv', index=False)
or
pd.DataFrame(
    pd.read_json('https://mainapi.dadeschools.net/api/v1/employees?limit=10&skip=0&sortDesc=false')['items']
    .values.tolist()
).to_csv('my.csv', index=False)
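The question mentions a dictionary inside a dictionary. pd.json_normalize flattens that kind of nesting into columns with dot-separated names. The same idea can be sketched with the standard library alone, so that the flattened records can be passed to csv.DictWriter. The record below is invented for illustration, and whether the real API response is nested this way is an assumption.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with joined keys."""
    flat = {}
    for key, value in record.items():
        # Prefix the key with its parent's key, e.g. 'contact.phone'
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested record; the real API's structure may differ
nested = {
    "name": "Jane Doe",
    "contact": {"phone": "555-0100", "email": "jane@example.org"},
}
flat = flatten(nested)
print(flat)
```

Applying flatten to every item, for example [flatten(item) for item in data], gives a list of flat dictionaries whose keys can serve directly as CSV column names.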