How can I save these dictionaries' keys as columns and values as rows to CSV?
First, I gathered some URLs:
# Read one URL per line; note the mode is 'r' (a trailing space, 'r ', raises ValueError)
with open('unfolded.txt', 'r') as file:
    urllist = file.readlines()
urllist = [url.strip() for url in urllist]
print(urllist)
Output:
['https://starsunfolded.com/bhavna-ruparel/',
'https://starsunfolded.com/virat-kohli/',
'https://starsunfolded.com/amitabh-bachchan/',
'https://starsunfolded.com/manit-joura/',
'https://starsunfolded.com/akshay-kumar/',
'https://starsunfolded.com/kangana-ranaut/',
'https://starsunfolded.com/salman-khan/']
Then I scraped some info from those URLs:
Input:
import requests
from bs4 import BeautifulSoup as bs

# `headers` is not shown in the question; a User-Agent dict like this is assumed.
headers = {'User-Agent': 'Mozilla/5.0'}

# Alternative labels the site uses for each field
profession = ['Profession(s)', 'Profession']
full_name = ['Full Name', 'Real Name']
nick_name = ['Nickname', 'Nickname(s)']
height = ['Height (approx.)']
weight = ['Weight (approx.)']
fieldnames = ['Full Name', 'Nick Name', 'profession', 'Height', 'Weight']

def myinfo(labels):
    # Return the value of the first label found in the current page's dicts, else None
    for x in labels:
        if x in dicts:
            return dicts[x]

for url in urllist:
    res = requests.get(url, headers=headers)
    html = bs(res.text, 'html.parser')
    table_rows = html.find_all('tr')

    # Map every two-cell table row to label -> cleaned value
    dicts = {}
    for rows in table_rows:
        alltds = rows.find_all('td')
        if len(alltds) == 2:
            dicts[alltds[0].text] = alltds[1].text.strip().replace('• ', '').replace('\n', ', ')

    newdicts = {}
    newdicts['Full Name'] = myinfo(full_name)
    newdicts['Nick Name'] = myinfo(nick_name)
    newdicts['profession'] = myinfo(profession)
    newdicts['Height'] = myinfo(height)
    newdicts['Weight'] = myinfo(weight)
    print(newdicts)
Output:
{'Full Name': None, 'Nick Name': None, 'profession': 'Model, Actress', 'Height': 'in centimeters- 165 cm, in meters- 1.65 m, in feet & inches- 5’ 5”', 'Weight': None}
{'Full Name': None, 'Nick Name': 'Chikoo, Run Machine', 'profession': 'Indian Cricketer (Batsman)', 'Height': 'in centimeters- 175 cm, in meters- 1.75 m, in Feet Inches- 5’ 9”', 'Weight': None}
{'Full Name': 'Amitabh Harivansh Rai Shrivastava', 'Nick Name': 'Munna, Big B, Angry Young Man, AB Sr., Amith, Shahenshah of Bollywood', 'profession': 'Actor, TV Host, Former Politician', 'Height': 'in centimeters- 188 cm, in meters- 1.88 m, in Feet Inches- 6’ 2” [2]@SrBachchan', 'Weight': None}
{'Full Name': 'Manit Joura', 'Nick Name': 'Mani', 'profession': 'Actor', 'Height': 'in centimeters- 180 cm, in meters- 1.80 m, in Feet Inches- 5’ 11”', 'Weight': 'in Kilograms- 75 kg, in Pounds- 165 lbs'}
{'Full Name': 'Rajiv Hari Om Bhatia', 'Nick Name': 'Akki, Raju, Mac , Khiladi Kumar', 'profession': 'Actor , Producer', 'Height': 'in centimeters- 185 cm, in meters- 1.85 m, in feet inches- 6’ 1”', 'Weight': 'in kilograms- 80 kg, in pounds- 176 lbs'}
{'Full Name': 'Kangana Amardeep Ranaut', 'Nick Name': 'Arshad, OTA (One Take Actor)', 'profession': 'Actress, Writer, Director', 'Height': 'in centimeters- 168 cm, in meters- 1.68 m, in feet inches- 5’ 6”', 'Weight': 'in kilograms- 55 kg, in pounds- 121 lbs'}
{'Full Name': 'Abdul Rashid Salim Salman Khan [1]Times of India', 'Nick Name': 'Sallu, Bhaijan [2]Indiatvnews.com', 'profession': 'Actor, Producer, Entrepreneur', 'Height': 'in centimeters- 174 cm, in meters- 1.74 m, in feet inches- 5’ 8”', 'Weight': 'in kilograms- 75 kg, in pounds- 165 lbs'}
Answer:
I see you are looping over a set of URLs, downloading the HTML for each one, scraping it, and turning the result into a dict.
You could collect each dict in a list, here called all_dicts, and then write the whole list with csv.DictWriter, as I have shown:
import csv

all_dicts = []
# for url in urllist:
#     request url
#     scrape html from url
#     somehow you convert scraped html to a dict, let's call it new_dict
#     all_dicts.append(new_dict)

# And, now you have all the dictionaries, and it looks something like this:
all_dicts = [
    {'Full Name': None, 'Nick Name': None, 'profession': 'Model, Actress',
     'Height': 'in centimeters- 165 cm, in meters- 1.65 m, in feet & inches- 5’ 5”', 'Weight': None},
    {'Full Name': None, 'Nick Name': 'Chikoo, Run Machine',
     'profession': 'Indian Cricketer (Batsman)', 'Height': 'in centimeters- 175 cm, in meters- 1.75 m, in Feet Inches- 5’ 9”', 'Weight': None},
    {'Full Name': 'Amitabh Harivansh Rai Shrivastava', 'Nick Name': 'Munna, Big B, Angry Young Man, AB Sr., Amith, Shahenshah of Bollywood',
     'profession': 'Actor, TV Host, Former Politician', 'Height': 'in centimeters- 188 cm, in meters- 1.88 m, in Feet Inches- 6’ 2” [2]@SrBachchan', 'Weight': None},
    # ... more of your dicts
]

# And then you move on to writing all_dicts.
# encoding='utf-8' keeps the curly quotes (’ ”) intact regardless of platform default.
with open('my.csv', 'w', newline='', encoding='utf-8') as f:
    # Keys of the first dict become the header row; every dict must share those keys
    writer = csv.DictWriter(f, fieldnames=all_dicts[0].keys())
    writer.writeheader()
    writer.writerows(all_dicts)
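To make the commented-out loop above concrete, here is a minimal sketch of the collect-then-write pattern. The names scrape_profile, collect_profiles, write_profiles and FIELDNAMES are hypothetical and not from the question; scrape_profile is only a stand-in that returns a fixed dict, so you would swap in your own requests/BeautifulSoup code there. The extra 'source' key is also made up, just to show what happens to keys that are not in the header.

```python
import csv

# Column order for the CSV; taken from the question's `fieldnames` list.
FIELDNAMES = ['Full Name', 'Nick Name', 'profession', 'Height', 'Weight']


def scrape_profile(url):
    """Hypothetical stand-in for the per-URL scraping code in the question.

    A real version would download and parse the page. This one returns a
    fixed dict so the example runs without network access.
    """
    return {'Full Name': None, 'Nick Name': None, 'profession': 'Actor',
            'Height': None, 'Weight': None, 'source': url}


def collect_profiles(urls):
    """Build one dict per URL and gather them all in a list."""
    return [scrape_profile(url) for url in urls]


def write_profiles(path, rows, fieldnames=FIELDNAMES):
    """Write the dicts to CSV: keys become the header, values become rows."""
    with open(path, 'w', newline='', encoding='utf-8') as f:
        # extrasaction='ignore' drops keys that are not in fieldnames
        # (here: 'source'); restval='' fills in keys a dict is missing.
        # None values are written as empty cells by the csv module.
        writer = csv.DictWriter(f, fieldnames=fieldnames,
                                restval='', extrasaction='ignore')
        writer.writeheader()
        writer.writerows(rows)
```

With your own scraping code in place of the stand-in, write_profiles('my.csv', collect_profiles(urllist)) would produce one header row plus one row per URL. Passing an explicit field list instead of all_dicts[0].keys() is a design choice: it fixes the column order and still works if the first dict happens to have different keys from the rest.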