When I convert my lists to a dictionary, only one element is stored in the dictionary, and I don't know why. This is my code. I used BeautifulSoup for web scraping and stored the results in lists. Finally, I want to clean them and store them in a JSON file.
import json

import requests
from bs4 import BeautifulSoup

data = []
qa_dict = {}
q_text = []
q_username = []
q_time = []
r_text = []
for i in range(1, 2):
    print(i)
    url = 'https://drakhosravi.com/faq/?cat=all&hpage=' + str(i)
    response1 = requests.get(url).content.decode()
    soup = BeautifulSoup(response1, 'html.parser')
    # collect the names of the people asking, skipping the responder's own name
    for s in soup.select('span.hamyar-comment-person-name'):
        if s.text != 'دکتر آرزو خسروی':
            q_username.append(s.text.strip())
    for times in soup.select('div.comment-header'):
        for s in times.select('span.hamyar-comment-date'):
            q_time.append(s.text.strip())
    question = []
    for comment in soup.select('div.comment-body'):
        question.append(comment.text.strip())
    # every second comment body is a question
    q_text = question[0::2]
    for r in soup.select('ol.faq-comment_replies'):
        for head in r.select('li'):
            for t in head.select('div.comment-body'):
                r_text.append(t.text.strip())
for username, qtime, qtxt, rtxt in zip(q_username, q_time, q_text, r_text):
    qa_dict = {'username': username, 'question_time': qtime, 'question_text': qtxt, 'url': url, 'respond_text': rtxt, 'responder_profile_url': 'https://drakhosravi.com/about-us'}
data.append(qa_dict)
with open('drakhosravi2.json', 'w', encoding="utf-8") as handle:
    json.dump(qa_dict, handle, indent=4, ensure_ascii=False)
CodePudding user response:
Your code is difficult to follow, but your question seems quite explicit, so here's a more generic answer. When you need to create a dictionary from lists, I assume you have the keys in one list and the matching values in another:
keys_sample = ["left", "right", "up", "down"]
values_sample = ["one", "two", 0, None]
# dictionary constructor takes pairs, which are the output of zip:
together = dict(zip(keys_sample, values_sample))
And the result should look like:
together = {'left': 'one', 'right': 'two', 'up': 0, 'down': None}
The most important thing to remember here is that both lists must have the same length, and that iterating over them in their natural order must match each key with its value.
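To illustrate why the lengths matter, here is a minimal sketch. The helper name dict_from_lists and the shortened list short_values are made up for this example. zip stops at the shortest input, so a length mismatch silently drops keys unless you check for it yourself:

```python
keys_sample = ["left", "right", "up", "down"]
short_values = ["one", "two"]  # hypothetical: two values are missing

# zip stops at the shortest input, so "up" and "down" are silently dropped
truncated = dict(zip(keys_sample, short_values))


def dict_from_lists(keys, values):
    """Build a dict from two lists, refusing lists of different lengths."""
    if len(keys) != len(values):
        raise ValueError(f"length mismatch: {len(keys)} keys, {len(values)} values")
    return dict(zip(keys, values))


try:
    dict_from_lists(keys_sample, short_values)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(truncated)
print(mismatch_detected)
```

A check like this is optional, but it turns a quiet data loss into an explicit error.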
CodePudding user response:
I don't fully understand your question, but you seem to want qa_dict to have lists as values. Only one element is stored because qa_dict is overwritten in every iteration, so the earlier values are never saved. Change this
for username, qtime, qtxt, rtxt in zip(q_username, q_time, q_text, r_text):
    qa_dict = {'username': username, 'question_time': qtime, 'question_text': qtxt, 'url': url, 'respond_text': rtxt, 'responder_profile_url': 'https://drakhosravi.com/about-us'}
to
qa_dict = {'username': q_username, 'question_time': q_time, 'question_text': q_text, 'url': url, 'respond_text': r_text, 'responder_profile_url': 'https://drakhosravi.com/about-us'}
In other words, there is no need for the loop. Since each of these is a list, you'll now have lists as the values in qa_dict.
Or, if you want to create a list of dictionaries, move data.append(qa_dict) inside the for-loop and pass data rather than qa_dict to json.dump.
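A minimal sketch of the list-of-dictionaries variant follows. The sample lists are made-up stand-ins for the scraped data, and json.dumps is used only so the result can be printed without writing a file. Each dictionary is appended inside the loop, and the whole list data is serialized rather than the last qa_dict:

```python
import json

# Hypothetical sample data standing in for the scraped lists
q_username = ["user_a", "user_b"]
q_time = ["2021-01-01", "2021-01-02"]
q_text = ["first question", "second question"]
r_text = ["first reply", "second reply"]
url = "https://drakhosravi.com/faq/?cat=all&hpage=1"

data = []
for username, qtime, qtxt, rtxt in zip(q_username, q_time, q_text, r_text):
    qa_dict = {
        "username": username,
        "question_time": qtime,
        "question_text": qtxt,
        "url": url,
        "respond_text": rtxt,
        "responder_profile_url": "https://drakhosravi.com/about-us",
    }
    # append inside the loop so every record is kept, not only the last one
    data.append(qa_dict)

# serialize the whole list; ensure_ascii=False keeps non-ASCII text readable
json_text = json.dumps(data, indent=4, ensure_ascii=False)
print(json_text)
```

To write a file instead, the same arguments should work with json.dump(data, handle, indent=4, ensure_ascii=False) inside the with open(...) block from the question.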