I got 20 rows, which is good, by using two lists, but the first loop still keeps appending to itself.
I want the list of authors in one cell, and for that I am using .join().
A little intro to my code and what I am trying to accomplish:
The main page is a list of 20 items, and each item has a list of 4-5 authors. First I want to iterate over the items and then over each item's authors, so that each item's list of authors ends up in one cell of the CSV.
It's a nightmare for me. I have spent days trying to figure out the answer; hopefully someone will understand the problem and help. Ask if you need more information, thank you. Output is attached below:
from selenium import webdriver
import pandas as pd
driver = webdriver.Chrome()
site = 'https://www.goodreads.com/search?q=chughtai&qid=WzdWh5nG8z'
driver.get(site)
driver.maximize_window()
authors = []
auth = []
main = driver.find_elements_by_tag_name('tr')
for i in main:
    con = i.find_elements_by_xpath('.//div[@]')
    for n in con:
        authors.append(n.find_element_by_xpath('.//a[@]/span').text)
        one_cell = ', '.join(authors)
        auth.append(one_cell)
a = {'Author Names': one_cell}
df = pd.DataFrame.from_dict(a, orient='index')
df = df.transpose()
df.to_csv("only_names.csv", index=False)
print(df)
CodePudding user response:
It seems that your problem is that the authors list is not reset to empty before you parse a new item. One way to reset it would be to move the line authors = [] from its current position to directly after the line for i in main: so that you get a new, empty list for each item.
Another, non-critical, suggestion is to move the line one_cell = ', '.join(authors) out of the inner loop, while keeping it before auth.append(one_cell). Both of those lines only need to run once for each i.
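To make the effect of the reset visible without opening a browser, here is a minimal sketch that replaces the scraped page with a made-up nested list (the names in rows and both function names are hypothetical, not from the original code). It only illustrates the difference between creating the list once and creating it per item.

```python
# Hypothetical stand-in for the scraped page:
# each inner list holds the author names of one result row.
rows = [["Author A", "Author B"], ["Author C"], ["Author D", "Author E"]]


def join_without_reset(rows):
    """Mimics the original code: one shared list for all rows."""
    authors = []  # created once, so names pile up across rows
    cells = []
    for row in rows:
        for name in row:
            authors.append(name)
        cells.append(", ".join(authors))
    return cells


def join_with_reset(rows):
    """Mimics the suggested fix: a fresh list for every row."""
    cells = []
    for row in rows:
        authors = []  # reset before parsing each row
        for name in row:
            authors.append(name)
        cells.append(", ".join(authors))
    return cells


buggy = join_without_reset(rows)
fixed = join_with_reset(rows)
print(buggy)
print(fixed)
```

In the first version every cell repeats the authors of all earlier rows, which matches the "keeps appending to itself" symptom; in the second each cell holds only its own row's authors.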
UPDATE:
To show my 2nd suggestion:
for i in main:
    authors = []
    con = i.find_elements_by_xpath('.//div[@]')
    for n in con:
        authors.append(n.find_element_by_xpath('.//a[@]/span').text)
    one_cell = ', '.join(authors)
    auth.append(one_cell)
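One more thing that may be worth checking: the question's code, as posted, builds the output from one_cell (the last joined string) rather than from auth, which would appear to write only a single row. Below is a rough sketch of writing one row per item from auth. It uses the standard csv module and an in-memory buffer instead of pandas and a file so that it runs without any dependencies, and the sample contents of auth are made up.

```python
import csv
import io

# Hypothetical result of the loop above: one joined string per item.
auth = ["Author A, Author B", "Author C"]

# Write to an in-memory buffer; with a real file, open it with newline=""
# and pass the file object to csv.writer instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Author Names"])  # header row
for cell in auth:
    writer.writerow([cell])  # one cell per item; commas inside are quoted

csv_text = buffer.getvalue()
print(csv_text)
```

If you prefer to stay with pandas, passing the whole list, as in pd.DataFrame({'Author Names': auth}), should give one row per item in the same way.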