I am trying to scrape the names of a book's authors from a book website. I get the names in one column but spread over multiple rows. I want all the names in a single cell of the CSV. Here is my full code:
from selenium import webdriver
import pandas as pd
driver = webdriver.Chrome()
site = 'https://www.goodreads.com/book/show/50148349-manto-and-chughtai?from_search=true&from_srp=true&qid=ZARMElvyyt&rank=3'
driver.get(site)
authors = [ ]
names = driver.find_elements_by_xpath('//div[@]')
for name in names:
    authors.append(name.find_element_by_xpath('.//a[@]').text)
df = pd.DataFrame({'Author Names': authors})
df.to_csv("Authors_list.csv", index=False)
print(df)
Here is the output I am getting. I want all four names in one cell.
CodePudding user response:
You could try this.
# The DataFrame column in the question is 'Author Names', not 'authors'.
authors = ','.join(df['Author Names'].to_list())
with open('mycsv.csv', 'w', newline='') as myfile:
    myfile.write(authors)
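One caveat: a comma-joined string written raw is read back by a spreadsheet as several cells in one row. A minimal sketch of one way around that, assuming the standard-library csv module is acceptable, is to write the joined string as a one-element row so the writer quotes it. The author names below are hypothetical placeholders, and the in-memory buffer stands in for the real file.

```python
import csv
import io

# Hypothetical placeholder names; in the question these come from the Selenium loop.
authors = ["Author One", "Author Two"]
joined = ",".join(authors)

# In-memory stand-in for open('mycsv.csv', 'w', newline='').
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["Author Names"])
# A one-element row is a single cell; csv.writer quotes the embedded comma.
writer.writerow([joined])

# Read the text back to check that the data row holds exactly one field.
cells = list(csv.reader(io.StringIO(out.getvalue())))
```

To write to disk instead, the same writer calls should work on a file opened with newline=''.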
CodePudding user response:
I don't have Selenium installed and set up, so I can't test this.
Could you please try this small tweak?
from selenium import webdriver
import pandas as pd
driver = webdriver.Chrome()
site = 'https://www.goodreads.com/book/show/50148349-manto-and-chughtai?from_search=true&from_srp=true&qid=ZARMElvyyt&rank=3'
driver.get(site)
authors = [ ]
names = driver.find_elements_by_xpath('//div[@]')
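for name in names:
    authors.append(name.find_element_by_xpath('.//a[@]').text)
# Join every name into one string so it occupies a single cell.
authors_in_one_cell = ', '.join(authors)
# Wrap the string in a list: a bare scalar here raises a ValueError in pandas.
df = pd.DataFrame({'Author Names': [authors_in_one_cell]})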
df.to_csv("Authors_list.csv", index=False)
print(df)
CodePudding user response:
By the looks of it, pandas' DataFrame.to_csv() method writes each list element on its own line. You're unintentionally accomplishing what this Stack Overflow question was trying to do.
Try making authors a string and appending each new author to that string instead of to a list. As long as there are no commas in the authors' names, all the values will appear in the same cell when the file is opened in Office.
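The string-building suggestion can be sketched roughly as follows, without Selenium. The names are hypothetical placeholders for the scraped values, and the in-memory buffer is only there so the result can be inspected; a file path would normally be passed to to_csv() instead. With pandas' default quoting, a field that contains the delimiter is quoted, so a ", " separator should still end up in one cell.

```python
import csv
import io

import pandas as pd

# Hypothetical placeholder names; in the question these come from the scraped page.
scraped_names = ["Author One", "Author Two", "Author Three", "Author Four"]

# Build one string instead of a list, adding a separator between names.
authors = ""
for name in scraped_names:
    if authors:
        authors += ", "
    authors += name

# A one-element list gives a DataFrame with exactly one row.
df = pd.DataFrame({"Author Names": [authors]})

# Write to an in-memory buffer so the CSV text can be checked.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Parse the CSV text back: one header row and one data row with one field.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

Using ', '.join(scraped_names) would produce the same string in one line; the loop is kept only to mirror the wording of the answer.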