I have this code, but I run into problems when the data I am scraping contains commas. I want each scraped value to appear only in the first column, but when the text contains a comma, the part after it ends up in the second column. Is it possible to scrape the text and write it to only the first column of the CSV without using pandas? Thanks
i = 1
for url in urls:
    print(f'Scraping the URL no {i}')
    i += 1
    response = requests.get(url)
    soup = BeautifulSoup(response.text,'html.parser')
    links = []
    for text in soup.find('div',class_='entry-content').find_all('div',class_='streak'):
        link = text.a['href']
        text = text.a.text
        links.append(link)
        with open("/Users/Rex/Desktop/data.csv", "a") as file_object:
            file_object.write(text)
            file_object.write("\n")
CodePudding user response:
CSV files have rules for escaping commas within a single field so that they are not mistaken for column separators. The csv module applies this escaping automatically. You also only need to open the file once, so with a few tweaks to your code:
import csv

import requests
from bs4 import BeautifulSoup

# newline="" is what the csv docs recommend, so the writer controls line endings
with open("/Users/Rex/Desktop/data.csv", "a", newline="") as file_object:
    csv_object = csv.writer(file_object)
    i = 1
    for url in urls:
        print(f'Scraping the URL no {i}')
        i += 1
        response = requests.get(url)
        soup = BeautifulSoup(response.text,'html.parser')
        links = []
        for text in soup.find('div',class_='entry-content').find_all('div',class_='streak'):
            link = text.a['href']
            text = text.a.text.strip()
            # only record if we have text
            if text:
                links.append(link)
                # a one-element list is written as a single, properly quoted column
                csv_object.writerow([text])
NOTE: This code skips links that have no text.
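To see the escaping on its own, without any scraping, here is a minimal sketch that writes a few strings to an in-memory buffer instead of a file. The helper name `rows_to_csv` and the sample strings are made up for illustration; only `csv.writer`, `csv.reader` and `io.StringIO` come from the standard library.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows (lists of strings) to CSV text with csv.writer."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical sample values standing in for scraped text.
sample_rows = [["Hello, world"], ["no comma here"], ['say "hi", ok']]
csv_text = rows_to_csv(sample_rows)
print(csv_text)

# Reading the text back yields one column per row, commas intact.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed)
```

With the default dialect, a field that contains a comma or a double quote is wrapped in double quotes, and any double quote inside it is doubled. That is why a spreadsheet program or `csv.reader` treats the whole string as one column.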