I'm writing a script that should: 1) open a CSV file, 2) loop through some eBay results data, 3) write details from this data to the same file if it matches a search term and if it's not already present. It has a few issues:
- The headers are not written if not present, so I'm manually adding them. I was previously using DictWriter() with writeheader(), but fieldnames was required, so the headers were written each time.
- The if result_id != existing_result_id: condition doesn't work in a nested loop, so the rows aren't written. if any(x in result_title.upper() for x in search_terms): does currently work, but I'm not sure how to combine the two conditions before writing the rows.
import csv
import smtplib
from requests_html import HTMLSession

SEARCH_TERMS = ['one', 'two']

session = HTMLSession()
response = session.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=blink 182&_sacat=0&_sop=10&rt=nc&LH_PrefLoc=2')
results = response.html.find('ul.srp-results li.s-item')

with open('ebay-results.csv', 'r') as csv_file:
  csv_reader = csv.reader(csv_file)
  next(csv_reader)  # Skip headers
  with open('ebay-results.csv', 'a') as csv_file_updated:
    # field_names = ['Title', 'Link', 'ID']
    csv_writer = csv.writer(csv_file_updated)
    search_terms = [x.upper() for x in SEARCH_TERMS]
    for result in results:
      result_title = result.find('h3.s-item__title', first=True).text.replace('New Listing', '')
      result_link = result.find('a.s-item__link', first=True).attrs['href'].split('?')[0]
      result_id = result_link.split('/')[4].split('?')[0]
      result_info = [result_title, result_link, result_id]
      if any(x in result_title.upper() for x in search_terms):
        for line in csv_reader:
          existing_result_id = line[2]
          if result_id != existing_result_id:
            csv_writer.writerow(result_info)
            send_email(search_terms, result_link)
CodePudding user response:
There are a few issues with your code:
- you create a file handle to read a file and then another file handle to append to the same file; that's dodgy at best, do you expect the reader to read lines you've appended? What's the purpose?
- you exhaust the reader with for line in csv_reader: for every result, which will only work once, since you don't seek() back to the start of the file before rereading; besides, why reread the file over and over anyway?
- standard Python indentation is 4 spaces deep; you would do well to follow convention, as not doing so just makes your code harder to read, maintain and reuse.
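A minimal illustration of the second point, with io.StringIO standing in for the CSV file:

```python
import csv
import io

# A csv.reader wraps a file iterator, so once it hits end-of-file it
# yields nothing more until the underlying file object is rewound.
buf = io.StringIO('a,1\r\nb,2\r\n')
reader = csv.reader(buf)

first_pass = list(reader)    # consumes the whole file
second_pass = list(reader)   # empty: the iterator is exhausted
buf.seek(0)                  # rewind the underlying file object
third_pass = list(reader)    # reads from the start again
```

In your loop the inner for line in csv_reader: behaves like second_pass from the second result onwards: it simply never runs.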
It appears you simply want to append results whose identifiers aren't already in the file. Why not do this:
- read the results once, keeping the identifiers for a quick lookup
- then loop over the results, writing results with new identifiers
This is likely what you were after:
import csv
from requests_html import HTMLSession

SEARCH_TERMS = ['one', 'two']

session = HTMLSession()
response = session.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=blink 182&_sacat=0&_sop=10&rt=nc&LH_PrefLoc=2')
results = response.html.find('ul.srp-results li.s-item')

with open('ebay-results.csv', 'r+', newline='') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)  # Skip headers
    existing_ids = [rec[2] for rec in csv_reader]
    # field_names = ['Title', 'Link', 'ID']
    csv_writer = csv.writer(csv_file)
    search_terms = [x.upper() for x in SEARCH_TERMS]
    for result in results:
        result_title = result.find('h3.s-item__title', first=True).text.replace('New Listing', '')
        result_link = result.find('a.s-item__link', first=True).attrs['href'].split('?')[0]
        result_id = result_link.split('/')[4].split('?')[0]
        result_info = [result_title, result_link, result_id]
        if any(x in result_title.upper() for x in search_terms):
            if result_id not in existing_ids:
                csv_writer.writerow(result_info)
                # not implemented, but not relevant to the question
                # send_email(search_terms, result_link)
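If the file grows large, it's worth keeping existing_ids as a set instead of a list, since membership tests are then O(1) rather than a linear scan per result. A sketch of the same read-once / filter / append pattern, with io.StringIO and made-up rows standing in for the real file and the scrape:

```python
import csv
import io

# made-up rows standing in for results parsed from the page
scraped = [
    ['Blink 182 CD', 'https://www.ebay.com/itm/111', '111'],
    ['Blink 182 LP', 'https://www.ebay.com/itm/222', '222'],
]

# io.StringIO stands in for ebay-results.csv opened with 'r+'
csv_file = io.StringIO('Title,Link,ID\r\n'
                       'Blink 182 CD,https://www.ebay.com/itm/111,111\r\n')
csv_reader = csv.reader(csv_file)
next(csv_reader)  # skip headers
existing_ids = {rec[2] for rec in csv_reader}  # set: O(1) lookups

csv_writer = csv.writer(csv_file)  # same handle, now positioned at end-of-file
for row in scraped:
    if row[2] not in existing_ids:
        csv_writer.writerow(row)   # only the new '222' row is appended
```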
You asked how to deal with the headers; another issue might be that the .csv doesn't exist on the first run. Something like this would solve both:
from pathlib import Path

# create the file if it doesn't exist, and write a header to it
if not Path('ebay-results.csv').is_file():
    with open('ebay-results.csv', 'w', newline='') as csv_file:
        csv_file.write('Title,Link,ID\n')

# reopen the file for reading; it now must exist and has a header
with open('ebay-results.csv', 'r+', newline='') as csv_file:
    csv_reader = csv.reader(csv_file)
    field_names = next(csv_reader)
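For completeness: your original DictWriter approach works too. fieldnames is indeed always required by the constructor, but writeheader() is a separate call, so only invoke it when the file is new. A sketch using the same file name and field names as your script:

```python
import csv
from pathlib import Path

FILE = 'ebay-results.csv'
FIELD_NAMES = ['Title', 'Link', 'ID']

# fieldnames is always required by DictWriter, but writeheader()
# is a separate call: only invoke it when the file is freshly created
is_new = not Path(FILE).is_file()
with open(FILE, 'a', newline='') as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=FIELD_NAMES)
    if is_new:
        writer.writeheader()
    # writer.writerow({'Title': result_title, 'Link': result_link, 'ID': result_id})
```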