I cannot seem to write the character '\u015f' (ş, the letter s with cedilla) to a CSV file. Could someone please help? Here is my code:

    from selenium import webdriver
    import time

    with open('Violators_UNGC1.csv', 'w',encoding='utf-8'.replace(u"\u015f", "ş")) as file:
        file.write("Participants; Sectors; Countries; Expelled \n")

    driver=webdriver.Chrome(executable_path='C:\webdrivers\chromedriver.exe')
    driver.get('https://www.unglobalcompact.org/participation/report/cop/create-and-submit/expelled?page=1&per_page=250')
    driver.maximize_window()
    time.sleep(2)

    for k in range(150):
        Participants = driver.find_elements("xpath",'//td[@]/a')
        Sectors = driver.find_elements("xpath",'//td[@]')
        Countries = driver.find_elements("xpath",'//td[@]')
        Expelled = driver.find_elements("xpath",'//td[@]')
        time.sleep(1)
        with open('Violators_UNGC1.csv', 'a') as file:
            for i in range(len(Participants)):
                file.write(Participants[i].text + ";" + Sectors[i].text + ";" + Countries[i].text + ";" + Expelled[i].text + "\n")

    driver.close()

I get the following error message:

    UnicodeEncodeError                        Traceback (most recent call last)
    Cell In [15], line 28
         26 with open('Violators_UNGC1.csv', 'a') as file:
         27     for i in range(len(Participants)):
    ---> 28         file.write(Participants[i].text + ";" + Sectors[i].text + ";" + Countries[i].text + ";" + Expelled[i].text + "\n")
         30 driver.close()

    File ~\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py:19, in IncrementalEncoder.encode(self, input, final)
         18 def encode(self, input, final=False):
    ---> 19     return codecs.charmap_encode(input,self.errors,encoding_table)[0]

    UnicodeEncodeError: 'charmap' codec can't encode character '\u015f' in position 32: character maps to <undefined>

Thank you all!

Answer:
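As mentioned in the comments, the default encoding of open() depends on the platform locale, so it should be declared explicitly. In your code only the first open() call specifies an encoding (the .replace() on 'utf-8' does nothing, because that string contains no 'ş'). The second call, which appends the rows, omits the encoding and falls back to cp1252, which has no mapping for 'ş', as the traceback shows. UTF-8 can encode every Unicode character. I also suggest opening the file once instead of re-opening it repeatedly inside the loop, and using the csv module to write CSV files: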

    import csv

    # Declare the encoding explicitly; newline='' lets the csv module
    # control the line endings itself, as its documentation recommends.
    with open('Violators_UNGC1.csv', 'w', encoding='utf-8', newline='') as file:
        w = csv.writer(file, delimiter=';')
        w.writerow(['Participants','Sectors','Countries','Expelled'])

        # Fake data for demonstration
        Participants = 'oneş','twoş','threeş'
        Sectors = 'sec1','sec2','sec3'
        Countries = 'USA','Germany','France'
        Expelled = 'A','B','C'

        # zip returns all the first items in each group, then the 2nd, etc.
        for row in zip(Participants, Sectors, Countries, Expelled):
            w.writerow(row)

Output file:
Participants;Sectors;Countries;Expelled
oneş;sec1;USA;A
twoş;sec2;Germany;B
threeş;sec3;France;C
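
To see the difference between the two encodings without running the scraper, here is a minimal sketch. The helper name write_rows, the temporary file, and the single made-up row are assumptions for illustration only and are not part of the original code. It writes the same row once with cp1252 (what the traceback shows open() falling back to) and once with UTF-8:

```python
import csv
import os
import tempfile


def write_rows(path, header, rows, encoding="utf-8"):
    """Hypothetical helper: write a semicolon-delimited CSV, opening the file once."""
    # newline='' leaves line-ending handling to the csv module.
    with open(path, "w", encoding=encoding, newline="") as f:
        w = csv.writer(f, delimiter=";")
        w.writerow(header)
        w.writerows(rows)


header = ["Participants", "Sectors", "Countries", "Expelled"]
rows = [("oneş", "sec1", "USA", "A")]  # made-up row containing U+015F

path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# cp1252 has no mapping for U+015F, so this reproduces the error from the question.
try:
    write_rows(path, header, rows, encoding="cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# With an explicit UTF-8 encoding the same row is written without error.
write_rows(path, header, rows)
with open(path, encoding="utf-8", newline="") as f:
    round_trip = f.read()

print(cp1252_failed)
print(round_trip.splitlines())
```

In the scraper, the rows argument could be built with zip() from the .text values of the elements found on each page, so the file only needs to be opened once for the whole run.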