Encoding Error - 'charmap' codec can't encode character '\u015f'

Time:12-22

It seems that I cannot encode the character '\u015f' (letter s with cedilla). Please could someone help?

from selenium import webdriver
import time

with open('Violators_UNGC1.csv', 'w',encoding='utf-8'.replace(u"\u015f", "ş")) as file:
    file.write("Participants; Sectors; Countries; Expelled \n")

driver=webdriver.Chrome(executable_path='C:\webdrivers\chromedriver.exe')

driver.get('https://www.unglobalcompact.org/participation/report/cop/create-and-submit/expelled?page=1&per_page=250')

driver.maximize_window()
time.sleep(2)

for k in range(150):
    
    Participants = driver.find_elements("xpath",'//td[@]/a')  
    
    Sectors = driver.find_elements("xpath",'//td[@]')
    
    Countries = driver.find_elements("xpath",'//td[@]')
    
    Expelled = driver.find_elements("xpath",'//td[@]') 
    
    time.sleep(1)
    
    with open('Violators_UNGC1.csv', 'a') as file:
        for i in range(len(Participants)):
            file.write(Participants[i].text + ";" + Sectors[i].text + ";" + Countries[i].text + ";" + Expelled[i].text + "\n")
            
driver.close()

and I get the error message below:

UnicodeEncodeError                        Traceback (most recent call last)
Cell In [15], line 28
     26     with open('Violators_UNGC1.csv', 'a') as file:
     27         for i in range(len(Participants)):
---> 28             file.write(Participants[i].text + ";" + Sectors[i].text + ";" + Countries[i].text + ";" + Expelled[i].text + "\n")
     30 driver.close()

File ~\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py:19, in IncrementalEncoder.encode(self, input, final)
     18 def encode(self, input, final=False):
---> 19     return codecs.charmap_encode(input,self.errors,encoding_table)[0]

UnicodeEncodeError: 'charmap' codec can't encode character '\u015f' in position 32: character maps to <undefined>

Thank you all !

CodePudding user response:

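As mentioned in the comments, the default encoding of open() is not fixed: it depends on the platform and locale, and the traceback shows that it is cp1252 here. The first open() in the question declares UTF-8 (the .replace() call on the string 'utf-8' has no effect), but the second one, in mode 'a', declares nothing, so it falls back to cp1252 and fails on 'ş'. Declare the encoding explicitly on every open(); UTF-8 can encode every Unicode character. I also suggest opening the file once instead of re-opening it on every loop iteration, and using the csv module to write CSV files: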

import csv

# newline='' is recommended by the csv module docs; it avoids blank rows on Windows
with open('Violators_UNGC1.csv', 'w', encoding='utf-8', newline='') as file:
    w = csv.writer(file, delimiter=';')
    w.writerow(['Participants','Sectors','Countries','Expelled'])

    # Fake data for demonstration
    Participants = 'oneş','twoş','threeş'
    Sectors = 'sec1','sec2','sec3'
    Countries = 'USA','Germany','France'
    Expelled = 'A','B','C'

    # zip returns all the first items in each group, then the 2nd, etc.
    for row in zip(Participants, Sectors, Countries, Expelled):
        w.writerow(row)

Output file:

Participants;Sectors;Countries;Expelled
oneş;sec1;USA;A
twoş;sec2;Germany;B
threeş;sec3;France;C
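
To connect this back to the question, here is a minimal sketch that reproduces the cause without Selenium and then applies the same fix to objects that expose a .text attribute. The SimpleNamespace objects are hypothetical stand-ins for the Selenium elements, the file name demo.csv is a throwaway, and only two columns are shown to keep it short; treat it as an illustration rather than a drop-in replacement for the scraping loop.

```python
import csv
import os
import tempfile
from types import SimpleNamespace

# The character from the traceback: cp1252 has no mapping for it, UTF-8 does.
try:
    "\u015f".encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False
utf8_bytes = "\u015f".encode("utf-8")

# Hypothetical stand-ins for Selenium elements: only .text is used here.
participants = [SimpleNamespace(text="one\u015f"), SimpleNamespace(text="two\u015f")]
sectors = [SimpleNamespace(text="sec1"), SimpleNamespace(text="sec2")]

# Throwaway output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Open once, with an explicit encoding, and let csv.writer build each row.
with open(path, "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f, delimiter=";")
    w.writerow(["Participants", "Sectors"])
    for p, s in zip(participants, sectors):
        w.writerow([p.text, s.text])

# Read the file back with the same encoding to check the round trip.
with open(path, encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f, delimiter=";"))
```

In the real script, the lists returned by driver.find_elements would take the place of the stand-in lists, and the writer would be created once before the page loop.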