Hi, I am trying to use the csv library to convert my CSV file into a new one.
The code that I wrote is the following:
import csv
import re

file_read = r'C:\Users\Comarch\Desktop\Test.csv'
file_write = r'C:\Users\Comarch\Desktop\Test_new.csv'

def find_txt_in_parentheses(cell_txt):
    # Collect every distinct "(...)" fragment found in the cell text
    pattern = r'\(.+\)'
    return set(re.findall(pattern, cell_txt))

with open(file_write, 'w', encoding='utf-8-sig') as file_w:
    csv_writer = csv.writer(file_w, lineterminator="\n")
    with open(file_read, 'r', encoding='utf-8-sig') as file_r:
        csv_reader = csv.reader(file_r)
        for row in csv_reader:
            cell_txt = row[0]
            txt_in_parentheses = find_txt_in_parentheses(cell_txt)
            if len(txt_in_parentheses) == 1:
                txt_in_parentheses = txt_in_parentheses.pop()
                # Move the parenthesized text onto its own line at the top of the cell
                cell_txt_new = cell_txt.replace(' ' + txt_in_parentheses, '')
                cell_txt_new = txt_in_parentheses + '\n' + cell_txt_new
                row[0] = cell_txt_new
            csv_writer.writerow(row)
The only problem is that in the resulting file (Test_new.csv), I have CRLF instead of LF.
Here is a sample image of:
- read file on the left
- write file on the right:
As a result, when I copy the CSV column into a Google Docs spreadsheet, I get a blank line after each row that contains CRLF.
Is it possible to write my code using the csv library so that LF is kept inside a cell instead of CRLF?
CodePudding user response:
From the documentation of csv.reader:
If csvfile is a file object, it should be opened with newline=''. [1]
[...]
Footnotes
[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n linendings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
This is precisely the issue you're seeing. So...
with open(file_read, 'r', encoding='utf-8-sig', newline='') as file_r, \
     open(file_write, 'w', encoding='utf-8-sig', newline='') as file_w:
    csv_reader = csv.reader(file_r, dialect='excel')
    csv_writer = csv.writer(file_w, dialect='excel')
    # ...
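If you want to check this without touching your real files, here is a minimal sketch that round-trips a tiny CSV through temporary files opened with newline=''. The helper name copy_csv, the file names in.csv and out.csv, and the sample cell content are made up for illustration; only the open() and csv calls come from the snippet above.

```python
import csv
import os
import tempfile


def copy_csv(src, dst):
    """Copy a CSV file row by row; both files are opened with newline=''."""
    with open(src, 'r', encoding='utf-8', newline='') as file_r, \
         open(dst, 'w', encoding='utf-8', newline='') as file_w:
        csv_reader = csv.reader(file_r, dialect='excel')
        csv_writer = csv.writer(file_w, dialect='excel')
        csv_writer.writerows(csv_reader)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'in.csv')   # hypothetical file names
    dst = os.path.join(tmp, 'out.csv')

    # One row: a quoted cell with an embedded LF, and a CRLF row terminator.
    with open(src, 'wb') as f:
        f.write(b'"(note)\nfirst",x\r\n')

    copy_csv(src, dst)

    # Read the raw bytes back so no newline translation hides the result.
    with open(dst, 'rb') as f:
        result = f.read()

print(result)
```

The embedded newline stays a bare LF inside the cell, while the row itself still ends with the excel dialect's \r\n terminator.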
CodePudding user response:
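You are on Windows, and you open the file with mode 'w', which is text mode and translates every '\n' you write into Windows-style '\r\n' line endings. In Python 2, using mode 'wb' should give you the preferred behaviour. In Python 3 the csv module needs a text-mode file and 'wb' cannot be combined with encoding=, so pass newline='' to open() instead.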
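The translation can be reproduced on any platform by asking for it explicitly. This is only a sketch: csv_bytes is a hypothetical helper, and newline='\r\n' on an in-memory io.TextIOWrapper stands in for what text mode does by default on Windows.

```python
import csv
import io


def csv_bytes(rows, newline):
    """Return the bytes csv.writer produces through a text layer with the given newline setting."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    csv.writer(text, lineterminator='\n').writerows(rows)
    text.flush()  # push buffered text down into the BytesIO
    return raw.getvalue()


rows = [['(note)\nfirst', 'x']]

# newline='\r\n' mimics the Windows default: every '\n' written becomes '\r\n',
# including the one embedded inside the quoted cell.
translated = csv_bytes(rows, '\r\n')

# newline='' disables translation, so the csv module's output is written as-is.
untouched = csv_bytes(rows, '')

print(translated)
print(untouched)
```

With translation on, the LF inside the cell turns into CRLF even though lineterminator="\n" was passed to csv.writer, which matches what the question describes.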