CSV Writer (Python) with CRLF instead of LF

Time:12-23

Hi, I am trying to use the csv library to convert my CSV file into a new one.

The code that I wrote is the following:

import csv
import re

file_read=r'C:\Users\Comarch\Desktop\Test.csv'
file_write=r'C:\Users\Comarch\Desktop\Test_new.csv'

def find_txt_in_parentheses(cell_txt):
    pattern = r'\(.+\)'
    return set(re.findall(pattern, cell_txt))

with open(file_write, 'w', encoding='utf-8-sig') as file_w:
    csv_writer = csv.writer(file_w, lineterminator="\n")
    with open(file_read, 'r',encoding='utf-8-sig') as file_r:
        csv_reader = csv.reader(file_r)
        for row in csv_reader:
            cell_txt = row[0]
            txt_in_parentheses = find_txt_in_parentheses(cell_txt)
            if len(txt_in_parentheses) == 1:
                txt_in_parentheses = txt_in_parentheses.pop()
                cell_txt_new = cell_txt.replace(' ' + txt_in_parentheses,'')
                cell_txt_new = txt_in_parentheses + '\n' + cell_txt_new
                row[0] = cell_txt_new
            csv_writer.writerow(row)

The only problem is that the resulting file (Test_new.csv) has CRLF instead of LF inside the cells. Here is a sample image of:

  • read file on the left
  • write file on the right:

[Image: the read file and the write file side by side]

As a result, when I copy the CSV column into a Google Docs spreadsheet, I get a blank line after each row that contains CRLF.

[Image: the pasted column with a blank line after each row]

Is it possible to write my code using the csv library so that LF is kept inside a cell instead of CRLF?

CodePudding user response:

From the documentation of csv.reader:

If csvfile is a file object, it should be opened with newline='' [1]
[...]

Footnotes

[1] If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.

This is precisely the issue you're seeing. So...

with open(file_read, 'r', encoding='utf-8-sig', newline='') as file_r, \
     open(file_write, 'w', encoding='utf-8-sig', newline='') as file_w:
     
    csv_reader = csv.reader(file_r, dialect='excel')
    csv_writer = csv.writer(file_w, dialect='excel')

    # ...
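
The effect of newline='' can be checked on any platform with a small sketch. It assumes that passing newline='\r\n' to open() is a fair stand-in for the Windows default text mode (both translate every '\n' that is written into '\r\n'). The write_rows helper and the sample row are made up for illustration.

```python
import csv
import os
import tempfile


def write_rows(rows, newline):
    """Write rows with csv.writer and return the raw bytes of the file."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8", newline=newline) as f:
            csv.writer(f, lineterminator="\n").writerows(rows)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# Hypothetical row: the first cell contains an embedded LF.
rows = [["(x)\nfoo", "1"]]

# Simulates the Windows default: every '\n' written becomes '\r\n'.
translated = write_rows(rows, newline="\r\n")
# newline='' switches the translation off, so the csv module's output is kept as-is.
untouched = write_rows(rows, newline="")

print(translated)  # b'"(x)\r\nfoo",1\r\n'
print(untouched)   # b'"(x)\nfoo",1\n'
```

In the first case the '\n' inside the quoted cell is translated as well, which matches the CRLF seen in Test_new.csv.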

CodePudding user response:

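You are on Windows, and you open the file in text mode ('w') without newline='', so every '\n' written is translated to the Windows line ending '\r\n'. In Python 2, opening the file with mode 'wb' gave the preferred behaviour; in Python 3 the csv module needs a text-mode file, so pass newline='' to open() instead.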
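
As a rough check of how binary mode behaves with the csv module on Python 3 (an assumption about the asker's version, based on the encoding= argument in the question's open() calls), the sketch below uses in-memory streams in place of real files. csv.writer passes str to write(), which a binary stream rejects, while a text stream created with newline='' keeps the embedded LF.

```python
import csv
import io

# A binary stream stands in for a file opened with mode 'wb'.
binary_stream = io.BytesIO()
try:
    csv.writer(binary_stream).writerow(["a", "b"])
    error = None
except TypeError as exc:
    # csv.writer hands str to write(); a binary stream only accepts bytes.
    error = exc

# A text stream with newline='' performs no newline translation.
text_stream = io.StringIO(newline="")
csv.writer(text_stream, lineterminator="\n").writerow(["(x)\nfoo", "b"])

print(type(error).__name__)          # TypeError
print(repr(text_stream.getvalue()))  # '"(x)\nfoo",b\n'
```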
