Reading CSV file and Writing it to a CSV File-CodePudding

I'll cut the chase, I found a code which "Overwriting a specific row in a csv file using Python's CSV module"

import csv

bottle_list = []

# Read all data from the csv file.
with open('a.csv', 'rb') as b:
    bottles = csv.reader(b)
    bottle_list.extend(bottles)

# data to override in the format {line_num_to_override:data_to_write}. 
line_to_override = {1:['e', 'c', 'd'] }

# Write data to the csv file and replace the lines in the line_to_override dict.
with open('a.csv', 'wb') as b:
    writer = csv.writer(b)
    for line, row in enumerate(bottle_list):
         data = line_to_override.get(line, row)
         writer.writerow(data)

But for whatever reason I can't succeed to run it as I'm encountering error in line 8

_csv.Error: iterator should return strings, not bytes (the file should be opened in text mode)

I tried changing the open mode from rb to rt to read it as text mode, but second errors popups in line 18

TypeError: a bytes-like object is required, not 'str'

Changing both open modes from rb to rt and wb to wt fixed both errors, but the result of the csv files contains empty spaces between rows.

I saw a answered question about this. Link

The csv.writer writes \r\n into the file directly. If you don't open the file in binary mode, it will write \r\r\n because on Windows text mode will translate each \n into \r\n.

Any tips on how to overcome this? I've been wondering around answers for quite too long. Thank you

CodePudding user response：

Those links are from older answers, and seem to deal with Python2. Most notably:

I cannot think of a situation where it'd be correct to open a file in binary for reading CSV; omit "r" entirely for readling (lean on the default), and use just "w" for writing.
For reading and writing, the csv docs recommend using newline="". This ties into the second link you referenced, but writing in binary is not the answer to avoid line-ending problems.
Just declare the list of input rows in the body of the with open(... context, and use the list() function to materialize the reader's iterator into a list.

I also:

Added the start= option for enumerate() to show how to control for the presence of a header in the input CSV.
Renamed some vars, for what I think is easier reading.
Made the input/output files different to follow my own rule, Never overwrite the input in code; move/rename with the OS once you're satisfied with the result.

Here's how to do it in Python3, with this input:

Col1,Col2,Col3
a,b,c
d,e,f
g,h,i

with open("input.csv", newline="") as f:
    reader = csv.reader(f)
    input_rows = list(reader)


# Make these row indexes 1-based, so `1: [...]` means "first row"
override_rows = {
    1: ["A", "B", "C"],
    3: ["G", "H", "I"],
}


with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)

    # start=0 if your input has header, otherwise start=1
    for row_num, row in enumerate(input_rows, start=0):
        data = override_rows.get(row_num, row)
        writer.writerow(data)

Run that as is and I get:

Col1,Col2,Col3
A,B,C
d,e,f
G,H,I

If I were to remove the header from the input:

a,b,c
d,e,f
g,h,i

but leave start=0, I'd get:

a,b,c
A,B,C
g,h,i

because "a,b,c" is row-0, "d,e,f" is row-1, and my override states that row-1 should be "A,B,C".