Re-ordering columns in a csv but with Dictionaries broken-CodePudding

I have a code that is braking because I am trying to reorganize columns but also ignoring other columns on the output csv file.

Input csv file:

book1.csv
A,B,C,D,E,F
a1,b1,c1,d1,e1,F1
a1,b1,c1,d1,e1,F1
a1,b1,c1,d1,e1,
a1,b1,c1,d1,e1,F1
a1,b1,c1,d1,e1,

My code:

import csv

order_of_headers_should_be = ['A', 'C', 'D', 'E', 'B']
dictionary = {'A':'X1','B':'Y1','C':'U1','D':'T1','E':'K1'}
new_headers = [dictionary[old_header] for old_header in order_of_headers_should_be]

with open('Book1.csv', 'r') as infile, open('reordered.csv', 'a') as outfile:
    # output dict needs a list for new column ordering
    writer = csv.DictWriter(outfile, fieldnames = new_headers)
    # reorder the header first
    writer.writeheader()
    for row in csv.DictReader(infile):
        new_row = {dictionary[old_header]: row[old_header] for old_header in row}
        writer.writerow(new_row)

my current output is only the headers (but they are in the correct order):

X1,U1,T1,K1,Y1

Getting an KeyError: 'F' But I need it to also output so it will look like this:

reordered.csv
X1,U1,T1,K1,Y1
a1,c1,d1,e1,b1
a1,c1,d1,e1,b1
a1,c1,d1,e1,b1
a1,c1,d1,e1,b1
a1,c1,d1,e1,b1

CodePudding user response：

When old_header is F you'll get a KeyError, so the for row loop will stop and you won't get any data rows in the output file.

Add a check for this to ther dictionary comprehension.

new_row = {dictionary[old_header]: value for old_header, value in row.items() if old_header in dictionary}

You could also loop through dictionary instead of row.

new_row = {new_header: row[old_header] for old_header, new_header in dictionary}

CodePudding user response：

Here is a simpler way to use pandas to do the heavy lifting.

import pandas as pd

# Read CSV file into DataFrame df
df = pd.read_csv('Book1.csv')

# delete F column
df = df.drop('F', axis=1)

# rename columns
df.columns = ['X1', 'Y1', 'U1', 'T1', 'K1']

# write to file in desired order
df.to_csv('book_out.csv', index=False,
          columns=['X1', 'U1', 'T1', 'K1', 'Y1'])