Home > other >  Reading, formatting, sorting, and saving a csv file without pandas
Reading, formatting, sorting, and saving a csv file without pandas

Time:10-06

I am given some sample data in a file --> transactions1.csv transactions1

I need to do code a function that will do the following without using pandas:

  1. Open the file
  2. Sort the index column in ascending order
  3. Save the updated data back to the same file

This is the code I currently have

import csv

def sort_index():
    read_file=open("transactions1.csv","r")
    r=csv.reader(read_file)
    lines=list(r)
    sorted_lines=sorted(lines[1:], key=lambda row: row[0])
    read_file.close()
    
    with open('transactions1.csv','w',newline='') as file_writer:
        header=['Index','Day','Month','Year','Category','Amount','Notes'] #-
        writer=csv.DictWriter(file_writer, fieldnames=header)
        writer=csv.writer(file_writer)
        writer.writerows(sorted_lines)
        
    return False

sort_index()

Current output:

1,2,Dec,2021,Transport,5,MRT cost
10,19,May,2020,Transport,25,taxi fare
2,5,May,2021,investment,7,bill
3,2,Aug,2020,Bills,8,small bill
4,15,Feb,2021,Bills,40,phone bill
5,14,Oct,2021,Shopping,100,shopping 20
6,27,May,2021,Others,20,other spend
7,19,Nov,2019,Investment,1000,new invest
8,28,Mar,2020,Food,4,drink
9,18,Nov,2019,Shopping,15,clothes

The code doesn't seem to work because index 10 appears right after index 1. And the headers are missing.

Expected Output:

Index,Day,Month,Year,Category,Amount,Notes
1,2,Dec,2021,Transport,5,MRT cost
10,19,May,2020,Transport,25,taxi fare
2,5,May,2021,investment,7,bill
3,2,Aug,2020,Bills,8,small bill
4,15,Feb,2021,Bills,40,phone bill
5,14,Oct,2021,Shopping,100,shopping 20
6,27,May,2021,Others,20,other spend
7,19,Nov,2019,Investment,1000,new invest
8,28,Mar,2020,Food,4,drink
9,18,Nov,2019,Shopping,15,clothes

CodePudding user response:

Several improvements can be made to the current function.

No need to call list on the reader object, we can sort it directly as it is iterable.

We can extract the headers from the reader object by calling next directly on the reader object.

It is more maintainable if we define a custom key function not using lambda, that way if something changes only the function definition needs to change.

Use a with statement to automatically handle closing the file object.

def _key(row):
    return int(row[0])

def sort_index():
    with open("transactions1.csv", "r") as fin:
        reader = csv.reader(fin)
        header = next(reader)
        rows = sorted(reader, key=_key)
    with open("transactions1.csv", "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow(header)
        writer.writerows(rows)
  • Related