Home > OS >  Convert multi line txt data to a CSV
Convert multi line txt data to a CSV

Time:11-12

I was wondering if someone could point me in the right direction.

Here is a sample of my data.

TRANS,"GUS000017787609","","","INSTL","","","","","","",,"","",20211025,
MTPNT,"",45654,"","","","","",,,
ASSET,"","INSTL","METER","","CR","G4SZV-2","FLN",2020,"XXXTYU422000","32","","LI"

I need to somehow turn this kind of information in a CSV using python. I've got thousands of rows of data and each TRANS,MTPNT and ASSET is considered one "row".

Does anyone know what the best type of technique would be to preform ETL on data of this kind?

CodePudding user response:

You could use the grouper recipe to read 3 CSV lines at a time and combine them. For example:

import csv
from itertools import zip_longest, chain

def grouper(iterable, n, fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
    
    
with open('input.csv') as f_input, open('output.csv', 'w', newline='') as f_output:    
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)
    
    for triple_row in grouper(csv_input, 3):
        row = list(chain.from_iterable(triple_row))
        #row[2] = 'test'      # modify 3rd value before writing  
        csv_output.writerow(row)

Giving you:

TRANS,GUS000017787609,,,INSTL,,,,,,,,,,20211025,,MTPNT,,45654,,,,,,,,,ASSET,,INSTL,METER,,CR,G4SZV-2,FLN,2020,XXXTYU422000,32,,LI
  • Related