How to use a secondary delimiter for every 6th string generated by using split function on a primary

Time:08-21

I have a pipe-delimited file in which each record ends with a newline after every six pipe-delimited fields, as follows. Note that some fields themselves contain newlines.

uid216|Banana
bunches
nurture|Fail|76|7645|Singer

uid342|Orange
vulture|Pass|56
87|3547|Actor

I was using Python's split function to convert the records in the file into a list of strings.

  parts = file_str.split('|')

However, I don't understand how to treat a newline character as the delimiter for every 6th string only. Can someone please help me?

CodePudding user response:

The right way to do this is probably to use Python's csv module for reading delimited files and stream the data from the file rather than reading it all into memory at once. When you read the whole file into a string you essentially have to iterate over it twice.

import csv

def process_file(path):
    # newline='' lets the csv module handle line endings itself
    with open(path, 'r', newline='') as file_handle:
        reader = csv.reader(file_handle, delimiter='|')
        for row in reader:
            # row is a list whose entries are the fields of the delimited row;
            # do what you want with it.
            print(row)
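
Since the sample in the question has newlines inside some fields, the reader will return those records in several pieces. The following is a minimal sketch of one way to regroup them while still streaming, not a definitive implementation. The names `iter_records` and `fields_per_record` are hypothetical. It assumes the last field of a record never contains a newline and that blank lines only separate records.

```python
import csv
import io

def iter_records(lines, fields_per_record=6):
    """Yield one list of fields per record, rejoining fields split by newlines."""
    record = []
    for row in csv.reader(lines, delimiter='|'):
        if not row:
            continue  # assumption: blank lines only separate records
        if record:
            # Continuation line: its first piece belongs to the last field so far.
            record[-1] += '\n' + row[0]
            record.extend(row[1:])
        else:
            record = row
        if len(record) >= fields_per_record:
            yield record
            record = []
    if record:
        yield record  # incomplete trailing record, passed through as-is

# Sample data copied from the question.
sample = io.StringIO(
    "uid216|Banana\nbunches\nnurture|Fail|76|7645|Singer\n"
    "\n"
    "uid342|Orange\nvulture|Pass|56\n87|3547|Actor\n"
)
records = list(iter_records(sample))
```

With a real file, pass the handle from `open(path, newline='')` in place of the `io.StringIO` object.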

CodePudding user response:

This is an XY problem. You're looking at the data sideways (figuratively). What you should be doing is looping over records and splitting each one on pipes. That could look something like this:

with open(filename) as f:
    parts = []
    for line in f:
        record = line.rstrip('\n')
        parts.append(record.split('|'))

Or, as a comprehension:

with open(filename) as f:
    parts = [line.rstrip('\n').split('|') for line in f]

CodePudding user response:

parts = []

# Iterate over each line by splitting on "\n", then use extend
# to gather all strings in a single flat list.
for line in file_str.split("\n"):
    parts.extend(line.split("|"))

print(parts)
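
If fields can contain newlines, as in the question's sample, splitting on "\n" first will cut those fields apart. A hedged alternative is to split on "|" first and treat a newline as a record boundary only in every 6th string. This is a sketch under assumptions: `split_records` and `fields_per_record` are hypothetical names, and the first field of each record (the uid) is assumed never to contain a newline.

```python
def split_records(text, fields_per_record=6):
    """Split on '|' and break records at the newline inside every 6th string."""
    tokens = text.strip('\n').split('|')
    records = []
    current = []
    for token in tokens:
        if len(current) == fields_per_record - 1 and '\n' in token:
            # This token holds the last field of one record and the
            # first field of the next, separated by newline(s).
            last, first = token.rsplit('\n', 1)
            current.append(last.rstrip('\n'))
            records.append(current)
            current = [first]
        else:
            current.append(token)
    if current:
        records.append(current)  # final record has no following boundary
    return records

# Sample data copied from the question.
file_str = (
    "uid216|Banana\nbunches\nnurture|Fail|76|7645|Singer\n"
    "\n"
    "uid342|Orange\nvulture|Pass|56\n87|3547|Actor\n"
)
records = split_records(file_str)
```

Here `records[1][3]` keeps its embedded newline, while the newline after `Singer` is used only to end the first record.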