How can I change a column of a text file into another format?

I'm currently parsing a data file. How can I change the file from the format in the left figure to the format in the right figure?

I'm using the following code:

temp_f = open(file_name, 'r')
write = open(file_name_modified, 'w')

temp1 = []
while True:
    line = temp_f.readline()

    # readline() returns '' at end of file; check before using the line
    if not line:
        break

    if line.startswith('>'):
        write.write(line)
    else:
        temp1.append(line)

temp_f.close()
write.close()

CodePudding user response:

You can use itertools.groupby. Supposing you have a list of lines,

lines = [
    '> Name 1',
    'Information1',
    'Information2',
    'Information3',
    'Information4',
    'Information5',
    '> Name 2',
    'Information1',
    'Information2',
    'Information3',
    'Information4',
    'Information5',    
    'Information6',    
    '> Name 3',
    'Information1',
    'Information2',
    'Information3',
    'Information4',
]

you can get the expected output by doing

import itertools
for _, value in itertools.groupby(lines, key=lambda line: line.startswith('>')):
    print(' '.join(value))

This will output

> Name 1
Information1 Information2 Information3 Information4 Information5
> Name 2
Information1 Information2 Information3 Information4 Information5 Information6
> Name 3
Information1 Information2 Information3 Information4

Reading and writing is up to you.
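
For completeness, here is a minimal sketch of how the reading and writing could be wired around the same groupby idea. The function names join_records and convert are made up for illustration; file_name and file_name_modified are the variables from the question. It assumes header lines start with '>' and that blank lines can be dropped.

```python
import itertools


def join_records(lines):
    """Yield each header line as-is and each run of information lines joined by spaces."""
    # Strip trailing newlines and skip blank lines so they don't end up in the joined text.
    stripped = (line.rstrip('\n') for line in lines)
    non_empty = (line for line in stripped if line)
    for is_header, group in itertools.groupby(non_empty, key=lambda line: line.startswith('>')):
        if is_header:
            # Keep every header on its own line, even if two headers are adjacent.
            yield from group
        else:
            yield ' '.join(group)


def convert(file_name, file_name_modified):
    """Read file_name line by line and write the joined records to file_name_modified."""
    with open(file_name, 'r') as src, open(file_name_modified, 'w') as dst:
        for out_line in join_records(src):
            dst.write(out_line + '\n')
```

Calling convert(file_name, file_name_modified) streams the input, so the whole file never has to be held in memory. Unlike the short loop above, adjacent header lines are written on separate lines rather than joined.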

CodePudding user response:

I'm guessing that the lines inside of file_name are delimited by \n. If that's the case, then I would do something like this:

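with open(file_name, 'r') as f:
    l = f.read().split('\n')
new_file_string = ''
newest_group = ''
for item in l:
    if not item:
        continue  # skip empty lines, e.g. the one produced by a trailing newline
    if item[0] == '>':
        # flush the previous group, then start a new one with this header
        new_file_string = new_file_string + '\n' + newest_group.rstrip()
        newest_group = item + '\n'
    else:
        newest_group = newest_group + item + ' '
# the loop never flushes the last group, so append it here
new_file_string = new_file_string + '\n' + newest_group.rstrip()
# str.lstrip returns a new string, so the result has to be assigned
new_file_string = new_file_string.lstrip('\n')
with open(modified_file_name, 'w') as f:
    f.write(new_file_string)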