How do I convert a .txt to a list of lists

Time:08-22

I need to read a text file file.txt that looks like this:

NaN
NaN
[       From      To Type             When  Price
0  SillyZir  0x4a34  Bid  June 18th, 2022  50000]
NaN
[       From          To Type             When  Price
0  SillyZir  Klima#3171  Bid  June 16th, 2022  60000]

I have tried this code:

with open("file.txt") as f:
    lines = [line.rstrip() for line in f]

but my output looks like this, which is not what I want:

['NaN',
 'NaN',
 '[       From      To Type             When  Price',
 '0  SillyZir  0x4a34  Bid  June 18th, 2022  50000]',
 'NaN',
 '[       From          To Type             When  Price',
 '0  SillyZir  Klima#3171  Bid  June 16th, 2022  60000]']

I would like to access the inner lists, but the code splits each list into two lines after "Price", and I don't know how to work around that.

I have done some research but couldn't find anything that works. I'm kinda new to Python, so I would really appreciate some help!

Thank you!

CodePudding user response:

You can use basic logic with branching if-statements, and a regex to split your lists on spaces or tabs:

import re

inside = False  # True while we are inside a bracketed list
result = []
sublist = []

for line in lines:
    if not inside:
        if line.startswith('['):
            inside = True
            # split on runs of spaces or tabs; words[0] is the lone '['
            words = re.split(r'[ \t]+', line)
            sublist.extend(words[1:])
        else:
            result.append(line)

    else:
        words = re.split(r'[ \t]+', line)
        if line.endswith(']'):
            inside = False
            # strip the closing bracket from the last word instead of dropping the word
            words[-1] = words[-1].rstrip(']')
            sublist.extend(words)
            result.append(sublist)
            sublist = []
        else:
            sublist.extend(words)

Result:

[
    'NaN',
    'NaN',
    ['From', 'To', 'Type', 'When', 'Price', '0', 'SillyZir', '0x4a34', 'Bid', 'June', '18th,', '2022', '50000'],
    'NaN',
    ['From', 'To', 'Type', 'When', 'Price', '0', 'SillyZir', 'Klima#3171', 'Bid', 'June', '16th,', '2022', '60000']
]
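
A minimal sketch of the same bracket-tracking idea wrapped in a function, so it can be tried on an in-memory copy of the sample instead of file.txt. The names parse_blocks and SAMPLE are made up for illustration, and the sketch assumes every block opens with '[' at the start of a line and closes with ']' at the end of a line. It strips the closing bracket from the last token rather than dropping the token, so the price is kept.

```python
import re

# Hypothetical in-memory copy of the first lines of file.txt
SAMPLE = """NaN
[       From      To Type             When  Price
0  SillyZir  0x4a34  Bid  June 18th, 2022  50000]
"""


def parse_blocks(lines):
    """Collect each [...] block into one flat sublist; keep other lines as-is."""
    result, sublist, inside = [], [], False
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if not inside and not line.startswith('['):
            result.append(line)  # plain entry such as 'NaN'
            continue
        inside = True
        # remove the enclosing brackets, then split on runs of spaces or tabs
        sublist.extend(re.split(r'[ \t]+', line.strip('[] \t')))
        if line.endswith(']'):
            result.append(sublist)
            sublist, inside = [], False
    return result


print(parse_blocks(SAMPLE.splitlines()))
```

Note that splitting on single runs of whitespace breaks the date into 'June', '18th,' and '2022'. If the date should stay in one piece, splitting on two or more spaces is one option.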

CodePudding user response:

I processed the lines and extracted the words. See if this result is what you need:

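import re

with open('file.txt') as f:
    lis = f.read().splitlines()  # unlike split('\n'), leaves no trailing empty string

result = []
for e in lis:
    if e.startswith('['):
        # header row: drop the opening bracket and split on whitespace
        result.append(e.strip('[').split())
    elif e[:1].isdigit():
        # data row: split on runs of 2+ spaces so 'June 18th, 2022' stays whole,
        # then drop the leading row index
        result.append(re.split(r'\s{2,}', e.rstrip(']').strip())[1:])
    elif e == 'NaN':
        result.append(e)
print(result)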

Output:

['NaN',
 'NaN',
 ['From', 'To', 'Type', 'When', 'Price'],
 ['SillyZir', '0x4a34', 'Bid', 'June 18th, 2022', '50000'],
 'NaN',
 ['From', 'To', 'Type', 'When', 'Price'],
 ['SillyZir', 'Klima#3171', 'Bid', 'June 16th, 2022', '60000']]
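
If one sublist per bracketed block is wanted, the header row and the data row can be paired up as they are read. The following is only a sketch under the assumption that each block is exactly one header line followed by data lines that start with a row index. The names parse_nested and SAMPLE are hypothetical, and the sample is an in-memory copy of part of file.txt.

```python
import re

# Hypothetical in-memory copy of part of file.txt
SAMPLE = """NaN
[       From          To Type             When  Price
0  SillyZir  Klima#3171  Bid  June 16th, 2022  60000]
"""


def parse_nested(text):
    """Turn each bracketed block into [header, row, ...]; keep other lines as-is."""
    result, block = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('['):
            block = [line.lstrip('[').split()]  # start a block with the column names
        elif block is not None and line[:1].isdigit():
            # split on 2+ spaces so the date stays in one piece, drop the row index
            block.append(re.split(r'\s{2,}', line.rstrip(']').strip())[1:])
            if line.endswith(']'):
                result.append(block)  # block is closed
                block = None
        elif line:
            result.append(line)  # plain entry such as 'NaN'
    return result


nested = parse_nested(SAMPLE)
print(nested[1][0])  # header of the first block
print(nested[1][1])  # its data row
```

With this layout, nested[1][1][-1] is the price of the first block as a string.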