Need to merge together certain values within extracted list into sublists

Time:12-08

I am extracting this block of text from a file to convert into a dictionary:

ACC210:
Luther, Martin
Spurgeon, Charles

CS121P:
Bunyan, John
Henry, Matthew
Luther, Martin

CS132S:
Calvin, John
Knox, John
Owen, John

and this is the code I used to open it and create two lists so I can use them to create the dictionary:

with open("classes.txt") as file:
    data = [line.strip() for line in file]
    a = []
    b = []
    for x in data:
        if ':' in x:
            a.append(x)
        else:
            b.append(x)

The lists come out as

['ACC210:', 'CS121P:', 'CS132S:']
['Luther, Martin', 'Spurgeon, Charles', '', 'Bunyan, John', 'Henry, Matthew', 'Luther, Martin', '', 'Calvin, John', 'Knox, John', 'Owen, John', '']

However, I need the second list to look like this:

[['Luther, Martin', 'Spurgeon, Charles'],['Bunyan, John', 'Henry, Matthew','Luther, Martin'], ['Calvin, John', 'Knox, John', 'Owen, John']]

How would I go about that?

CodePudding user response:

You can try the following:

lst1, lst2 = [], []
with open('input.txt', 'r') as f:
    for line in map(lambda x: x.rstrip('\n'), f):
        if ':' in line:
            lst1.append(line)
            lst2.append(names := [])
        elif line:
            names.append(line)

print(lst1) # ['ACC210:', 'CS121P:', 'CS132S:']
print(lst2)
# [['Luther, Martin', 'Spurgeon, Charles'], ['Bunyan, John', 'Henry, Matthew', 'Luther, Martin'], ['Calvin, John', 'Knox, John', 'Owen, John']]

The line lst2.append(names := []) uses an assignment expression, which requires Python 3.8 or later. If that line does not work for you, use:

names = []
lst2.append(names)
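
Since the end goal is a dictionary, the two lists can be paired with zip once they are built. The following is only a sketch of that step, not part of the answer above. It reads from an in-memory string (SAMPLE, a hypothetical stand-in for the file) so that it runs on its own. The helper name parse_classes is made up for this example. Stripping the trailing colon from the keys is an assumption about what the dictionary should look like.

```python
import io

# Hypothetical stand-in for the contents of classes.txt.
SAMPLE = """ACC210:
Luther, Martin
Spurgeon, Charles

CS121P:
Bunyan, John
Henry, Matthew
Luther, Martin

CS132S:
Calvin, John
Knox, John
Owen, John
"""

def parse_classes(f):
    """Return (headers, groups) read from an open text file-like object."""
    headers, groups = [], []
    for line in f:
        line = line.strip()
        if ':' in line:
            # A header line starts a new, initially empty group of names.
            headers.append(line)
            groups.append([])
        elif line and groups:
            # Non-empty lines belong to the most recently started group.
            groups[-1].append(line)
    return headers, groups

headers, groups = parse_classes(io.StringIO(SAMPLE))

# Pair each header with its group; rstrip(':') drops the trailing colon.
classes = {h.rstrip(':'): g for h, g in zip(headers, groups)}
print(classes)
```

Appending to groups[-1] avoids the := operator, so this variant also runs on Python versions before 3.8.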

CodePudding user response:

If you are trying to make a dict, you can do the following (fn is the path to your file):

import re 

di={}
with open(fn) as f:
    for k,v in re.findall(r'(^.*):([\s\S]*?)(?=^$|\Z)', f.read(), flags=re.M):
        di[k]=v.strip().splitlines()

>>> di
{'ACC210': ['Luther, Martin', 'Spurgeon, Charles'], 'CS121P': ['Bunyan, John', 'Henry, Matthew', 'Luther, Martin'], 'CS132S': ['Calvin, John', 'Knox, John', 'Owen, John']}
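
To see what the pattern captures, it may help to run it on a short in-memory string. In the sketch below, the variable names text, pattern and parsed are chosen only for this illustration. Group 1 is the header text before the colon. Group 2 lazily takes everything up to the next empty line or the end of the string, which is why it still needs strip() and splitlines().

```python
import re

# A shortened, hypothetical sample with two blocks separated by a blank line.
text = "ACC210:\nLuther, Martin\nSpurgeon, Charles\n\nCS121P:\nBunyan, John\n"

# With re.M, ^ and $ match at line boundaries, so (?=^$|\Z) stops the lazy
# group at the first empty line or at the end of the string.
pattern = re.compile(r'(^.*):([\s\S]*?)(?=^$|\Z)', flags=re.M)

parsed = []
for key, body in pattern.findall(text):
    # body still carries its surrounding newlines; strip them, then split.
    parsed.append((key, body.strip().splitlines()))

for key, names in parsed:
    print(key, names)
```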

If you want two lists:

import re 

a,b=[],[]
with open(fn) as f:
    for k,v in re.findall(r'(^.*):([\s\S]*?)(?=^$|\Z)', f.read(), flags=re.M):
        a.append(k)
        b.append(v.strip().splitlines())

>>> a
['ACC210', 'CS121P', 'CS132S']
>>> b
[['Luther, Martin', 'Spurgeon, Charles'], ['Bunyan, John', 'Henry, Matthew', 'Luther, Martin'], ['Calvin, John', 'Knox, John', 'Owen, John']]

You can also do this without the regex:

a,b=[],[]
with open(fn) as f:
    for k, sl in ((sl[0], sl[1:]) 
        for sl in (e.splitlines() 
            for e in f.read().rstrip().split('\n\n'))):
        a.append(k.rstrip(':'))
        b.append(sl)

# same a,b
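
The nested generator expression above is compact but dense. An unrolled version of the same idea might look like the sketch below. The function name split_blocks and the inline sample string are assumptions made for this example, and it assumes blocks are separated by exactly one blank line.

```python
def split_blocks(text):
    """Split text into (headers, groups) using blank lines as separators."""
    a, b = [], []
    # rstrip() removes trailing newlines so the last block is not followed
    # by an empty one; '\n\n' marks the blank line between blocks.
    for block in text.rstrip().split('\n\n'):
        lines = block.splitlines()
        a.append(lines[0].rstrip(':'))  # first line is the header
        b.append(lines[1:])             # remaining lines are the names
    return a, b

# Hypothetical sample data standing in for f.read().
sample = "ACC210:\nLuther, Martin\nSpurgeon, Charles\n\nCS121P:\nBunyan, John\n"
a, b = split_blocks(sample)
print(a)
print(b)
```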

With the same method you can create a dict directly (the keys keep their trailing colon here, because sl[0] is not stripped):

with open(fn) as f:
    di={sl[0]:sl[1:] for sl in (e.splitlines() 
            for e in f.read().rstrip().split('\n\n'))}

>>> di
{'ACC210:': ['Luther, Martin', 'Spurgeon, Charles'], 'CS121P:': ['Bunyan, John', 'Henry, Matthew', 'Luther, Martin'], 'CS132S:': ['Calvin, John', 'Knox, John', 'Owen, John']}