How do I make strings in a file get into separate sets when a specific word is mentioned?

I'm trying to check whether a certain word appears in a file. The words under it should then become part of a set, and each set should be put in a list. For instance, the file would say:

COUNTRIES
America
Canada
Russia
Poland

PEOPLE
George
John
James
Kenny

Which would then become a list like this:

[{'America', 'Canada', 'Russia', 'Poland'}, {'George', 'John', 'James', 'Kenny'}]

I started off by doing this to check if I can start going through each individual string:

input = open('countries.txt', 'r')

l = input.readline()
while l.startswith('COUNTRIES'):
     j = input.readline
     if j == 'PEOPLE':
        break

This code runs forever and never stops. I figured that if I could work out why it does not stop when it reaches the word PEOPLE, I could then separate the strings under PEOPLE and COUNTRIES into separate sets.

CodePudding user response：

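You never change l, so it will ALWAYS start with "COUNTRIES". That's why the loop never ends. Note also that input.readline without parentheses never reads a line, and a line that is read keeps its trailing newline, so j == 'PEOPLE' could never be true. Try this: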

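header = None
track = {}
with open('countries.txt', 'r') as f:
    for line in f:
        line = line.strip()          # drop the newline and surrounding spaces
        if line.isupper():           # an all-caps line starts a new section
            header = line
            track[header] = []
        elif line:                   # skip blank lines
            track[header].append(line)

# track == {'COUNTRIES': ['America', ...], 'PEOPLE': ['George', ...]}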
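
If you want the list of sets from the question rather than a dictionary of lists, the same loop can collect sets and the values can be pulled out at the end. This is only a minimal sketch, assuming headers are the all-uppercase lines; parse_sections and the inline sample text are names made up for illustration, and io.StringIO stands in for the real file:

```python
import io


def parse_sections(lines):
    """Group non-blank lines under the most recent all-uppercase header."""
    track = {}
    header = None
    for line in lines:
        line = line.strip()
        if line.isupper():
            header = line
            track[header] = set()
        elif line and header is not None:
            # Lines before the first header are skipped instead of raising KeyError.
            track[header].add(line)
    return track


# Hypothetical sample standing in for countries.txt.
sample = io.StringIO("COUNTRIES\nAmerica\nCanada\n\nPEOPLE\nGeorge\nJohn\n")
track = parse_sections(sample)
list_of_sets = list(track.values())  # dicts keep insertion order (Python 3.7+)
print(list_of_sets)
```

Keeping the dictionary around also lets you look a section up by name, for example track['PEOPLE'].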

CodePudding user response：

As Tim said, you aren't changing l, which is why the loop is infinite.
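
For reference, here is a minimal sketch of the question's readline loop with that fixed: the loop variable is reassigned on every pass, readline is actually called, and the newline is stripped before comparing. The io.StringIO sample is an assumption standing in for the file:

```python
import io

# Hypothetical sample standing in for countries.txt.
f = io.StringIO("COUNTRIES\nAmerica\nCanada\n\nPEOPLE\nGeorge\n")

countries = set()
line = f.readline()              # note the parentheses: call the method
if line.strip() == 'COUNTRIES':
    line = f.readline()
    while line:                  # readline() returns '' at end of file
        word = line.strip()      # drop the trailing newline before comparing
        if word == 'PEOPLE':
            break
        if word:                 # ignore blank lines
            countries.add(word)
        line = f.readline()      # reassign, so the loop condition can change

print(countries)
```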

Solution:

You can keep track of the current set and, whenever an uppercase word comes up, push that set onto your list and start a new one.

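list_of_sets = []

with open('countries.txt', mode="r") as f:
    current_set = set()
    for index, line in enumerate(f):
        line_strip = line.strip()

        # Skip blank lines.
        if not line_strip:
            continue

        # An all-uppercase line is a header: flush the set collected so far.
        if line_strip.isupper():
            if index:  # a header on the first line has nothing to flush yet
                list_of_sets.append(current_set)
                current_set = set()
            continue

        current_set.add(line_strip)

    # Append the set for the last section.
    list_of_sets.append(current_set)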

print(list_of_sets) # [{'Russia', 'America', 'Poland', 'Canada'}, {'Kenny', 'James', 'George', 'John'}]

words_to_split = ["COUNTRIES", "PEOPLE"]
list_of_sets = []

with open('countries.txt', mode="r") as f:
    current_set = set()
    for index, line in enumerate(f):
        line_strip = line.strip()
        if not line_strip:
            continue
        # A line matching one of the known header words starts a new set.
        if line_strip in words_to_split:
            if index:  # a header on the first line has nothing to flush yet
                list_of_sets.append(current_set)
                current_set = set()
            continue
        current_set.add(line_strip)
    list_of_sets.append(current_set)

print(list_of_sets) # [{'Russia', 'America', 'Poland', 'Canada'}, {'Kenny', 'James', 'George', 'John'}]

If your headers may also use lowercase characters, you can instead keep a list of the header words, as in the second snippet above.
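
A minimal sketch of that idea as a reusable function, assuming header matching should ignore case; split_by_headers and the sample text are hypothetical names for illustration, not part of the snippets above:

```python
import io


def split_by_headers(lines, headers):
    """Return one set per header found, in the order the headers appear."""
    wanted = {h.casefold() for h in headers}
    sections = []
    current = None                      # no section until the first header
    for line in lines:
        word = line.strip()
        if not word:
            continue                    # skip blank lines
        if word.casefold() in wanted:
            current = set()             # start a new section
            sections.append(current)
        elif current is not None:
            current.add(word)
    return sections


# Hypothetical sample with mixed-case headers.
sample = io.StringIO("Countries\nAmerica\nCanada\n\npeople\nGeorge\nJohn\n")
print(split_by_headers(sample, ["COUNTRIES", "PEOPLE"]))
```

Because each new set is appended as soon as its header is seen, there is no need for the index check or the final append after the loop.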