Is there a better way than this to split a string at commas and periods?

Time:09-27

I'm trying to split a randomly generated string of letters, commas, periods, and spaces at the commas and periods, but so far I've only figured out how to split it at the commas, using this code:

import ast
import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))

Example string: s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb

The end goal is to split it into a list such as ['s', 'wgzggarhz hbmk', 'q', 'af mnttxvixkcxwheysijneupvkcmmnar', 'mhvsflinmk', 'dvoxuce', 'vb', 'f', 'cfb']

I'm new to regular expressions, so I don't know if there's a better way to write this, but this is the error it returns:

Traceback (most recent call last):
  File "main.py", line 32, in <module>
    word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 59, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 47, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    'bazmhffkibauiaexggdoqrvxzkjhqzwammyizcybqba'.'qkmhwbvm' 'cdioyazkwbg' .'bdrsujlrkfxaen'
                                                  ^
SyntaxError: invalid syntax

I'm using Replit as my IDE.

CodePudding user response:

Wrapping words in quotes and then evaluating them again is overkill.
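
To see where the SyntaxError comes from, here is a small sketch that repeats the substitution on a shortened piece of the example string. It assumes the pattern in the question is (\w+), which is what the quoted words in the traceback suggest. The period between two quoted words survives the substitution, and that is what ast.literal_eval cannot parse.

```python
import ast
import re

line = "s,wgzggarhz hbmk.q"

# Assumption: the question's pattern is (\w+), so every word gets wrapped in quotes
quoted = re.subn(r'(\w+)', r"'\1'", line)[0]
print(quoted)  # 's','wgzggarhz' 'hbmk'.'q'

error_message = None
try:
    # 'hbmk'.'q' is not a valid Python literal, so parsing fails here
    ast.literal_eval(quoted)
except SyntaxError as exc:
    error_message = exc.msg
print("SyntaxError:", error_message)
```

Only the commas happen to work, because 'a','b' is a valid tuple literal.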

You could use re.split():

import re

with open('book.txt', 'r') as file_object:
    for line in file_object:
        # strip() drops the trailing newline before splitting at commas and periods
        word_list = re.split(r'\s*[,.]\s*', line.strip())
        print(word_list)
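
As a quick check against the example string from the question, here is a minimal sketch of the same pattern wrapped in a helper. The name split_on_punctuation is made up for illustration, and dropping empty pieces is an assumption about what you'd want when a line starts or ends with a delimiter or has two delimiters in a row.

```python
import re

def split_on_punctuation(text):
    """Split text at commas and periods (hypothetical helper, not from the question)."""
    # strip() removes the trailing newline that lines read from a file carry
    pieces = re.split(r'\s*[,.]\s*', text.strip())
    # Assumption: empty strings from leading, trailing, or doubled delimiters are unwanted
    return [piece for piece in pieces if piece]

sample = "s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb\n"
print(split_on_punctuation(sample))
# ['s', 'wgzggarhz hbmk', 'q', 'af mnttxvixkcxwheysijneupvkcmmnar', 'mhvsflinmk', 'dvoxuce', 'vb', 'f', 'cfb']
```

The \s* on both sides of [,.] also swallows any spaces next to a delimiter, while spaces inside a piece (as in 'wgzggarhz hbmk') are kept.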

CodePudding user response:

You could keep it simple: replace all periods with commas (or vice versa) and then use the .split() method to get the desired list of strings.

with open('book.txt', 'r') as file_object:
    for line in file_object:
        # Turn periods into commas, then split once; strip() drops the trailing newline
        word_list = line.strip().replace('.', ',').split(',')
        print(word_list)
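
One difference from the regex version: str.split(',') leaves any spaces next to a delimiter attached to the pieces. If that matters for your data, a rough sketch like the following tidies them up. The helper name split_simple is hypothetical, and trimming spaces and dropping empty pieces are assumptions about the desired result.

```python
def split_simple(text):
    """Split at commas and periods without regex (hypothetical helper)."""
    # Turn every period into a comma so a single split(',') handles both delimiters
    pieces = text.rstrip('\n').replace('.', ',').split(',')
    # Assumption: surrounding spaces and empty pieces are unwanted
    return [piece.strip() for piece in pieces if piece.strip()]

print(split_simple("vb, f .cfb\n"))  # ['vb', 'f', 'cfb']
```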