I'm trying to split a randomly generated string of letters, commas, periods, and spaces at both the commas and the periods, but so far I've only figured out how to split it at the commas, using this code:
import ast
import re
with open('book.txt', 'r') as file_object:
    for line in file_object:
        word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
Example string: s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar.mhvsflinmk,dvoxuce,vb,f.cfb
The end goal is to split it into a list such as ['s', 'wgzggarhz hbmk', 'q', 'af mnttxvixkcxwheysijneupvkcmmnar', 'mhvsflinmk', 'dvoxuce', 'vb', 'f', 'cfb']
I'm new to regular expressions, so I don't know whether there's a better way to do this. This is the error it returns:
Traceback (most recent call last):
  File "main.py", line 32, in <module>
    word_list = list(ast.literal_eval(re.subn(r'(\w+)', r"'\1'", file_object.readline())[0]))
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 59, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/nix/store/2vm88xw7513h9pyjyafw32cps51b0ia1-python3-3.8.12/lib/python3.8/ast.py", line 47, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 1
    'bazmhffkibauiaexggdoqrvxzkjhqzwammyizcybqba'.'qkmhwbvm' 'cdioyazkwbg' .'bdrsujlrkfxaen'
^
SyntaxError: invalid syntax
I'm using Replit as my IDE.
CodePudding user response:
Wrapping the words in quotes and then evaluating the result is overkill.
You could use re.split() with a character class that matches either delimiter:
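import re
with open('book.txt', 'r') as file_object:
    for line in file_object:
        # Split on a comma or a period, dropping any whitespace around it.
        # strip() keeps the trailing newline out of the last item.
        word_list = re.split(r'\s*[,.]\s*', line.strip())
        print(word_list)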
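To check this without needing book.txt, here is a minimal sketch that runs the same pattern on the example string from the question. The helper name split_on_commas_and_periods is made up for illustration, and dropping empty pieces is an assumption about what you want when a line ends in a delimiter.

```python
import re

# Example line copied from the question, with a trailing newline as read from a file.
sample_line = (
    "s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar."
    "mhvsflinmk,dvoxuce,vb,f.cfb\n"
)


def split_on_commas_and_periods(text):
    """Hypothetical helper: split text at every comma or period."""
    # [,.] matches one comma or one period; the \s* on each side
    # swallows any whitespace directly around the delimiter.
    parts = re.split(r'\s*[,.]\s*', text.strip())
    # Assumption: empty strings (e.g. from a trailing period) are unwanted.
    return [part for part in parts if part]


word_list = split_on_commas_and_periods(sample_line)
print(word_list)
```

Spaces inside a piece, as in 'wgzggarhz hbmk', are left alone because the pattern only consumes whitespace that touches a comma or period.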
CodePudding user response:
You can keep it simple: replace all periods with commas (or vice versa), then use the .split() method to get the desired list of strings.
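with open('book.txt', 'r') as file_object:
    for line in file_object:
        # Turn every period into a comma so one split(',') handles both.
        # rstrip('\n') keeps the newline out of the last item.
        word_list = line.rstrip('\n').replace('.', ',').split(',')
        print(word_list)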
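The same idea can be tried on the example string from the question without a file. This is only a sketch: the helper name split_by_replacing is hypothetical, and trimming each piece and skipping empty ones are assumptions, since plain str.split(',') keeps stray spaces and returns '' between adjacent delimiters.

```python
# Example line copied from the question, with a trailing newline as read from a file.
sample_text = (
    "s,wgzggarhz hbmk.q.af mnttxvixkcxwheysijneupvkcmmnar."
    "mhvsflinmk,dvoxuce,vb,f.cfb\n"
)


def split_by_replacing(text):
    """Hypothetical helper: unify the delimiters, then split once."""
    # Replace periods with commas so there is a single delimiter left.
    unified = text.strip().replace('.', ',')
    # Assumption: trim each piece and drop the empty ones.
    return [piece.strip() for piece in unified.split(',') if piece.strip()]


pieces = split_by_replacing(sample_text)
print(pieces)
```

This avoids regular expressions entirely, which may be easier to read if commas and periods are the only delimiters you will ever need.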