I need to extract all of the words that begin with "tbl" from my text file and either list them or write them to another text file.
I have tried different options but can't find a proper solution. My current attempt prints each matching line in full rather than just the words:
import re
filename = "test.txt"
pattern = re.compile(r"\btbl", re.IGNORECASE)
with open(filename, "rt") as myfile:
    for line in myfile:
        if pattern.search(line) is not None:
            print(line, end='')
CodePudding user response:
You could store them in a list for now and write them to a file later. Something like this would work; you don't need a regex for it:
filename = "test.txt"
words = []
with open(filename, "r") as myfile:
    for line in myfile:
        # split() with no argument splits on any run of whitespace,
        # so tabs, repeated spaces and the trailing newline are handled
        for word in line.split():
            if word.startswith("tbl"):
                words.append(word)
print(words)
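To write the collected words to another file, as the question asks, a minimal sketch might look like the following. The helper name `extract_tbl_words` and the output file name `tbl_words.txt` are made up for illustration, and the match is case-sensitive, as in the loop above.

```python
def extract_tbl_words(in_path, out_path, prefix="tbl"):
    """Collect whitespace-separated words starting with `prefix`
    and write them to `out_path`, one word per line."""
    words = []
    with open(in_path, "r") as src:
        for line in src:
            # Keep only the words on this line that start with the prefix
            words.extend(w for w in line.split() if w.startswith(prefix))
    with open(out_path, "w") as dst:
        for w in words:
            dst.write(w + "\n")
    return words
```

Calling `extract_tbl_words("test.txt", "tbl_words.txt")` would return the list and leave the same words in `tbl_words.txt`. To ignore case, compare `w.lower().startswith(prefix)` instead.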
CodePudding user response:
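Here is a regex that finds any whitespace-delimited word starting with "tbl":
(?<=\s)tbl[\S]*(?=\s)
How it works:
(?<=\s) : look behind for a whitespace character
tbl : literally "tbl"
[\S]* : zero or more non-whitespace characters
(?=\s) : look ahead for a whitespace character
Because both lookarounds require a whitespace character, a word at the very start or very end of the text is not matched; (?<!\S)tbl\S* drops that requirement.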
Of course, this depends on your expected input and desired output. Do words in the input end in punctuation? Do you want to ignore punctuation?
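If trailing punctuation should be left out, one option is to match word characters instead of relying on whitespace. This is only a sketch, assuming that a "word" is a run of letters, digits and underscores and that case should be ignored, as in the question's own pattern; the helper name `find_tbl_words` is hypothetical.

```python
import re

# \b anchors the match at the start of a word; \w* consumes the rest of it,
# stopping at punctuation or whitespace
TBL_WORD = re.compile(r"\btbl\w*", re.IGNORECASE)

def find_tbl_words(text):
    """Return every word in `text` that starts with "tbl", ignoring case."""
    return TBL_WORD.findall(text)

print(find_tbl_words("SELECT * FROM tblUsers, TblOrders; (tbl_a) xtbl_no"))
# prints ['tblUsers', 'TblOrders', 'tbl_a']
```

Here `xtbl_no` is skipped because "tbl" does not sit at a word boundary, and the comma, semicolon and parentheses are not part of any match.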