This is the code I have so far. It lower-cases each word in the input file, but I am unsure how to detect and remove special characters (everything except apostrophes) from the input file.
filename = input("Please enter a file name: ")
with open(filename, 'r') as input_file:
    for line in input_file:
        for word in line.split():
            # str.lower() returns a new string, so the result has to be used
            print(word.lower())
CodePudding user response:
It seems like you are just trying to read the input file and not overwrite it, so what I wrote up just prints the result out.
You can use Python's str.isalnum() method: https://www.w3schools.com/python/ref_string_isalnum.asp
According to the doc: "The isalnum() method returns True if all the characters are alphanumeric, meaning alphabet letter (a-z) and numbers (0-9)." Note that in Python 3 it also returns True for non-ASCII letters and digits, such as "é".
Assuming that meets your requirements, the following should work.
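alphanumeric = ""
with open(r"C:\Users\TestUser\Desktop\test.txt", 'r') as input_file:
    for line in input_file:
        for c in line:
            if c == "'":
                alphanumeric += "'"   # keep apostrophes
            elif c.isspace():
                alphanumeric += c     # keep spaces and line breaks
            elif c.isalnum():
                alphanumeric += c     # keep letters and digits
            # anything else is a special character and is dropped
print(alphanumeric.lower())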
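If you would rather have this as something reusable, here is a minimal sketch of the same isalnum() idea wrapped in a function that works on any string. The name clean_text is just made up for illustration, and it assumes you want to keep all whitespace as well as apostrophes.

```python
def clean_text(text: str) -> str:
    """Lower-case text and drop every character that is not
    alphanumeric, whitespace, or an apostrophe."""
    return "".join(
        c for c in text
        if c == "'" or c.isspace() or c.isalnum()
    ).lower()


print(clean_text("Don't STOP, believin'!"))  # don't stop believin'
```

You could call clean_text(line) on each line inside the file loop instead of walking the characters by hand.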
CodePudding user response:
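You can use re.sub() to remove everything except letters, digits, apostrophes, and whitespace (the ur'' prefix is not valid in Python 3, and re.sub() needs a replacement and the string to work on):
import re
cleaned = re.sub(r"[^a-zA-Z0-9'\s]", "", line)  # line is one line read from the file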
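To show how that could fit the word-by-word loop from the question, here is a rough sketch. SPECIAL_CHARS and clean_words are hypothetical names, and the pattern assumes only ASCII letters and digits count as "normal" characters, so accented letters would be stripped too.

```python
import re

# Assumption: anything that is NOT an ASCII letter, a digit,
# an apostrophe, or whitespace counts as a special character.
SPECIAL_CHARS = re.compile(r"[^a-zA-Z0-9'\s]")


def clean_words(line: str) -> list:
    """Strip special characters from a line, lower-case it,
    and return the remaining words."""
    return SPECIAL_CHARS.sub("", line).lower().split()


print(clean_words("Hello, World! It's 5 o'clock."))
# ['hello', 'world', "it's", '5', "o'clock"]
```

Compiling the pattern once keeps the loop over the file's lines cheap, but calling re.sub() directly on each line works just as well for small files.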