In Python, how do I parse each word in a text file and make each word lowercase and remove special c

Time:11-24

This is the code I have so far. It lower-cases each word in the input file but I am unsure how to check and remove special characters, except for apostrophes, from the input file.

    file_name = input("Please enter a file name: ")
    with open(file_name, 'r') as input_file:
        for line in input_file:
            for word in line.split():
                word = word.lower()

CodePudding user response:

It seems like you are just trying to read the input file and not overwrite it, so what I wrote up just prints the result out.

You can use Python's string isalnum() method: https://www.w3schools.com/python/ref_string_isalnum.asp

According to that page: "The isalnum() method returns True if all the characters are alphanumeric, meaning alphabet letter (a-z) and numbers (0-9)."

Assuming that meets your requirements, the following should work.

    alphanumeric = ""
    with open(r"C:\Users\TestUser\Desktop\test.txt", 'r') as input_file:
        for line in input_file:
            for c in line:
                # Keep apostrophes, whitespace, and letters/digits; drop everything else.
                if c == "'" or c.isspace() or c.isalnum():
                    alphanumeric += c
    print(alphanumeric.lower())
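
To see how isalnum() treats different characters, here is a small sketch. The helper name strip_specials is made up for illustration, and it assumes you want to keep all whitespace, not just spaces. Note that in Python 3, str.isalnum() also returns True for non-ASCII letters such as "é", so it is a bit broader than the a-z and 0-9 description quoted above.

```python
def strip_specials(text):
    """Keep alphanumeric characters, apostrophes and whitespace, then lowercase.

    Hypothetical helper, not part of any library.
    """
    kept = (c for c in text if c.isalnum() or c == "'" or c.isspace())
    return "".join(kept).lower()


print("abc123".isalnum())   # True
print("don't".isalnum())    # False: the apostrophe is not alphanumeric
print("café".isalnum())     # True: non-ASCII letters count as alphanumeric
print(strip_specials("Café #1: Don't panic!"))  # café 1 don't panic
```

If you need strictly ASCII letters and digits, a regular expression with an explicit character class is probably the simpler route.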

CodePudding user response:

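You can use re.sub to delete everything except letters, digits, apostrophes, and spaces (text here stands for the string you read from the file):

    import re
    cleaned = re.sub(r"[^a-zA-Z0-9' ]", "", text)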
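
As a rough sketch of how a substitution like this could be applied word by word, the snippet below lowercases each word and then strips anything that is not an ASCII letter, a digit, or an apostrophe. The function name clean_words and the exact character class are assumptions for illustration, not something from the question.

```python
import re

# Assumed character class: anything that is not a-z, 0-9 or an apostrophe.
# Words are lowercased first, so A-Z does not need to be listed.
_NOT_ALLOWED = re.compile(r"[^a-z0-9']")


def clean_words(text):
    """Return the lowercased words of text with special characters removed."""
    cleaned = []
    for word in text.split():
        word = _NOT_ALLOWED.sub("", word.lower())
        if word:  # skip tokens that consisted only of special characters
            cleaned.append(word)
    return cleaned


print(clean_words("Hello, World! Don't stop -- it's 2024."))
# ['hello', 'world', "don't", 'stop', "it's", '2024']
```

To run it on a file, you could call clean_words(line) for each line inside the with open(...) loop from the question.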