Trying to normalize text data by removing caps and quotation marks having issues

Time:05-06

I'm trying to remove all quotation marks and capital letters from my local outputtext.txt file.

The code is based on this tutorial: https://www.thepythoncode.com/article/text-generation-keras-python

sequence_length = 100
BATCH_SIZE = 128
EPOCHS = 30
text = open("outputtext.txt",'r',encoding="utf-8")
text = text.lower()
text = text.translate(str.maketrans("", "", punctuation))

However, when I run the code, both the lower call and the translate call raise errors:

lower: AttributeError: '_io.TextIOWrapper' object has no attribute 'lower'

translate: AttributeError: '_io.TextIOWrapper' object has no attribute 'translate'. Did you mean: 'truncate'?

I've tried screwing around with the read and write permissions, but that doesn't seem to work.

CodePudding user response:
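
Both errors have the same cause: open() returns a file object (_io.TextIOWrapper), not the text inside the file, and lower() and translate() are str methods. Permissions are not involved. A minimal sketch of the difference, using a temporary file with made-up sample text so it can run anywhere (the path and contents are illustrative, not the asker's actual file):

```python
import io
import os
import tempfile

# Create a small sample file so the example is self-contained.
# (Illustrative stand-in for outputtext.txt.)
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('"Sam" goes "To" the StOre.\n')
    path = tmp.name

with open(path, "r", encoding="utf-8") as f:
    # open() hands back a file object, which has no lower() or translate()
    is_file_object = isinstance(f, io.TextIOWrapper)
    has_lower = hasattr(f, "lower")
    # read() returns the contents as a str, which has both methods
    text = f.read()

os.remove(path)

print(is_file_object, has_lower, type(text).__name__)  # True False str
print(text.lower())
```

So the fix is to call read() (or readlines()) on the file object first and then apply the string methods to the result.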

File contents:

"Sam" goes "To" the StOre.

Read in file, edit file, print file contents

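with open('outputtext.txt', 'r', encoding='utf-8') as f:
    # read() returns the whole file as one str, so every line is processed
    text = f.read()

# lowercase everything and drop double quotes
text = text.lower().replace('"', '')

# output: sam goes to the store.
print(text)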

Read in file, edit file, write lines to new file

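with open('outputtext.txt', 'r', encoding='utf-8') as infile:
    text = infile.read()

# lowercase everything and drop double quotes
text = text.lower().replace('"', '')

# write() returns a character count, so its result is not kept
with open('new_outputtext.txt', 'w', encoding='utf-8') as outfile:
    outfile.write(text)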
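
If you would rather keep the translate approach from the question, the same idea applies: read the file into a string first, then normalize it. A rough sketch, assuming that punctuation in the question refers to string.punctuation; the helper names normalize and normalize_file are hypothetical, not from the tutorial:

```python
from string import punctuation


def normalize(text: str) -> str:
    """Lowercase text and delete every ASCII punctuation character."""
    # str.maketrans("", "", chars) builds a table that deletes chars
    return text.lower().translate(str.maketrans("", "", punctuation))


def normalize_file(path: str, encoding: str = "utf-8") -> str:
    """Read the whole file as a str, then normalize it."""
    with open(path, "r", encoding=encoding) as f:
        return normalize(f.read())


print(normalize('"Sam" goes "To" the StOre.'))  # sam goes to the store
```

Note that this removes all ASCII punctuation (periods, commas, apostrophes and so on), not only quotation marks, so use replace('"', '') instead if the quotes are the only thing you want gone.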