Python: list of words of a minimum length from a file, with no punctuation in the words

Time:04-30

The result should be a list of words longer than 9 characters. The words should be lowercase and contain no punctuation, and there may be only three lines of code in the body of the function. The problem with my code is that it still leaves punctuation in my words. I also tried checking for just a few characters, e.g. if ch not in ('-', '"', '!'), and using the pattern r'[.,"!-]'.

I also tried opening the file without using with, and it worked: I got the result I want. But with that approach I can't keep to the limit of only three lines of code inside the function body.

import string
min_length = 9
with open('my_file.txt') as file:

    content = ''.join([ch for ch in file if ch not in string.punctuation])
    result = [word.lower() for word in content.split() if len(word)>min_length]


print(result)
My output:
['distinctly', 'repeating,', 'entreating', 'entreating', 'hesitating', 'forgiveness', 'wondering,', 'whispered,', '"lenore!"-', 'countenance', '"nevermore."', 'sculptured', '"nevermore."', 'fluttered-', '"nevermore."', '"doubtless,"', 'unmerciful', 'melancholy', 'nevermore\'."', '"nevermore."', 'expressing', 'nevermore!', '"nevermore."', '"prophet!"', 'undaunted,', 'enchanted-', '"nevermore."', '"prophet!"', '"nevermore."', 'upstarting-', 'loneliness', 'unbroken!-', '"nevermore."', 'nevermore!']

As you can see, there are still words with punctuation.

CodePudding user response:

This works for me:

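from string import punctuation

with open('test.txt') as f:
    # Turn newlines into spaces so words on adjacent lines are not glued together
    data = f.read().replace('\n', ' ')

# Strip every punctuation character
for a in punctuation:
    data = data.replace(a, '')

# Keep the unique lowercase words longer than 9 characters
data = list(set([a.lower() for a in data.split() if len(a) > 9]))
print(data)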

Output:

With my test data this prints an empty list, because the data does not contain a single word longer than 9 letters.
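
As far as I can tell, the punctuation survives in the question's code because iterating over a file object yields whole lines, not single characters, so `ch not in string.punctuation` ends up testing an entire line against the punctuation string. Here is a small sketch of the difference, with `io.StringIO` standing in for the file (the sample text is made up for illustration):

```python
import io
import string

# Hypothetical sample text standing in for my_file.txt
sample = 'Quoth the Raven, "Nevermore."\nDeep into that darkness peering,\n'

# Iterating over the file object yields lines, so every line passes the test unchanged
by_line = ''.join(ch for ch in io.StringIO(sample) if ch not in string.punctuation)

# Iterating over file.read() yields single characters, so punctuation is dropped
by_char = ''.join(ch for ch in io.StringIO(sample).read() if ch not in string.punctuation)

print(by_line == sample)  # the text came through unfiltered
print(by_char)
```

So changing `for ch in file` to `for ch in file.read()` should be enough to make the original approach work.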

CodePudding user response:

I believe this could be an appropriate solution:


from string import punctuation

with open('files/text.txt') as f:

    print(set([a for a in f.read().lower().translate(str.maketrans('', '', punctuation)).split() if len(a) > 9]))


However, this is a crime against humanity in terms of readability, and I would strongly suggest relaxing the three-line requirement so that your code stays understandable in the long run.
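
If the three-line limit really has to stay, a somewhat more readable variant might look like the sketch below. The function name `long_words` and its parameters are my own choice, not something from the question, and I am not counting the docstring as one of the three body lines.

```python
import string


def long_words(path, min_length=9):
    """Return lowercase, punctuation-free words longer than min_length."""
    with open(path) as f:
        # Lowercase everything, then delete every ASCII punctuation character
        text = f.read().lower().translate(str.maketrans('', '', string.punctuation))
    # split() with no argument splits on any whitespace, including newlines
    return [word for word in text.split() if len(word) > min_length]
```

Calling `long_words('my_file.txt')` would then return the list. Note that `string.punctuation` only covers ASCII characters, so typographic quotes or dashes in the file would be left in, and a hyphenated word would be joined into a single word once the hyphen is removed.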
