Python: find all the words with fewer than 5 letters in a given text

My code doesn't work yet because I don't know what I should write. I need to print the number of words containing fewer than 5 letters. This is what I've tried so far:

text = "It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout."
words = text.split()
letterCount = {w: len(w) for w in words}

lessthan5 = '1,2,3,4'
count = 0
if words == lessthan5 :      #i put words to refer to line 3(letter count)
     result = count + 1
     print(result)

The output I need is an integer, e.g. 17. Please help, thank you so much.

CodePudding user response：

Hope this helps:

text = "It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout."
words = text.split()
count = 0
for word in words:
    if len(word) < 5:
        count = count + 1

print(count)
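
The loop above can also be written more compactly. This is only a sketch, assuming the same whitespace-based definition of a word as the loop; the function name count_short_words and the max_len parameter are made up for illustration:

```python
def count_short_words(text, max_len=4):
    """Count whitespace-separated words of at most max_len characters."""
    # A generator expression yields 1 for every short word; sum() adds them up.
    return sum(1 for word in text.split() if len(word) <= max_len)


text = ("It is a long established fact that a reader will be distracted "
        "by the readable content of a page when looking at its layout.")
print(count_short_words(text))  # 17
```

Note that punctuation stays attached to the words here, so "layout." is measured as 7 characters.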

CodePudding user response：

Here's one simple solution using a regular expression. The advantage of using a regular expression rather than just splitting the input at whitespace is that you can more precisely define what a "word" is. For example, you can exclude punctuation, so that "a full stop." counts as three words less than five characters long. (With str.split and a filter on string length, the last word would be "stop." instead of "stop", so it wouldn't show up, leaving only two words.)

>>> import re
>>> rg = re.compile(r"\b\w{1,4}\b")

The above regular expression means "starting at a word boundary, match from one to four word characters, followed by a word boundary".

>>> text = ("It is a long established fact that a reader"
...         " will be distracted by the readable content"
...         " of a page when looking at its layout.")
>>> rg.findall(text)
['It', 'is', 'a', 'long', 'fact', 'that', 'a', 'will', 'be', 'by', 'the', 'of', 'a', 'page', 'when', 'at', 'its']
>>> len(rg.findall(text))
17

rg.findall (the compiled-pattern counterpart of re.findall), as its name indicates, finds all non-overlapping matches of the pattern in a string and returns them as a list.
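
To make the difference between the two approaches concrete, here is a small side-by-side sketch using the "a full stop." example from this answer. The helper names count_by_split and count_by_regex are hypothetical, chosen only for this comparison:

```python
import re

# Same pattern as above: one to four word characters between word boundaries.
SHORT_WORD = re.compile(r"\b\w{1,4}\b")


def count_by_split(text):
    """Count words under 5 characters, splitting on whitespace only."""
    return sum(1 for word in text.split() if len(word) < 5)


def count_by_regex(text):
    """Count runs of 1 to 4 word characters, ignoring punctuation."""
    return len(SHORT_WORD.findall(text))


sample = "a full stop."
print(count_by_split(sample))  # 2, because "stop." is 5 characters long
print(count_by_regex(sample))  # 3, because the full stop is not a word character
```

One caveat with the regular expression: \w also matches digits and underscores, and an apostrophe counts as a word boundary, so "don't" is seen as the two short words "don" and "t". Whether that is acceptable depends on how you want to define a word.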