How to set a word limit for a paragraph's sentences in Python?

Time:09-08

I need to limit the number of words per item when appending to a list.

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'

I need to put only 5 words in each chunk and append each chunk to a list.

The output should be:

sent_list = ['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']

CodePudding user response:

Try this:

words = sent.split(' ')
sent_list = [' '.join(words[n:n+5]) for n in range(0, len(words), 5)]
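
The same slicing idea can be wrapped in a small helper. This is only a sketch: the name `chunk_words` and the `size` parameter are made up for illustration and do not come from the question. It calls `split()` with no argument, so runs of spaces, tabs, or newlines do not produce empty "words".

```python
def chunk_words(text, size=5):
    """Split text into strings of at most `size` whitespace-separated words."""
    if size < 1:
        raise ValueError("size must be at least 1")
    words = text.split()  # no argument: any run of whitespace is one separator
    # Step through the word list `size` items at a time and re-join each slice.
    return [' '.join(words[n:n + size]) for n in range(0, len(words), size)]


sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
sent_list = chunk_words(sent)
print(sent_list)
```

The last chunk is simply shorter when the word count is not a multiple of `size`.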

CodePudding user response:

A little unorthodox perhaps:

import re

sent_list = [re.sub(r'\s$', '', x.group('pattern')) for x in
     re.finditer(r'(?P<pattern>([^\s]+\s){5}|.+$)', sent)]

['Python is dynamically-typed and garbage-collected.',
 'It supports multiple programming paradigms,',
 'including structured (particularly procedural), object-oriented',
 'and functional programming.']

Explanation of '(?P<pattern>([^\s]+\s){5}|.+$)':

  • (?P<pattern> ... ): cosmetic, to create a named capture group.
  • ([^\s]+\s){5}: match one or more non-whitespace characters followed by a single whitespace character, repeated 5 times.
  • |.+$: when the first alternative no longer matches, take whatever remains through to the end of the string.

We use re.finditer to loop through all the match objects and grab each match with x.group('pattern'). Every match except the last has an extra whitespace character at the end; one way to get rid of it is re.sub.
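
If the trailing whitespace is a nuisance, one possible variant (my own assumption, not something the pattern above does) is to match up to four "word plus whitespace" pairs followed by one final word. Each match then already ends on a word, so no cleanup step is needed. The name `chunk_regex` is just illustrative.

```python
import re

# Up to 4 "word + whitespace" pairs, then one more word: at most 5 words per match.
# The group is non-capturing, so findall returns the full matched text.
chunk_regex = re.compile(r'(?:\S+\s+){0,4}\S+')

sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
sent_list = chunk_regex.findall(sent)
print(sent_list)
```

Unlike the split-and-join approach, this keeps whatever whitespace originally separated the words inside each chunk.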

CodePudding user response:

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'
sent_list = ['Python is dynamically-typed and garbage-collected.', 
            'It supports multiple programming paradigms,', 
            'including structured (particularly procedural), object-oriented', 
            'and functional programming.']

new_list = []
inner_string = ""
sentence_list = sent.split(" ")
for idx, item in enumerate(sentence_list):
    if (idx+1)==1 or (idx+1)%5 != 0:
        if (idx+1) == len(sentence_list):
            inner_string += item
            new_list.append(inner_string)
        else:
            inner_string += item + " "
    elif (idx+1)!=1 and (idx+1) % 5 == 0 :
        inner_string += item
        new_list.append(inner_string)
        inner_string = ""
        
print(new_list)
print(new_list == sent_list)

Output:

['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']
True
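
The index arithmetic in the loop can also be avoided by collecting words in a temporary list and flushing it whenever it reaches five items. A rough sketch of that idea (the names `buffer` and `limit` are mine, not part of the answer):

```python
sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')

limit = 5          # maximum number of words per chunk
new_list = []
buffer = []        # words collected for the chunk currently being built

for word in sent.split():
    buffer.append(word)
    if len(buffer) == limit:
        # Chunk is full: store it and start a new one.
        new_list.append(' '.join(buffer))
        buffer = []

# Store any leftover words that did not fill a complete chunk.
if buffer:
    new_list.append(' '.join(buffer))

print(new_list)
```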