How to find a paragraph number in text using Python?

Time:09-22

text =OUR elders are often heard reminiscing nostalgicallyabout those good old Portuguese days, the Portuguese and their famous loaves of bread. Those eaters of loaves might have vanished but the makers are still there.We still have amongst us the mixers, the moulders and those who bake the loaves.

Marriage gifts are meaningless without the sweet bread known as the bol, just as a party or a feast loses its charm without bread. Not enough can be said to show how important a baker can be for avillage. The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement. Cakes and bolin has are a must for Christmas as well as other festivals.

From this text I want to find the line "The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement." along with its line number and paragraph number.

For example: paragraph number = 2

Here is the code that I have tried:

     import re

     to_search="The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement."
     print(re.findall(r"(?:(?<!^\n)\n(?!^\n)|[^\n])*" + re.escape(to_search) + r"(?:(?<!^\n)\n(?!^\n)|[^\n])*", text, re.DOTALL|re.MULTILINE|re.IGNORECASE))

But this is not working. How can I find the paragraph number?

CodePudding user response:

How about this approach, assuming it's an exact match?

text = """OUR elders are often heard reminiscing nostalgicallyabout those good old Portuguese days, the Portuguese and their famous loaves of bread. Those eaters of loaves might have vanished but the makers are still there.We still have amongst us the mixers, the moulders and those who bake the loaves.

Marriage gifts are meaningless without the sweet bread known as the bol, just as a party or a feast loses its charm without bread. Not enough can be said to show how important a baker can be for avillage. The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement. Cakes and bolin has are a must for Christmas as well as other festivals."""

to_search = "The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement."

paragraphs = text.split("\n\n")

# enumerate(..., start=1) yields 1-based paragraph numbers directly.
for number, paragraph in enumerate(paragraphs, start=1):
    if to_search in paragraph:
        print(f"Text found in paragraph number #{number}")
        break
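
If the match is not guaranteed to be exact (the question's regex used re.IGNORECASE, for instance), one option is to normalize both the paragraph and the target before comparing. The following is only a sketch: normalize and find_paragraph are hypothetical helper names, the sample string is made up for illustration, and it assumes paragraphs are separated by at least one blank line.

```python
import re

def normalize(s):
    # Collapse every run of whitespace to a single space and ignore case.
    return re.sub(r"\s+", " ", s).strip().casefold()

def find_paragraph(text, target):
    """Return the 1-based number of the first paragraph containing target, or None."""
    needle = normalize(target)
    # A paragraph break is a newline, optional whitespace, then another newline.
    for number, paragraph in enumerate(re.split(r"\n\s*\n", text), start=1):
        if needle in normalize(paragraph):
            return number
    return None

# Made-up sample: odd spacing, a line break and different case in paragraph 2.
sample = "Bread is baked daily.\n\nThe  lady of the\nHOUSE must prepare sandwiches."
print(find_paragraph(sample, "the lady of the house"))  # 2
```

This trades strictness for tolerance, so it may report a paragraph that differs from the target in case or spacing.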

CodePudding user response:

Here is one approach. We can split the input text on two or more consecutive newlines to generate a list of all paragraphs. Then, use a list comprehension and check each paragraph for the target text.

text = """OUR elders are often heard reminiscing nostalgicallyabout those good old Portuguese days, the Portuguese and their famous loaves of bread. Those eaters of loaves might have vanished but the makers are still there.We still have amongst us the mixers, the moulders and those who bake the loaves.

Marriage gifts are meaningless without the sweet bread known as the bol, just as a party or a feast loses its charm without bread. Not enough can be said to show how important a baker can be for avillage. The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement. Cakes and bolin has are a must for Christmas as well as other festivals."""
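import re

# Split on two or more consecutive newlines to get the paragraphs.
paragraphs = re.split(r'\n{2,}', text)
search = 'The lady of the house must prepare sandwiches on the occasion of her daughter’s engagement.'
# Collect the 1-based number of every paragraph that contains the target.
indices = [ind + 1 for ind, x in enumerate(paragraphs) if re.search(re.escape(search), x)]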
print(indices)  # [2]
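
The question also asks for the line number. A possible sketch for getting both values at once is below. It assumes "line number" means the physical line in the whole text (counting blank lines), that paragraphs are separated by two or more newlines, and that the text does not start with blank lines. locate is a hypothetical helper name, and the sample string is made up for illustration.

```python
import re

def locate(text, target):
    """Return (paragraph_number, line_number), both 1-based, or None if absent."""
    pos = text.find(target)
    if pos == -1:
        return None
    before = text[:pos]
    # Each run of two or more newlines before the match ends one paragraph.
    paragraph_number = len(re.findall(r"\n{2,}", before)) + 1
    # Every newline before the match starts a new physical line.
    line_number = before.count("\n") + 1
    return paragraph_number, line_number

# Made-up sample: two lines in paragraph 1, a blank line, then paragraph 2.
sample = "First paragraph, line one.\nStill the first paragraph.\n\nSecond paragraph with the target sentence."
print(locate(sample, "target sentence"))  # (2, 4)
```

With the text from the question, where each paragraph is a single long line, this should give paragraph 2 and line 3, because the blank separator line is counted.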