I have a VBA Macro. In that, I have
.Find Text = 'Pollution'
.Replacement Text = '^p^pChemical'
Here, '^p^pChemical'
means Replace the Word Pollution with Chemical and create two empty paragraphs before the word sea.
Before:
After:
Have you noticed that The Word Pollution has been replaced With Chemical and two empty paragraphs preceds it ? This is how I want in Python.
My Code so far:
import docx
from docx import Document
document = Document('Example.docx')
for Paragraph in document.paragraphs:
if 'Pollution' in paragraph:
replace(Pollution, Chemical)
document.add_paragraph(before('Chemical'))
document.add_paragraph(before('Chemical'))
I want to open a word document to find the word, replace it with another word, and create two empty paragraphs before the replaced word.
CodePudding user response:
You can search through each paragraph to find the word of interest, and call insert_paragraph_before
to add the new elements:
def replace(doc, target, replacement):
for par in list(document.paragraphs):
text = par.text
while (index := text.find(target)) != -1:
par.insert_paragraph_before(text[:index].rstrip())
par.insert_paragraph_before('')
par.text = replacement text[index len(target)]
list(doc.paragraphs)
makes a copy of the list, so that the iteration is not thrown off when you insert elements.
Call this function as many times as you need to replace whatever words you have.
CodePudding user response:
This will take the text from the your document, replace the instances of the word pollution with chemical and add paragraphs in between, but it doesn't change the first document, instead it creates a copy. This is probably the safer route to go anyway.
import re
from docx import Document
ref = {"Pollution":"Chemicals", "Ocean":"Sea", "Speaker":"Magnet"}
def get_old_text():
doc1 = Document('demo.docx')
fullText = []
for para in doc1.paragraphs:
fullText.append(para.text)
text = '\n'.join(fullText)
return text
def create_new_document(ref, text):
doc2 = Document()
lines = text.split('\n')
for line in lines:
for k in ref:
if k.lower() in line.lower():
parts = re.split(f'{k}', line, flags=re.I)
doc2.add_paragraph(parts[0])
for part in parts[1:]:
doc2.add_paragraph('')
doc2.add_paragraph('')
doc2.add_paragraph(ref[k] " " part)
doc2.save('demo.docx')
text = get_old_text()
create_new_document(ref, text)
CodePudding user response:
You need to use \n
for new line. Using re
should work like so:
import re
before = "The term Pollution means the manifestation of any unsolicited foregin substance in something. When we talk about pollution on earth, we refer to the contamination that is happening of the natural resources by various pollutants"
pattern = re.compile("pollution", re.IGNORECASE)
after = pattern.sub("\n\nChemical", before)
print(after)
Which will output:
The term
Chemical means the manifestation of any unsolicited foregin substance in something. When we talk about
Chemical on earth, we refer to the contamination that is happening of the natural resources by various pollutants