Sort MS Word paragraphs alphabetically with Python

Time:09-22

How can I sort MS Word paragraphs alphabetically with python-docx?

I tried several things but can't get it working. Could something like the code below do the job?

from docx import Document

document = Document()
document.add_paragraph('B - paragraph two' )
document.add_paragraph('A - paragraph one' )

document.paragraphs.sort(key=lambda x: x.text)

document.save('sorted_paragraphs.docx')

Expected result in sorted_paragraphs.docx:

A - paragraph one
B - paragraph two

In other words: is there a way to do with Python what the sort feature in the MS Word GUI does?

The point is to change the position of the paragraphs in the document so that they are displayed in alphabetical order, based on each paragraph's first letter.

CodePudding user response:

Something like this should do the trick:

# --- range of paragraphs you want to sort, by paragraph index
# --- note that the last paragraph (18) is not included, consistent
# --- with Python "slice" notation.
start, end = 8, 18

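# --- create a sorted list of triples: the paragraph text (the basis
# --- for the sort), the paragraph's position in the range (a
# --- tie-breaker, so the elements themselves are never compared),
# --- and the paragraph `<w:p>` element, for each paragraph in range.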
text_element_triples = sorted(
    (paragraph.text, i, paragraph._p)
    for i, paragraph in enumerate(document.paragraphs[start:end])
)

# --- move each paragraph element into the sorted position, starting
# --- with the first one in the list
_, _, last_p = text_element_triples[0]

for _, _, p in text_element_triples[1:]:
    last_p.addnext(p)
    last_p = p