Error in printing statements based on the contents of a set in python

Time:04-28

I am currently working on some Python (version 3.10.4) code in PyCharm (Community Edition 2021.3.3) using the python-docx library (version 0.8.1.1) that checks whether the paper size used in a Word document is A4. The document contains three sections, created with two "Section Break (Next Page)" breaks.

The aim of the code (shown below) is to print specific statements based on the page width and page height of the sections in the document. The code only partially achieves this.

When no section of the document is A4, it should print "Whole document does not contain A4 size paper.". Instead, it prints "Section 1 does not contain A4 size paper.". The same kind of error occurs when sections 1 and 2 are not A4 (it prints "Section 1 does not contain A4 size paper."), when sections 1 and 3 are not A4 (it strangely prints "Whole document does not contain A4 size paper."), and when sections 2 and 3 are not A4 (it prints "Section 2 does not contain A4 size paper.").

I would like to know how to resolve these issues so that the code prints the correct statements. I believe the problem may lie in the if-elif block, but I have been largely unsuccessful in finding its source.

Any form of help would be appreciated. If there are any questions about the code, please ask.

from docx import Document

# access Word document file
WordFile = Document("Report_T3.docx")

# the number in the square brackets refer to the section
# page section starts at 0 as that is what indexing begins at in python
section1 = WordFile.sections[0]  # 1st section (Title page)
section2 = WordFile.sections[1]  # 2nd section (Preliminary pages)
section3 = WordFile.sections[2]  # 3rd section (Introduction - End of Doc)

section1_pgheight = set()  # store section 1 page height in the set section1_pgheight
section1_pgwidth = set()   # store section 1 page width in the set section1_pgwidth
section2_pgheight = set()  # store section 2 page height in the set section2_pgheight
section2_pgwidth = set()   # store section 2 page width in the set section2_pgwidth
section3_pgheight = set()  # store section 3 page height in the set section3_pgheight
section3_pgwidth = set()   # store section 3 page width in the set section3_pgwidth

# loop through all three sections in the document
for section in WordFile.sections:
    # add section 1 page height into the set section1_pgheight
    section1_pgheight.add(section1.page_height)
    # add section 1 page width into the set section1_pgwidth
    section1_pgwidth.add(section1.page_width)
    # add section 2 page height into the set section2_pgheight
    section2_pgheight.add(section2.page_height)
    # add section 2 page width into the set section2_pgwidth
    section2_pgwidth.add(section2.page_width)
    # add section 3 page height into the set section3_pgheight
    section3_pgheight.add(section3.page_height)
    # add section 3 page width into the set section3_pgwidth
    section3_pgwidth.add(section3.page_width)

# check if all pages in all sections are A4; for A4 height == 10692130, width == 7560310
if {10692130} == section1_pgheight == section2_pgheight == section3_pgheight and {7560310} == section1_pgwidth == section2_pgwidth == section3_pgwidth:
    print("Whole document contains A4 size paper.")  # works

# check if all pages in all sections are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section1_pgheight != section2_pgheight != section3_pgheight and {7560310} != section1_pgwidth != section2_pgwidth != section3_pgwidth:
    print("Whole document does not contain A4 size paper.")  # not working

# check if page in section 1 is not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section1_pgheight and {7560310} != section1_pgwidth:
    print("Section 1 does not contain A4 size paper.")  # works

# check if pages in section 2 are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section2_pgheight and {7560310} != section2_pgwidth:
    print("Section 2 does not contain A4 size paper.")  # works

# check if pages in section 3 are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section3_pgheight and {7560310} != section3_pgwidth:
    print("Section 3 does not contain A4 size paper.")  # works

# check if pages in sections 1 and 2 are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section1_pgheight != section2_pgheight and {7560310} != section1_pgwidth and {7560310} != section2_pgwidth:
    print("Sections 1 and 2 do not contain A4 size paper.")  # not working

# check if pages in sections 1 and 3 are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section1_pgheight and {10692130} != section3_pgheight and {7560310} != section1_pgwidth and {7560310} != section3_pgwidth:
    print("Sections 1 and 3 do not contain A4 size paper.")  # not working

# check if pages in sections 2 and 3 are not A4; for A4 height == 10692130, width == 7560310
elif {10692130} != section2_pgheight and {10692130} != section3_pgheight and {7560310} != section2_pgwidth and {7560310} != section3_pgwidth:
    print("Sections 2 and 3 do not contain A4 size paper.")  # not working

CodePudding user response:

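Note that a chained comparison such as A4 != A3 != A4 evaluates to True, because each value is only compared with its neighbour. So you are not checking that every section is not A4; you are checking that section 1 is not A4, that section 2 differs from section 1, that section 3 differs from section 2, and so on.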
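To make the chaining behaviour concrete, here is a minimal sketch that needs no Word file. The names `chained` and `explicit` and the value `OTHER` are hypothetical, chosen only for illustration; `OTHER` stands for any non-A4 page height.

```python
A4 = {10692130}     # A4 page height as a one-element set, as in the question
OTHER = {15120000}  # hypothetical non-A4 height, for illustration only


def chained(s1, s2, s3):
    # Python expands this to: A4 != s1 and s1 != s2 and s2 != s3
    return A4 != s1 != s2 != s3


def explicit(s1, s2, s3):
    # Each section is compared against A4 directly
    return s1 != A4 and s2 != A4 and s3 != A4


# All three sections share the same non-A4 size:
# the chained form fails because s1 == s2.
print(chained(OTHER, OTHER, OTHER), explicit(OTHER, OTHER, OTHER))

# Sections 1 and 3 are not A4 but section 2 is:
# the chained form passes because every neighbouring pair differs.
print(chained(OTHER, A4, OTHER), explicit(OTHER, A4, OTHER))
```

This appears to account for two of the symptoms in the question: a uniformly non-A4 document falls through to the "Section 1" branch, and the "sections 1 and 3" case is reported as the whole document.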

Instead, you would have to do:

elif section1_pgheight != {10692130} and section2_pgheight != {10692130} and section3_pgheight != {10692130} and section1_pgwidth != {7560310} and section2_pgwidth != {7560310} and section3_pgwidth != {7560310}:
    print("Whole document does not contain A4 size paper.")
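Fixing that one condition is probably not enough on its own. In an if/elif chain only the first true branch runs, and the question tests the single-section branches before the two-section ones. The sketch below reduces the question's chain to hypothetical boolean flags (`s1_bad` means "section 1 is not A4"); the function name `first_match` is also made up for this example.

```python
def first_match(s1_bad, s2_bad, s3_bad):
    # Same branch order as the question, with each set comparison
    # replaced by a hypothetical boolean flag.
    if not (s1_bad or s2_bad or s3_bad):
        return "whole document is A4"
    elif s1_bad and s2_bad and s3_bad:
        return "whole document is not A4"
    elif s1_bad:
        return "section 1"
    elif s2_bad:
        return "section 2"
    elif s3_bad:
        return "section 3"
    elif s1_bad and s2_bad:
        return "sections 1 and 2"  # unreachable: s1_bad was already caught above
    elif s1_bad and s3_bad:
        return "sections 1 and 3"  # unreachable for the same reason
    elif s2_bad and s3_bad:
        return "sections 2 and 3"  # unreachable: s2_bad was already caught above
    return None


print(first_match(True, True, False))  # reports only section 1
print(first_match(False, True, True))  # reports only section 2
```

Moving the two-section branches above the single-section ones would make them reachable, although counting the non-A4 sections, rather than enumerating every combination, avoids the ordering problem entirely.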

I do not see the point of using sets when you can just check for equality. Also, avoid repeating yourself; better code would be:

sections = []  # use a list to preserve order
# the original loop never used its loop variable, so replace it
for section in WordFile.sections:
    sections.append((section.page_height, section.page_width))  # add (height, width) tuple

def isA4(dimensions):
    # helper that removes a lot of the repetitive checking
    A4_WIDTH = 7560310  # avoid magic numbers
    A4_HEIGHT = 10692130
    height, width = dimensions  # unpack dimension tuple
    return height == A4_HEIGHT and width == A4_WIDTH

# get the numbers of the sections that are not A4, as strings
not_a4 = [str(i + 1) for i, section in enumerate(sections) if not isA4(section)]

if not not_a4:
    # every section passed the check
    print("Whole doc. is A4!")
elif len(not_a4) == len(sections):
    print("Whole doc. is not A4!")
elif len(not_a4) == 1:
    # list has one element, e.g. ['1']
    print(f"Section: {not_a4[0]} is not A4!")
else:
    print(f"Sections {', '.join(not_a4)} are not A4!")
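For testing the logic without a Word file, the same idea can be sketched as a pure function over (height, width) tuples. The names `is_a4` and `report` are hypothetical. The constants assume that Word stores A4 as 11906 x 16838 twips and that python-docx reports lengths in EMU at 635 EMU per twip, which reproduces the numbers used in the question.

```python
EMU_PER_TWIP = 635  # 914400 EMU per inch / 1440 twips per inch

# Assumption: A4 is stored as 11906 x 16838 twips
A4_WIDTH = 11906 * EMU_PER_TWIP   # 7560310
A4_HEIGHT = 16838 * EMU_PER_TWIP  # 10692130


def is_a4(height, width):
    """Return True if the given page size (in EMU) matches A4 exactly."""
    return height == A4_HEIGHT and width == A4_WIDTH


def report(dimensions):
    """Build the message for a list of (height, width) tuples, one per section."""
    not_a4 = [str(i) for i, (h, w) in enumerate(dimensions, start=1)
              if not is_a4(h, w)]
    if not not_a4:
        return "Whole document is A4."
    if len(not_a4) == len(dimensions):
        return "Whole document is not A4."
    if len(not_a4) == 1:
        return f"Section {not_a4[0]} is not A4."
    return f"Sections {', '.join(not_a4)} are not A4."


a4 = (A4_HEIGHT, A4_WIDTH)
other = (A4_HEIGHT + 1000, A4_WIDTH)  # hypothetical non-A4 size

print(report([a4, a4, a4]))
print(report([other, a4, other]))
print(report([other, other, other]))
```

With python-docx, the list would presumably be built as `[(s.page_height, s.page_width) for s in WordFile.sections]`. Note that `is_a4` treats a page as non-A4 when either dimension differs, whereas the question's conditions require both to differ.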