I have some PDF documents whose main content is vector graphics (drawn paths, not bitmap images), like the following.
IMPORTANT NOTE: These are the only types of operators in the PDF. It does not contain text, images, or other types of objects (I reviewed all the content using the PDFBox debugger).
q
0.75 0 0 -0.75 36.12 573.96 cm
0 0 0 rg
0 0 m
2.24 0 l
2.24 5.92 l
3.04 5.92 l
3.04 0 l
5.28 0 l
5.28 -0.8 l
0 -0.8 l
0 0 l
h
f
Q
q
0.75 0 0 -0.75 43.800003 572.04 cm
0 0 0 rg
0 0 m
0 -1.44 -0.96 -1.76 -1.76 -1.76 c
-2.56 -1.76 -3.04 -1.28 -3.2 -0.96 c
-3.2 -0.96 l
-3.2 -3.36 l
-4 -3.36 l
-4 3.36 l
-3.2 3.36 l
-3.2 0.64 l
-3.2 -0.64 -2.56 -0.96 -1.92 -0.96 c
-1.12 -0.96 -0.8 -0.64 -0.8 0.16 c
-0.8 3.36 l
0 3.36 l
0 0 l
h
f
Q
.
.
.
Each block that starts with "q" and ends with "Q" seems to be a small shape (a character, in the case of my document).
This is how it looks in Adobe Acrobat: [Screenshot taken from Adobe Acrobat]
I need to determine the bounding box values (X-Y coordinates, width, and height) of all these shapes together, as if they were just one object, like below: [Bounding box representation from Adobe Acrobat]
As mentioned above, I determined that each "character" is a block of operators between "q" and "Q" in the PDF content stream.
I wonder if we can get the dimensions of that big bounding box using Java and PDFBox, just as Adobe Acrobat does.
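To make the target values concrete, here is a minimal sketch of the arithmetic for the first "q" ... "Q" block, using only java.awt.geom. It assumes that the "cm" matrix shown in that block is the only transform in effect (no outer matrix, page rotation, or crop offset); the class and method names are made up for illustration and are not part of PDFBox.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.Locale;

class GlyphBoundsExample {

    /** Rebuilds the outline of the first q ... Q block (line segments only). */
    static Path2D firstGlyphPath() {
        Path2D path = new Path2D.Double();
        path.moveTo(0, 0);        // 0 0 m
        path.lineTo(2.24, 0);     // 2.24 0 l
        path.lineTo(2.24, 5.92);  // 2.24 5.92 l
        path.lineTo(3.04, 5.92);  // 3.04 5.92 l
        path.lineTo(3.04, 0);     // 3.04 0 l
        path.lineTo(5.28, 0);     // 5.28 0 l
        path.lineTo(5.28, -0.8);  // 5.28 -0.8 l
        path.lineTo(0, -0.8);     // 0 -0.8 l
        path.lineTo(0, 0);        // 0 0 l
        path.closePath();         // h
        return path;
    }

    /** Applies the block's "cm" matrix and returns the bounds in page units. */
    static Rectangle2D firstGlyphBounds() {
        // "0.75 0 0 -0.75 36.12 573.96 cm": the PDF order a b c d e f matches
        // the AffineTransform(m00, m10, m01, m11, m02, m12) constructor.
        // Assumption: no other transform is active when this block is drawn.
        AffineTransform cm = new AffineTransform(0.75, 0, 0, -0.75, 36.12, 573.96);
        return cm.createTransformedShape(firstGlyphPath()).getBounds2D();
    }

    public static void main(String[] args) {
        Rectangle2D b = firstGlyphBounds();
        System.out.printf(Locale.ROOT, "x=%.2f y=%.2f w=%.2f h=%.2f%n",
                b.getX(), b.getY(), b.getWidth(), b.getHeight());
    }
}
```

Under those assumptions the first shape covers roughly x = 36.12, y = 569.52, width = 3.96, height = 5.04 in page units. The big bounding box would then be the union of such rectangles over all blocks.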
Answer:
I followed the same approach that is posted in this question:
"pdfbox 2.0.2 > Calling of PageDrawer.processPage method caught exceptions"
That post suggests placing the logic in the strokePath() method, but in my case, as mentioned by @TilmanHausherr, I put my logic in fillPath(), because the paths in this document are filled with the "f" operator rather than stroked.
Be aware that the class you define should extend PDFGraphicsStreamEngine.
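The accumulation logic can be sketched roughly as below. This is a library-free stand-in, not actual PDFBox code: the hypothetical FilledPathBounds class only mirrors the names of the path callbacks (moveTo, lineTo, curveTo, closePath, fillPath), so that a PDFGraphicsStreamEngine subclass could delegate to it from its overrides. It assumes the coordinates it receives are already in page space, which, as far as I understand, is how PDFBox 2.x hands them to those callbacks. The real engine also declares other abstract methods (appendRectangle, drawImage, clip, and so on) that need at least empty bodies.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper: unions the bounds of every filled path into one box. */
class FilledPathBounds {
    private final Path2D current = new Path2D.Double(); // path being built
    private Rectangle2D total;                           // union so far, null until first fill

    void moveTo(double x, double y) { current.moveTo(x, y); }

    void lineTo(double x, double y) { current.lineTo(x, y); }

    void curveTo(double x1, double y1, double x2, double y2, double x3, double y3) {
        current.curveTo(x1, y1, x2, y2, x3, y3);
    }

    void closePath() { current.closePath(); }

    /** Call this where the engine calls fillPath(): fold the path into the total. */
    void fillPath() {
        if (current.getCurrentPoint() != null) {          // skip empty paths
            Rectangle2D b = current.getBounds2D();
            total = (total == null) ? b : total.createUnion(b);
        }
        current.reset();                                   // start the next path fresh
    }

    /** Returns the union of all filled paths, or null if nothing was filled. */
    Rectangle2D getTotal() { return total; }

    public static void main(String[] args) {
        FilledPathBounds bounds = new FilledPathBounds();
        // Two made-up shapes, given in already-transformed coordinates.
        bounds.moveTo(0, 0); bounds.lineTo(2, 0); bounds.lineTo(2, 3);
        bounds.lineTo(0, 3); bounds.closePath(); bounds.fillPath();
        bounds.moveTo(5, 1); bounds.lineTo(7, 1); bounds.lineTo(7, 8);
        bounds.closePath(); bounds.fillPath();
        System.out.println(bounds.getTotal());
    }
}
```

One caveat on curves: depending on the Java version, Path2D.getBounds2D() may include Bézier control points, so the box can come out slightly larger than the visible outline for shapes drawn with the "c" operator.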