I have very little experience manipulating PDFs with Python; so far I have only read them using the 'pdfreader' library. I have a PDF (in this case a past exam paper), and I want to split a page where a given question number appears, say 12 (it would be formatted as "12."), and save the part of the page containing question 12 as a new PDF. How do I do this?
I'm not a very experienced programmer, so sorry if this is a basic question, but I could not find how to do this by searching online.
CodePudding user response:
The solution in the end was to convert the PDF page into an image, crop it where I wanted, and then convert the result back to a PDF. To get the coordinates of the question number I used PDFMiner. To turn those coordinates into pixel positions for cropping, I set up a proportion between the height of the page in PDF coordinates and the height in pixels of the image I wanted to create, which let me convert coordinates from one system into the other.
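The proportion described above can be sketched roughly as follows. This is only an illustration, not the exact code I used: the function names and the sample numbers are made up, and it assumes the text coordinates come from PDFMiner's layout objects, which measure y from the bottom of the page, while image rows are counted from the top, so the y value has to be flipped before scaling.

```python
def pdf_y_to_pixel_row(y_pdf, page_height_pt, image_height_px):
    """Map a PDF y coordinate (points, origin at bottom-left) to an
    image row (pixels, origin at top-left)."""
    # The proportion: pixels per PDF point along the vertical axis.
    scale = image_height_px / page_height_pt
    # Flip the axis first, because PDF y grows upward and image rows grow downward.
    return round((page_height_pt - y_pdf) * scale)


def crop_box_for_band(y_top_pdf, y_bottom_pdf, page_height_pt,
                      image_width_px, image_height_px):
    """Return a (left, upper, right, lower) pixel box covering the full page
    width between two PDF y coordinates (hypothetical helper)."""
    upper = pdf_y_to_pixel_row(y_top_pdf, page_height_pt, image_height_px)
    lower = pdf_y_to_pixel_row(y_bottom_pdf, page_height_pt, image_height_px)
    return (0, upper, image_width_px, lower)


# Example with assumed values: a page 842 points tall rendered as a
# 1190 x 1684 pixel image, where question "12." starts at y = 700 and
# the next question starts at y = 500 (both in PDF points).
box = crop_box_for_band(700, 500, 842, 1190, 1684)
print(box)
```

The tuple is in the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it should be usable directly if the page was rendered to a Pillow image. The top and bottom y values would come from the bounding boxes of the "12." text and of the next question number.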