PyPDF2 decoding issue when adding annotations in Chinese characters with addJS

Time:09-17

I want to use PyPDF2 to add annotations programmatically through addJS. It works well for Latin characters but not for Chinese characters. I tried encoding the text as UTF-8, but that does not seem to work either. Here is the code:

from PyPDF2 import PdfFileWriter, PdfFileReader

def test():
    # PdfFileReader accepts a path; its second positional argument is
    # `strict`, not a file mode, so "rb" does not belong here.
    inputPDF = PdfFileReader('./demo/TESTPDFANNOTATION.pdf')
    outputPDF = PdfFileWriter()

    # Copy every page of the input document into the writer
    pages = inputPDF.getNumPages()
    for p in range(pages):
        outputPDF.addPage(inputPDF.getPage(p))

    # JavaScript that runs when the PDF is opened and adds the annotation
    outputPDF.addJS("var annot = this.addAnnot({ \r \
                    page: 0, \r \
                    type: 'FreeText', \r \
                    contents: '你好', \r \
                    textFont: 'csongl', \r \
                    textSize: 10, \r \
                    rect: [200, 300, 200 + 150, 300 + 3*12], // height for three lines \r \
                    width: 1, \r \
                    alignment: 1 \r \
                    });")

    # Write the result; the with-block closes the file even on errors
    with open('./demo/TESTPDFANNOTATIONOUT.pdf', "wb") as outputStream:
        outputPDF.write(outputStream)
    return "ok"

Oddly, when I open the output PDF in the Notepad text editor, the Chinese characters are displayed correctly, but when I open it in a PDF viewer the annotation shows something like 佀好. The text does not seem to have been decoded: an online conversion tool can decode it into almost the right Chinese characters, although not exactly the same ones in some cases. https://cafewebmaster.com/online_tools/utf_decode
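A quick way to check whether garbled text is simply UTF-8 bytes read with the wrong codec is to reproduce the mistake and then reverse it. The sketch below is only an illustration: the helper names `misread_utf8` and `repair` are made up for this example, and `latin-1` is an assumed stand-in for whatever single-byte encoding the PDF viewer really applies, which would also explain a round trip that is only "almost" right.

```python
def misread_utf8(text: str, wrong_codec: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec.

    This simulates a reader that does not know the bytes are UTF-8.
    """
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "latin-1") -> str:
    """Undo misread_utf8: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


original = "你好"
garbled = misread_utf8(original)

# Each Chinese character occupies three UTF-8 bytes, so the misread
# string has six characters instead of two.
print(len(original), len(garbled))

# Reversing the wrong decoding step restores the original text.
print(repair(garbled) == original)
```

If the text stored in the PDF survives this kind of round trip, the bytes themselves are intact and the problem lies in how they are interpreted rather than in how they were written.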

Any advice would be highly appreciated!

Python version: 3.9 OS: Win10

Thanks, Stanley

CodePudding user response:

I finally solved this by switching to another package, PyMuPDF, which can add annotations programmatically and handles Chinese characters well.

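import fitz  # PyMuPDF

def writeAnnotation():
    # Colors are RGB tuples with components in the range 0..1
    blue = (0, 0, 1)
    gold = (1, 1, 0)

    pdfDoc = fitz.open('./demo/TESTPDFANNOTATION.pdf')
    page = pdfDoc[0]  # first page

    # Rectangle that will hold the annotation: (x0, y0, x1, y1)
    rect1 = fitz.Rect(100, 100, 200, 150)

    strContent1 = "你好!世界"

    # Newer PyMuPDF releases spell this method add_freetext_annot
    page.addFreetextAnnot(rect1, strContent1, text_color=blue, fill_color=gold)

    pdfDoc.save("./demo/TESTPDFANNOTATIONOUT.pdf")
    return "Well done!"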
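For anyone who has to stay with PyPDF2 and addJS, one possible workaround (not tested against PyPDF2 here) is to keep the JavaScript source pure ASCII by writing every non-ASCII character as a JavaScript `\uXXXX` escape, so that no encoding guesswork is needed when the script is stored in the PDF. The helper name `js_escape_non_ascii` is hypothetical and the annotation script is shortened for illustration.

```python
def js_escape_non_ascii(text: str) -> str:
    """Return text with every non-ASCII character written as \\uXXXX.

    Characters outside the Basic Multilingual Plane are emitted as a
    UTF-16 surrogate pair, which is how JavaScript strings represent them.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif code <= 0xFFFF:
            out.append(f"\\u{code:04x}")
        else:
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            out.append(f"\\u{high:04x}\\u{low:04x}")
    return "".join(out)


contents = js_escape_non_ascii("你好")

# Shortened script for illustration; the resulting string is ASCII only.
script = (
    "var annot = this.addAnnot({page: 0, type: 'FreeText', "
    "contents: '" + contents + "'});"
)
print(script)
```

The resulting `script` string could then be passed to `outputPDF.addJS(script)` in place of the literal Chinese text. Whether the viewer renders it correctly also depends on the chosen font being available, so this is a sketch to experiment with rather than a confirmed fix.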