How to Convert PDF file into CSV file using Python Pandas

Time:11-26

I have a PDF file that I need to convert into a CSV file. This is my PDF file: [link] [image]

As a side issue, the first AM in the file appears, at first glance, to be encoded as this block:

BT
/F1 12 Tf
1 0 0 1 224.20265 754.6322 Tm
[<001D001E>] TJ
ET

where, in that part of the file, the glyph code 1D = A and 1E = M.
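To make that mapping concrete, here is a minimal sketch of decoding the hex string from the TJ operand above. The `GLYPH_MAP` table and the `decode_tj_hex` helper are hypothetical names for illustration; the table holds only the two codes identified above, and a real file would need the font's full mapping.

```python
import re

# Hypothetical partial glyph map for font /F1, built only from the two
# codes identified above (1D = A, 1E = M). Not the font's full table.
GLYPH_MAP = {0x001D: "A", 0x001E: "M"}

def decode_tj_hex(operand, glyph_map):
    """Decode the <...> hex strings in a TJ operand as 2-byte glyph codes."""
    chars = []
    for hex_string in re.findall(r"<([0-9A-Fa-f]+)>", operand):
        # Each glyph code is four hex digits (two bytes).
        for i in range(0, len(hex_string), 4):
            code = int(hex_string[i:i + 4], 16)
            # Codes missing from the map are shown as "?".
            chars.append(glyph_map.get(code, "?"))
    return "".join(chars)

print(decode_tj_hex("[<001D001E>] TJ", GLYPH_MAP))  # AM
```

In practice a text extractor does this lookup for you, so this is only to show why the raw stream does not contain the letters AM directly.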

So if you wish to extract each LINE as it is displayed, by far the simplest way is to use a tool or library such as pdftotext, which specifically outputs each row of text as seen on the page.
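A rough sketch of that line-by-line extraction, assuming the pdftotext command-line tool (from Poppler or Xpdf) is installed and on the PATH. The function names `clean_lines` and `pdf_to_lines` are made up for this example, and the file path is whatever your PDF is called.

```python
import subprocess

def clean_lines(raw_text):
    """Split extracted text into lines, dropping page breaks and blank lines."""
    lines = []
    # pdftotext separates pages with a form feed; treat it as a line break.
    for line in raw_text.replace("\f", "\n").splitlines():
        if line.strip():
            # Keep leading spaces, since they carry the column position.
            lines.append(line.rstrip())
    return lines

def pdf_to_lines(pdf_path):
    """Run pdftotext in layout mode and return the page text line by line."""
    # "-layout" keeps the physical layout; "-" sends the text to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return clean_lines(result.stdout)
```

Calling `pdf_to_lines("your_file.pdf")` should then give one string per visible row, with the spacing between columns preserved.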

Thus, using an approach such as tabular comma-separated output, you can expect each AM to be given its own row. By logic that row should be " ",AM," "," ", but some extractors will report it as nan,AM,nan,nan.
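As an illustration of the two row styles, here is a small sketch that slices one layout line into four columns and writes it as CSV. The column boundaries in `COLUMNS` and the sample line are made up, since the real positions depend on the actual PDF; the helper names are also only for this example.

```python
import csv
import io

# Hypothetical column boundaries (start, end) in characters. The real
# ones would be read off the layout text of the actual PDF.
COLUMNS = [(0, 12), (12, 18), (18, 30), (30, 42)]

def line_to_fields(line, columns=COLUMNS):
    """Cut one fixed-width text line into stripped column values."""
    return [line[start:end].strip() for start, end in columns]

def rows_to_csv(lines, empty=""):
    """Write the lines as CSV text, filling empty cells with `empty`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        writer.writerow([field or empty for field in line_to_fields(line)])
    return buffer.getvalue()

sample = [" " * 12 + "AM"]  # made-up line: AM sits in the second column

print(rows_to_csv(sample), end="")               # ,AM,,
print(rows_to_csv(sample, empty="nan"), end="")  # nan,AM,nan,nan
```

The first form keeps the empty cells empty; the second mimics extractors that mark missing values explicitly. pandas, for instance, shows empty CSV cells as NaN by default when reading a file.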

As text it looks like this: [image]

Then, placed in a spreadsheet, it becomes: [image]
