I have a PDF file called Question.pdf, and its content is as follows.
I am converting my PDF file to an xlsx file using the Python tabula module. However, it writes all the data into the first column of my Excel file. How can I delete this field (the part indicated in the red area)?
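import tabula
# read_pdf returns a list of DataFrames, one per detected table; [1] selects the second one
df = tabula.read_pdf('Question.pdf', pages=1, lattice=True)[1]
# Replace carriage returns in multi-line column names with spaces
df.columns = df.columns.str.replace('\r', ' ')
# Drop rows that contain any missing values
data = df.dropna()
data.to_excel('data.xlsx', index=False)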
CodePudding user response:
Try this when exporting:
data.to_excel('data.xlsx', index=False, header=False)
Passing header=False leaves the column-name row out of the sheet.
Hope this helps.
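To see what the header argument changes without needing the PDF, here is a minimal sketch. The DataFrame is a made-up stand-in for whatever tabula returns (its column names and values are assumptions, not the real contents of Question.pdf), and to_csv is used only so the result can be inspected as text; to_excel accepts the same index and header arguments.

```python
import io

import pandas as pd

# Hypothetical stand-in for the table extracted from the PDF.
df = pd.DataFrame({
    "Name\rSurname": ["Ann", None, "Bob"],  # header with a carriage return, as in the question
    "Score": [90, 75, 60],
})
df.columns = df.columns.str.replace("\r", " ")
data = df.dropna()  # drops the row whose name is missing


def to_text(frame, **kwargs):
    """Serialize a frame to CSV text so the effect of the options is visible."""
    buf = io.StringIO()
    frame.to_csv(buf, index=False, **kwargs)
    return buf.getvalue()


with_header = to_text(data)                   # first line holds the column names
without_header = to_text(data, header=False)  # data rows only

print(with_header)
print(without_header)
```

If the red area turns out to be an entire column rather than the header row, something like data.iloc[:, 1:] would drop the first column before exporting instead.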