What am I doing wrong? Here is the code that I attempted:
import glob
import tabula
for filepath in glob.iglob('C:/Users/username/Downloads/folder with space/myfolderwithpdfs/*.pdf'):
    tabula.convert_into(filepath, pages="all", output_format='csv')
Error:
TypeError Traceback (most recent call last)
Input In [11], in <cell line: 6>()
      5 # transform the pdfs into excel files
      6 for filepath in glob.iglob('C:/Users/username/Downloads/folder with space/myfolderwithpdfs/*.pdf'):
----> 7     tabula.convert_into(filepath, pages="all", output_format='csv')
TypeError: convert_into() missing 1 required positional argument: 'output_path'
CodePudding user response:
This reads every PDF file in your Downloads folder and converts each one to a CSV file. The output_path argument is required, so the loop builds one output file name per PDF.
import os
import glob
import tabula
path="/Users/username/Downloads/"
for filepath in glob.glob(path + '*.pdf'):
    # file name without the folder, e.g. "report.pdf"
    name = os.path.basename(filepath)
    tabula.convert_into(input_path=filepath,
                        output_path=path + name + ".csv",
                        pages="all")
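One thing to watch with the loop above: because the output name is built from the full file name, report.pdf ends up as report.pdf.csv. Here is a minimal sketch of one way to swap the extension instead, using pathlib. The names csv_path_for and convert_folder are hypothetical helpers, not part of tabula-py, and the tabula import is kept inside the function on the assumption that tabula-py and Java are installed where you run it.

```python
from pathlib import Path


def csv_path_for(pdf_path, out_dir=None):
    """Return the CSV path for a PDF: same stem, '.csv' suffix.

    out_dir is a hypothetical optional folder; by default the CSV
    is placed next to the PDF.
    """
    pdf_path = Path(pdf_path)
    target_dir = Path(out_dir) if out_dir is not None else pdf_path.parent
    return target_dir / (pdf_path.stem + ".csv")


def convert_folder(pdf_dir, out_dir=None):
    """Convert every PDF in pdf_dir to CSV, one output file per PDF."""
    # Imported here so csv_path_for can be used without tabula-py installed.
    import tabula

    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        out = csv_path_for(pdf, out_dir)
        # Make sure the target folder exists before writing into it.
        out.parent.mkdir(parents=True, exist_ok=True)
        # output_path is the required argument named in the TypeError.
        tabula.convert_into(str(pdf), str(out), output_format="csv", pages="all")
```

Calling convert_folder("/Users/username/Downloads/") would write each CSV next to its PDF; pass a second argument to collect the CSVs in a different folder.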
CodePudding user response:
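It appears you have not defined the output_path for your converted PDF. Note that output_path should point to a file, not a folder, so build one CSV path per PDF:
import glob
import os
import tabula
out_dir = "C:/Users/username/Downloads/new Folder with CSvs"
for filepath in glob.iglob('C:/Users/username/Downloads/folder with space/myfolderwithpdfs/*.pdf'):
    # one CSV per PDF, named after the source file
    out_file = os.path.join(out_dir, os.path.basename(filepath) + ".csv")
    tabula.convert_into(filepath, out_file, output_format='csv', pages="all")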