Pandas returns empty dataframe when loading multiple .csv files

Time:10-21

I'm absolutely confused as to why this is not working as I don't get an error message.

I have been trying to load multiple .csv files from a folder (all have the same format as they have been collected from the PubMed API) into a single dataframe.

This is my code:

import glob
import pandas as pd

path = "~/Desktop/PubMed/Corpus"

files = glob.glob(path + "/*.csv")
dfs = [pd.read_csv(f, header=None) for f in files]
print(dfs)

This returns just this:

[]

Afterwards, I tested if a single file could be loaded and it can be!

Any help is appreciated to solve this.

CodePudding user response:

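Not sure what might be wrong with glob, but I'd suggest using os.listdir as a workaround and then checking in your list comprehension whether f has the right extension. Note that os.listdir does not expand ~ either, and it returns bare file names, so each name has to be joined back onto the folder before reading:

import os
folder = os.path.expanduser(path)  # os.listdir does not expand "~" on its own
files = os.listdir(folder)  # bare file names, not full paths
dfs = [pd.read_csv(os.path.join(folder, f), header=None) for f in files if f.split('.')[-1] == 'csv']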

CodePudding user response:

latest mac version

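I am not sure, as I have not tested this, but I suspect the problem is caused by using ~ in the path. As the os.path documentation says:

Unlike a Unix shell, Python does not do any automatic path expansions.

So glob looks for a directory literally named ~, matches nothing, and returns an empty list without raising an error.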

Please try using os.path.expanduser, which should take care of it, i.e. replace

path = "~/Desktop/PubMed/Corpus"

with

import os
path = os.path.expanduser("~/Desktop/PubMed/Corpus")
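
To check this diagnosis before touching pandas, a minimal sketch along these lines may help. It assumes the folder from the question; find_csv_files is a hypothetical helper name chosen for illustration and is not part of glob or pandas. It only uses the standard library, so it shows the path handling in isolation.

```python
import glob
import os


def find_csv_files(folder):
    """Return the sorted full paths of all .csv files in `folder`.

    Hypothetical helper for illustration; not part of glob or pandas.
    """
    # Expand a leading "~" to the user's home directory; glob does not do this.
    expanded = os.path.expanduser(folder)
    # glob returns an empty list (no exception) when nothing matches.
    return sorted(glob.glob(os.path.join(expanded, "*.csv")))


# Without expansion, "~" is treated as a literal directory name.
unexpanded = glob.glob("~/Desktop/PubMed/Corpus/*.csv")
expanded = find_csv_files("~/Desktop/PubMed/Corpus")
print(len(unexpanded), len(expanded))
```

If the second count is non-zero, the returned paths can be passed to pd.read_csv in the same list comprehension as in the question, and pd.concat(dfs, ignore_index=True) should then combine the results into a single dataframe. pd.concat raises a ValueError on an empty list, so it is worth checking that the list of files is not empty first.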