Filter file extension in Pandas

Time:11-22

I want to filter the file names in a folder by a specific extension (.xlsx) using Pandas, so that the rest of the script is applied only to the files with that extension.

This is the code I wrote to filter those files:

import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)
df_file = pd.DataFrame(files, columns=['Filename'])
df_file

for files in df_file['Filename']:
    if ".xlsx" in files:
        t1_list = df_file["Filename"].str.split(' ')
        print(t1_list)

Basically, it reads the filenames and stores them in a DataFrame (df_file). Then I try to filter by ".xlsx" and store the result in a list of lists (t1_list).

But the output that I get is this one:

[Screenshot of the output; image not available]

As you can see, it's not filtering anything. What am I doing wrong?

Thanks

CodePudding user response:

This sounds like a task for glob.glob. In this case, you might replace

path = os.getcwd()
files = os.listdir(path)

with

import glob
files = glob.glob("*.xlsx")
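
As for why the original loop filters nothing: inside the if, the split is applied to the whole column (df_file["Filename"]) rather than to the current file name, so every name is printed each time a match is found. If you would rather keep the names in the DataFrame, a boolean mask may be enough. This is only a minimal sketch; the file names below are made up to stand in for the result of os.listdir(path).

```python
import pandas as pd

# Hypothetical file names standing in for os.listdir(path)
files = ["report.xlsx", "notes.txt", "data.xlsx", "backup.xlsx.bak"]
df_file = pd.DataFrame(files, columns=["Filename"])

# Boolean mask: True only where the name ends with ".xlsx"
mask = df_file["Filename"].str.endswith(".xlsx")

# Keep only the matching rows and turn them into a plain list
xlsx_files = df_file.loc[mask, "Filename"].tolist()
print(xlsx_files)
```

Using str.endswith instead of checking whether ".xlsx" appears anywhere in the name avoids matching names such as backup.xlsx.bak.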