I have a folder of files and want to keep only the file names with a specific extension (.xlsx), using Pandas. The rest of the script should run only on the files with that extension.
This is the code I wrote to filter those files:
import os
import pandas as pd

path = os.getcwd()
files = os.listdir(path)
df_file = pd.DataFrame(files, columns=['Filename'])
df_file
for files in df_file['Filename']:
    if ".xlsx" in files:
        t1_list = df_file["Filename"].str.split(' ')
print(t1_list)
Basically, it reads the file names and stores them in a dataframe (df_file). Then I try to filter by ".xlsx" and store the result in a list of lists (t1_list).
But the output that I get is this one:
As you can see, it's not filtering anything. What am I doing wrong?
Thanks
CodePudding user response:
This sounds like a task for glob.glob. In this case, you might replace
path = os.getcwd()
files = os.listdir(path)
with
import glob
files = glob.glob("*.xlsx")
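As for why the original loop does not filter: the `if` only decides whether to run the split, and the split is applied to the whole `Filename` column every time, so `t1_list` always contains every file name. Below is a minimal sketch of two ways to get only the .xlsx names into a dataframe. The helper names `list_xlsx` and `filter_xlsx` are made up for illustration, and matching the extension case-insensitively in `filter_xlsx` is my assumption, not something the question asks for.

```python
import glob
import os

import pandas as pd


def list_xlsx(folder="."):
    """Return a DataFrame of the .xlsx file names found in `folder`."""
    # With a folder prefix, glob returns paths, so keep only the base names.
    paths = glob.glob(os.path.join(folder, "*.xlsx"))
    names = sorted(os.path.basename(p) for p in paths)
    return pd.DataFrame(names, columns=["Filename"])


def filter_xlsx(df_file):
    """Keep only the rows whose Filename ends with .xlsx."""
    # Build a boolean mask over the column, then select with it.
    # endswith avoids matching names such as "report.xlsx.bak".
    mask = df_file["Filename"].str.lower().str.endswith(".xlsx")
    return df_file[mask].reset_index(drop=True)
```

`list_xlsx()` replaces the `os.listdir` step entirely, while `filter_xlsx(df_file)` is the option if you want to keep the dataframe you already built. Either result can be looped over with `for name in result["Filename"]:` to run the rest of the script.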