Why does my for-loop not work properly in Python?

Time:04-13

New to Python, hope y'all can help.

I have a script with a for loop that goes through a list of files from a directory. If a file ends with 'xlsx' it should run one block of code; if it ends with 'csv' it should run another. I had to create the variable extension to extract the file's extension, because calling endswith on the file itself raised the error 'WindowsPath' object has no attribute 'endswith'.


for file in file_list:
    # extract the file extension (without the leading dot)
    extension = os.path.splitext(file)[1][1:]
    print(file)

    if extension.endswith('xlsx'):
        all_df_list.append(pd.read_excel(file, header=0, engine='openpyxl'))
        a = round(time.time() - start_time, 2)
        print(a)
    elif extension.endswith('csv'):
        all_df_list.append(pd.read_csv(file, chunksize=10000))
        a = round(time.time() - start_time, 2)
        print(a)
    else:
        print("Not working")

When I run the whole script, the loop executes the right branch for the first file, but for every file after that it falls through to the else statement. It was supposed to do the same for each file. These are the results (I put XXX to hide sensitive info):

Z:\DataFiles\XXX\XXX\XXX\XXX\XXX\XXX_XXX_0000.xlsx
2.43
Z:\DataFiles\XXX\XXX\XXX\XXX\XXX\XXX_XXX_2019.XLSX
Not working
Z:\DataFiles\XXX\XXX\XXX\XXX\XXX\XXX_XXX_2020.XLSX
Not working
Z:\DataFiles\XXX\XXX\XXX\XXX\XXX\XXX_XXX_2021.XLSX
Not working
Z:\DataFiles\XXX\XXX\XXX\XXX\XXX\XXXX_2022.XLSX
Not working

Can you point out what I'm doing wrong?

CodePudding user response:

Since file is a WindowsPath object (as the error indicates), you probably want its .suffix attribute, which includes the leading dot. String comparison is case-sensitive, so lowercase the suffix with .lower() before comparing it against lowercase extensions:

for file in file_list:
    if file.suffix.lower() == '.xlsx':
        all_df_list.append(pd.read_excel(file, header=0, engine='openpyxl'))
    elif file.suffix.lower() == '.csv':
        all_df_list.append(pd.read_csv(file, chunksize=10000))
    else:
        print("Not working")
        continue
    print(round(time.time() - start_time, 2))
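
To check the suffix comparison on its own, without pandas or real files, here is a minimal sketch. The helper name file_kind and the file names are hypothetical, and PureWindowsPath is used only so the snippet behaves the same on any operating system:

```python
from pathlib import PureWindowsPath


def file_kind(path):
    """Return 'excel', 'csv', or None, ignoring the case of the suffix."""
    suffix = path.suffix.lower()  # .suffix keeps the leading dot, e.g. '.XLSX'
    if suffix == ".xlsx":
        return "excel"
    if suffix == ".csv":
        return "csv"
    return None


# Hypothetical file names that mimic the mixed-case extensions in the question.
paths = [
    PureWindowsPath(r"Z:\DataFiles\report_0000.xlsx"),
    PureWindowsPath(r"Z:\DataFiles\report_2019.XLSX"),
    PureWindowsPath(r"Z:\DataFiles\export.CSV"),
    PureWindowsPath(r"Z:\DataFiles\notes.txt"),
]

for p in paths:
    print(p.name, "->", file_kind(p))
```

With the lowercasing in place, the .XLSX and .CSV files land in the same branches as their lowercase counterparts, and only the .txt file is left unmatched.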

CodePudding user response:

Your script looks for xlsx and csv in lowercase, and the comparison is case-sensitive, so files ending in .XLSX fall into the else: block.
You could call the .lower() method when you assign the value to the variable extension:

extension = os.path.splitext(file)[1][1:].lower()
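
A small sketch of why the original check fails and how .lower() changes the result. The helper name extension_of is hypothetical, and the file name is the masked one from the question's output:

```python
import os


def extension_of(filename):
    """Return the extension without the leading dot, lowercased."""
    return os.path.splitext(filename)[1][1:].lower()


name = "XXX_XXX_2019.XLSX"

# Without .lower(), the extension keeps its original case.
raw = os.path.splitext(name)[1][1:]
print(repr(raw), raw.endswith("xlsx"))

# With .lower(), the same file matches the lowercase comparison.
normalized = extension_of(name)
print(repr(normalized), normalized.endswith("xlsx"))
```

The first check is False because 'XLSX' and 'xlsx' are different strings; the second is True once the extension is lowercased.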

CodePudding user response:

Can't you define extension = os.path.splitext(file)[1][1:] outside the loop?

CodePudding user response:

Use pathlib for path handling; you can get the file name as a string from .name!

>>> import pathlib
>>> p = pathlib.Path("foo/bar/baz.EXT")
>>> p.name.lower().endswith(".ext")
True
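
Building on the same idea, str.endswith also accepts a tuple of suffixes, so several extensions can be tested in one call. A minimal sketch, assuming hypothetical file names and a hypothetical SUPPORTED tuple:

```python
import pathlib

# Hypothetical set of extensions the script knows how to read.
SUPPORTED = (".xlsx", ".csv")

# PurePath objects do not touch the file system, so no real files are needed.
candidates = [
    pathlib.PurePath("data/report_2019.XLSX"),
    pathlib.PurePath("data/export.csv"),
    pathlib.PurePath("data/readme.md"),
]

# Keep only the files whose lowercased name ends with a supported extension.
readable = [p for p in candidates if p.name.lower().endswith(SUPPORTED)]

for p in readable:
    print(p.name)
```

This filters the list before any reading happens, which can keep the if/elif chain in the loop limited to the file types it actually handles.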