python: transforming os.Path code into Pathlib code

Time:11-01

I have the following Python function that adds a dict as a row to a pandas DataFrame and also takes care of creating an empty DataFrame first if one does not exist yet. I use the os library, but I would like to switch to pathlib: a software developer at my company told me I should use pathlib rather than os.path for this kind of task. (As an aside, I don't have a CS background.)

import os
import pandas as pd

def myfunc(dictt, filename, folder='', extension='csv'):
    if folder == '':
        folder = os.getcwd()  # ---> folder = Path.cwd()
    filename = filename + '.' + extension
    total_file = os.path.join(folder, filename)  # <--- this is what I don't get translated

    # check if file exists, otherwise create it
    if not os.path.isfile(total_file):  # <----- if total_file is a Path object: total_file.exists()
        df_empty = pd.DataFrame()
        if extension == 'csv':
            df_empty.to_csv(total_file)
        elif extension == 'pal':
            df_empty.to_pickle(total_file)
        else:
            # raise error
            pass

    
    # code to append the dict as row
    # ...

First, I don't understand why pathlib is supposed to be better. Second, I don't understand how to translate the line marked above, i.e. how to write os.path.join(folder_path, filename) in pathlib notation.

pathlib also seems to take different approaches for Windows and other machines, and I don't see an explanation in the docs of what a POSIX path is.

Can anyone help me with those two lines? Insights into why to use pathlib instead of os.path are welcome. Thanks.

CodePudding user response:

"First I don't understand why pathlib is supposed to be better."

pathlib provides an object-oriented interface to the same functionality os.path gives. There is nothing inherently wrong with using os.path. We (the Python community) had been using os.path happily before pathlib came on the scene.

However, pathlib does make life simpler. Firstly, as mentioned in the comment by Henry Ecker, you're dealing with path objects, not strings, so you have less error checking to do after a path has been constructed, and secondly, the path objects' instance methods are right there to be used.
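
To make the second point concrete, here is a small sketch of a few attributes and methods that a path object carries around with it. The path reports/data.csv is made up for illustration, and PurePosixPath is used only so that the printed output looks the same on every operating system.

```python
from pathlib import PurePosixPath

# Hypothetical path, used only for illustration.
p = PurePosixPath("reports") / "data.csv"

print(p.name)                 # data.csv
print(p.stem)                 # data
print(p.suffix)               # .csv
print(p.parent)               # reports
print(p.with_suffix(".pkl"))  # reports/data.pkl
```

With os.path, each of these would be a separate function call on a plain string (os.path.basename, os.path.splitext, os.path.dirname, and so on).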

"Can anyone help me with those two lines?"

Using your example:

import pathlib as pl
import pandas as pd

def mypathlibfunc(dictt, filename, folder='', extension='csv'):
    # normalise folder to a Path object either way
    if folder == '':
        folder = pl.Path.cwd()
    else:
        folder = pl.Path(folder)

    # the / operator joins path components, like os.path.join
    total_file = folder / f'{filename}.{extension}'
    if not total_file.exists():
        # do your thing
        df_empty = pd.DataFrame()
        if extension == 'csv':
            df_empty.to_csv(total_file)
        elif extension == 'pal':
            df_empty.to_pickle(total_file)

Notes:

  • If your function is called with folder != '', a Path object is built from it. This ensures that folder has a consistent type in the rest of the function.
  • Child Path objects can be constructed using the division operator /, which is what I did for total_file. I didn't even need to wrap f'{filename}.{extension}' in a Path object. Pretty neat!
  • pandas.DataFrame.to_[filetype] methods all accept a Path object in addition to a path string, so you don't have to worry about modifying that part of your code.
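
As a rough side-by-side of the two lines from the question, the sketch below builds the same file location with os.path.join, with the / operator, and with the equivalent joinpath method, then checks for the file. The temporary directory and the names results and csv are stand-ins chosen for illustration. Note that is_file() is the closest counterpart of os.path.isfile(), while exists() is also true for directories.

```python
import os
import tempfile
from pathlib import Path

# A temporary directory stands in for the real folder (assumption).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    filename, extension = "results", "csv"  # hypothetical values

    old_style = os.path.join(tmp, filename + "." + extension)
    new_style = folder / f"{filename}.{extension}"
    same_via_method = folder.joinpath(f"{filename}.{extension}")

    # All three spell the same location.
    same_location = Path(old_style) == new_style == same_via_method

    # Counterpart of os.path.isfile(total_file).
    existed_before = new_style.is_file()
    new_style.touch()  # create an empty file
    exists_after = new_style.is_file() and os.path.isfile(old_style)

print(same_location, existed_before, exists_after)  # True False True
```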

"pathlib seems to take different approaches for Windows and other machines, and also I don't see an explanation as to what a POSIX path is."

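If you use the Path object, your code will be cross-platform, and you needn't worry about Windows and POSIX paths: Path picks the right flavour for the machine it runs on (WindowsPath on Windows, PosixPath on POSIX systems such as Linux and macOS).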
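
If it helps to see the two flavours next to each other, the following sketch uses the "pure" path classes, which only manipulate the text of a path and therefore work on any operating system. The names data and out.csv are made up for illustration.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Pure paths never touch the file system, so both flavours work on any OS.
posix = PurePosixPath("data") / "out.csv"
windows = PureWindowsPath("data") / "out.csv"
print(posix)    # data/out.csv
print(windows)  # data\out.csv

# Path() picks the flavour that matches the running system.
here = Path("data") / "out.csv"
print(type(here).__name__)  # PosixPath on Linux/macOS, WindowsPath on Windows
```

In everyday code you would normally just write Path(...) and let pathlib choose the separator.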
