I have the following function in Python that adds a dict as a row to a pandas DataFrame, and that also creates an empty DataFrame file first if it does not exist yet. I use the os library, but I would like to switch to pathlib: a software developer at my company told me I should use pathlib rather than os.path for this kind of thing. (As an aside, I don't have a CS background.)
    import os
    import pandas as pd

    def myfunc(dictt, filename, folder='', extension='csv'):
        if folder == '':
            folder = os.getcwd()  # ---> folder = Path.cwd()
        filename = filename + '.' + extension
        total_file = os.path.join(folder, filename)  # <--- this is what I don't get translated
        # check if file exists, otherwise create it
        if not os.path.isfile(total_file):  # <--- if total_file is a Path object: total_file.exists()
            df_empty = pd.DataFrame()
            if extension == 'csv':
                df_empty.to_csv(total_file)
            elif extension == 'pal':
                df_empty.to_pickle(total_file)
            else:
                # raise error
                pass
        # code to append the dict as row
        # ...
First, I don't understand why pathlib is supposed to be better. Second, I don't understand how to translate the line marked above, i.e. how to do os.path.join(folder_path, filename) in pathlib notation.
pathlib seems to have different approaches for Windows and other machines, and I also don't see an explanation of what a POSIX path is (docs here).
Can anyone help me with those two lines? Insights as to why to use pathlib instead of os.path are welcome. Thanks.
Answer:
First, I don't understand why pathlib is supposed to be better.
pathlib provides an object-oriented interface to the same functionality os.path gives. There is nothing inherently wrong with using os.path. We (the Python community) had been using os.path happily before pathlib came on the scene.
However, pathlib does make life simpler. Firstly, as mentioned in the comment by Henry Ecker, you're dealing with path objects, not strings, so you have less error checking to do after a path has been constructed. Secondly, the path objects' instance methods are right there to be used.
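To make the comparison concrete, here is a small sketch of the same join done both ways. The file name is made up for illustration, and the file does not need to exist for the code to run.

```python
import os
from pathlib import Path

folder = os.getcwd()        # a plain string
filename = "results.csv"    # hypothetical file name, for illustration only

# os.path: module-level functions that take and return strings
joined_str = os.path.join(folder, filename)
print(os.path.isfile(joined_str))
print(os.path.splitext(joined_str)[1])   # '.csv'

# pathlib: a single object that carries its own methods and attributes
joined_path = Path(folder) / filename
print(joined_path.is_file())
print(joined_path.suffix)                # '.csv'
print(joined_path.name)                  # 'results.csv'

# Both approaches describe the same location on disk
print(str(joined_path) == joined_str)
```

With os.path you pass the string to a different function for each question you ask about it. With pathlib you ask the path object itself.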
Can anyone help me with those two lines?
Using your example:
    import pathlib as pl
    import pandas as pd

    def mypathlibfunc(dictt, filename, folder='', extension='csv'):
        if folder == '':
            folder = pl.Path.cwd()
        else:
            folder = pl.Path(folder)
        total_file = folder / f'{filename}.{extension}'
        if not total_file.exists():
            # do your thing
            df_empty = pd.DataFrame()
            if extension == 'csv':
                df_empty.to_csv(total_file)
            elif extension == 'pal':
                df_empty.to_pickle(total_file)
Notes:
- If your function is called with folder != '', then a Path object is built from it. This ensures that folder has a consistent type in the rest of the function.
- Child Path objects can be constructed using the division operator /, which is what I did for total_file. I didn't actually need to wrap f'{filename}.{extension}' in a Path object. Pretty neat! (reference)
- The pandas.DataFrame.to_[filetype] methods all accept a Path object in addition to a path string, so you don't have to worry about modifying that part of your code.
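The first two notes can be tried out in isolation. This is a minimal sketch with made-up folder and file names; nothing is read from or written to disk.

```python
from pathlib import Path

folder = Path("data")                 # hypothetical folder name
filename, extension = "log", "csv"    # hypothetical file name parts

# Path() accepts a string or an existing Path, so the same call
# normalises either kind of input to a Path object
print(Path("data") == Path(folder))   # True

# The / operator accepts a plain string on its right-hand side
total_file = folder / f"{filename}.{extension}"

# joinpath is the method form of the same operation
same_file = folder.joinpath(f"{filename}.{extension}")
print(total_file == same_file)        # True

# The result is itself a Path, so its attributes are available straight away
print(total_file.parent == folder)           # True
print(total_file.with_suffix(".pkl").name)   # log.pkl
```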
pathlib seems to have different approaches for Windows and other machines, and I also don't see an explanation of what a POSIX path is.
If you use the Path object, it will be cross-platform, and you needn't worry about Windows and POSIX paths.
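As a rough illustration of what the two flavours mean: Path gives you a PosixPath on Linux and macOS and a WindowsPath on Windows, and the "pure" classes let you inspect both styles on any machine without touching the filesystem. The names below are made up.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Path() picks the concrete class that matches the current operating system
local = Path("data") / "log.csv"
print(type(local).__name__)   # PosixPath or WindowsPath, depending on the OS

# The pure classes only manipulate the path text, so both run anywhere
print(PurePosixPath("data") / "log.csv")     # data/log.csv
print(PureWindowsPath("data") / "log.csv")   # data\log.csv
```

The only visible difference here is the separator, which is why code written against Path does not need to care which flavour it ends up with.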