My Python module runs in a Linux environment. However, it may be given data that was generated in a Windows environment, and thus the data may contain a Windows-specific file path such as:
>>> raw = 'c:\\alpha\\bravo\\foo.txt'
I can wrap that path in a pathlib.PureWindowsPath object, and that is useful. I can carry on from there in an environment-independent way.
>>> from pathlib import PureWindowsPath
>>> pwp = PureWindowsPath( raw )
>>> pwp
PureWindowsPath('c:/alpha/bravo/foo.txt')
>>> pwp.as_posix()
'c:/alpha/bravo/foo.txt'
>>> pwp.parts
('c:\\', 'alpha', 'bravo', 'foo.txt')
>>> keep = pwp.parts[ pwp.parts.index( 'bravo' ) : ]
>>> keep
('bravo', 'foo.txt')
However, how can I detect when to use a PureWindowsPath?
If I wrap the path in just a pathlib.PurePath object, it creates a derived PurePosixPath for it. However, the raw path value is not interpreted in a useful way: the directory separators are not converted, and the path is not split into parts.
>>> from pathlib import PurePath
>>> pp = PurePath( raw )
>>> pp
PurePosixPath('c:\\alpha\\bravo\\foo.txt')
>>> pp.as_posix()
'c:\\alpha\\bravo\\foo.txt'
>>> pp.parts
('c:\\alpha\\bravo\\foo.txt',)
Is there some other pathlib API that can automatically detect when the raw path value contains, as the documentation puts it, "semantics appropriate for" Windows? So far, I must depend on a human to decide that for me and pass an argument to my Python module telling it to use PureWindowsPath.
CodePudding user response:
One alternative is to always wrap the raw path value in a PureWindowsPath object.
That is sufficient for my needs when the path is Windows-specific, as demonstrated in the original post.
It is also sufficient when the path is Linux-specific, as demonstrated below.
>>> from pathlib import PureWindowsPath
>>> raw = '/home/fred/alpha/bravo/foo.txt'
>>> pwp = PureWindowsPath( raw )
>>> pwp
PureWindowsPath('/home/fred/alpha/bravo/foo.txt')
>>> pwp.as_posix()
'/home/fred/alpha/bravo/foo.txt'
>>> pwp.parts
('\\', 'home', 'fred', 'alpha', 'bravo', 'foo.txt')
>>> keep = pwp.parts[ pwp.parts.index( 'bravo' ) : ]
>>> keep
('bravo', 'foo.txt')
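The same idea can be folded into a small helper. This is only a sketch: the name parts_from and its marker argument are made up for illustration and are not part of pathlib. It relies on PureWindowsPath accepting both '\\' and '/' as separators, which also means a POSIX file name that legitimately contains a backslash would be split at that backslash.

```python
from pathlib import PureWindowsPath


def parts_from(raw, marker):
    """Return the path components of `raw` from `marker` onward.

    Hypothetical helper: always parses with PureWindowsPath, which
    splits on both backslashes and forward slashes.
    Raises ValueError if `marker` is not one of the components.
    """
    parts = PureWindowsPath(raw).parts
    return parts[parts.index(marker):]


# Windows-style and Linux-style input give the same tail.
print(parts_from('c:\\alpha\\bravo\\foo.txt', 'bravo'))        # ('bravo', 'foo.txt')
print(parts_from('/home/fred/alpha/bravo/foo.txt', 'bravo'))   # ('bravo', 'foo.txt')

# Caveat: a backslash inside a POSIX file name is treated as a separator.
print(PureWindowsPath('/tmp/odd\\name.txt').parts)  # ('\\', 'tmp', 'odd', 'name.txt')
```

If file names containing backslashes can occur in the Linux-generated data, this approach would need an extra check before parsing.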
CodePudding user response:
Another alternative is to always wrap the raw path value in a PureWindowsPath object, and then wrap the POSIX representation of that in a PurePath object.
That double-wrapping works better, especially when the Windows-specific path begins with a drive letter: the resulting parts follow POSIX conventions ('c:' rather than 'c:\\', and '/' rather than '\\').
For a Windows-specific path:
>>> from pathlib import PurePath, PureWindowsPath
>>> raw = 'c:\\alpha\\bravo\\foo.txt'
>>> pwp = PureWindowsPath( raw )
>>> pwp
PureWindowsPath('c:/alpha/bravo/foo.txt')
>>> pwp.parts
('c:\\', 'alpha', 'bravo', 'foo.txt')
>>> pp = PurePath( pwp.as_posix() )
>>> pp
PurePosixPath('c:/alpha/bravo/foo.txt')
>>> pp.parts
('c:', 'alpha', 'bravo', 'foo.txt')
>>> idx = pp.parts.index( 'bravo' )
>>> pp.parts[:idx]
('c:', 'alpha')
>>> pp.parts[idx:]
('bravo', 'foo.txt')
For a Linux-specific path:
>>> from pathlib import PurePath, PureWindowsPath
>>> raw = '/home/fred/alpha/bravo/foo.txt'
>>> pwp = PureWindowsPath( raw )
>>> pwp
PureWindowsPath('/home/fred/alpha/bravo/foo.txt')
>>> pwp.parts
('\\', 'home', 'fred', 'alpha', 'bravo', 'foo.txt')
>>> pp = PurePath( pwp.as_posix() )
>>> pp
PurePosixPath('/home/fred/alpha/bravo/foo.txt')
>>> pp.parts
('/', 'home', 'fred', 'alpha', 'bravo', 'foo.txt')
>>> idx = pp.parts.index( 'bravo' )
>>> pp.parts[:idx]
('/', 'home', 'fred', 'alpha')
>>> pp.parts[idx:]
('bravo', 'foo.txt')
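The double-wrapping can be packaged as one function. A minimal sketch, assuming the goal is always a POSIX-flavoured result: the name to_posix_path is hypothetical, and it uses PurePosixPath explicitly instead of PurePath, because PurePath would produce a PureWindowsPath if the module were ever run on Windows.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_posix_path(raw):
    """Parse `raw` with Windows rules, then re-wrap it as a POSIX path.

    Hypothetical helper: PureWindowsPath accepts both '\\' and '/',
    and as_posix() renders the result with forward slashes only.
    """
    return PurePosixPath(PureWindowsPath(raw).as_posix())


win = to_posix_path('c:\\alpha\\bravo\\foo.txt')
nix = to_posix_path('/home/fred/alpha/bravo/foo.txt')

print(win.parts)  # ('c:', 'alpha', 'bravo', 'foo.txt')
print(nix.parts)  # ('/', 'home', 'fred', 'alpha', 'bravo', 'foo.txt')

# Slicing from a known component works the same way for both inputs.
idx = win.parts.index('bravo')
print(win.parts[idx:])  # ('bravo', 'foo.txt')
```

Note that the drive letter survives only as an ordinary first component ('c:'), so the converted Windows path is relative from the POSIX point of view.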