Hello Stack Overflow community! I need help: I have been racking my brain over how to implement this.
Suppose there are two folders, 'D:\left' and 'C:\right'.
They contain files, directories with files, subdirectories, and subdirectories with files. Most of the content is the same, but 'C:\right' may contain 'extra' content that has no counterpart in 'D:\left'.
How can I find what is in 'C:\right' but not in 'D:\left', and then delete that extra content from 'C:\right' so that the two folders become identical? (In our case we do not look at size, time, etc.; we compare purely by the names of the contents.)
I tried this to find the excess:
import os
difs = list(set(os.listdir(r'C:\right')) - set(os.listdir(r'D:\left')))  # raw strings, so '\r' is not read as an escape
But this is not enough, because it only compares the top level and does not descend into subdirectories.
I also tried this:
from dirsync import sync
sync(r'D:\left', r'C:\right', 'diff')  # raw strings, so '\r' is not read as an escape
But I am only interested in a small part of its output, and it is not clear to me how to turn that output into deletions.
Deleting everything in 'C:\right' and then copying 'D:\left' into 'C:\right' from scratch is not a solution.
I'm pretty sure the solution is built around os.walk, but I just can't get it to work :(
Many thanks in advance for any help, and I apologize if this is a silly question.
I'm attaching screenshots for clarity
Desired result after running the program: [screenshots: Result, Result2]
CodePudding user response:
You can use Path.rglob:
from pathlib import Path
pl = Path('path/to/left')   # e.g. Path(r'D:\left')
pr = Path('path/to/right')  # e.g. Path(r'C:\right')
difference = (set(map(lambda p: p.relative_to(pr), pr.rglob('*'))) -
set(map(lambda p: p.relative_to(pl), pl.rglob('*'))))
Here is an example:
right
    file1
    file5
    dir1
        file2
        file6
    dir2
        file3
        file7
        subdir1
            file4
            file8
        subdir2
            file9
        subdir3
left
    file1
    dir1
        file2
    dir2
        file3
        subdir1
            file4
>>> difference
{PosixPath('dir1/file6'),
PosixPath('file5'),
PosixPath('dir2/subdir3'),
PosixPath('dir2/subdir2'),
PosixPath('dir2/subdir1/file8'),
PosixPath('dir2/subdir2/file9'),
PosixPath('dir2/file7')}
Now you just need to delete all files and directories in difference.
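One possible way to do that deletion is sketched below. It is not part of the original answer: the helper names extra_paths and delete_extra are made up for illustration, and it assumes the comparison is purely by name, as in the question (a file and a directory with the same name count as the same entry). Because difference can contain both a directory and the files inside it (for example dir2/subdir2 and dir2/subdir2/file9), the sketch processes the shortest paths first and skips anything that has already disappeared along with its parent directory.

```python
import shutil
from pathlib import Path


def extra_paths(left: Path, right: Path) -> set:
    """Return paths, relative to `right`, that exist under `right` but not under `left`."""
    in_right = {p.relative_to(right) for p in right.rglob('*')}
    in_left = {p.relative_to(left) for p in left.rglob('*')}
    return in_right - in_left


def delete_extra(left: Path, right: Path) -> list:
    """Delete everything under `right` whose relative path is missing under `left`.

    Hypothetical helper: returns the relative paths that were actually removed.
    """
    removed = []
    # Shortest paths first, so a directory is removed before its contents are visited.
    for rel in sorted(extra_paths(left, right), key=lambda p: len(p.parts)):
        target = right / rel
        if not target.exists() and not target.is_symlink():
            continue  # already removed together with a parent directory
        if target.is_dir() and not target.is_symlink():
            shutil.rmtree(target)  # remove the directory and everything in it
        else:
            target.unlink()  # regular file or symlink
        removed.append(rel)
    return removed
```

For the folders in the question this would be called as delete_extra(Path(r'D:\left'), Path(r'C:\right')). The deletion is permanent, so it is probably worth printing extra_paths(...) first and checking the list before running delete_extra.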