I'm capturing images of widgets off of multiple cameras on an inspection system. If the inspection is unsuccessful, the image doesn't get saved. The images are named with the widget's serial number.
So my folder structure might look like:
- Camera1
  - 1.tif
  - 2.tif
  - 4.tif
- Camera2
  - 2.tif
  - 3.tif
  - 4.tif
- Camera3
  - 1.tif
  - 2.tif
  - 3.tif
  - 4.tif
I want to be able to delete images that don't have a match in all three folders. I don't mind running the solution twice, once between Camera1 and Camera2, and then again between Camera2 and Camera3.
I'm hoping to be left with only the following folder structure:
- Camera1
  - 2.tif
  - 4.tif
- Camera2
  - 2.tif
  - 4.tif
- Camera3
  - 2.tif
  - 4.tif
There are ~12,000 files in each folder, and probably 2%-3% of them are erroneous and need to be removed before the analysis can continue.
I don't mind prepackaged solutions requiring payment, Python, command line, etc.
Thanks much!
CodePudding user response:
As suggested in the comments, next time you ask something on SO, have a shot at it yourself first, and ask about any problems - you learn more that way.
Here's a start. As suggested, the code below creates three sets with the contents of the folders, determines the intersection of those three sets, and then removes that intersection from each of the original sets. The result tells you exactly which files you need to remove from each folder:
from pathlib import Path


def find_unmatched(dirs):
    # list the (file) contents of the folders
    contents = {}
    for d in dirs:
        contents[d] = set(str(n.name) for n in Path(d).glob('*') if n.is_file())
    # decide what the folders have in common
    all_files = list(contents.values())
    common = all_files[0]
    for d_contents in all_files[1:]:
        common = common.intersection(d_contents)
    # create a dictionary that tells you what to remove
    return {d: files - common for d, files in contents.items()}


to_remove = find_unmatched(['photos/Camera1', 'photos/Camera2', 'photos/Camera3'])
print(to_remove)
Result (given the folders in your example sit in a folder called photos):
{'photos/Camera1': {'1.tif'}, 'photos/Camera2': {'3.tif'}, 'photos/Camera3': {'1.tif', '3.tif'}}
Actually removing the files is some code you can probably figure out yourself.
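For readers who want that last step spelled out, here is a minimal sketch of one way to do it. It restates the set-intersection idea in compact form so the block runs on its own; the name `remove_unmatched` and its `dry_run` parameter are hypothetical, not part of the answer above. The default only prints what would be deleted, which seems prudent with ~12,000 files per folder.

```python
from pathlib import Path


def find_unmatched(dirs):
    """Compact restatement of the idea above: names not present in every folder."""
    contents = {d: {p.name for p in Path(d).iterdir() if p.is_file()} for d in dirs}
    common = set.intersection(*contents.values())  # assumes dirs is not empty
    return {d: names - common for d, names in contents.items()}


def remove_unmatched(dirs, dry_run=True):
    """Delete files that lack a match in every folder; return the affected paths.

    With dry_run=True (the default) nothing is deleted; the paths are only printed.
    """
    affected = []
    for d, names in find_unmatched(dirs).items():
        for name in sorted(names):
            path = Path(d) / name
            if dry_run:
                print(f"would delete {path}")
            else:
                path.unlink()  # permanently removes the file
            affected.append(path)
    return affected
```

A cautious way to use it would be to call `remove_unmatched(['photos/Camera1', 'photos/Camera2', 'photos/Camera3'])` first, check the printed list, and only then repeat the call with `dry_run=False`, ideally on a copy of the data.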
CodePudding user response:
As said before, you should make your own effort to solve the problem and only ask for help when you get stuck. However, I have some spare time now, so I wrote a complete Batch solution:
@echo off
setlocal EnableDelayedExpansion

rem Process files in Camera1 folder and populate "F" array elements = 1
cd Camera1
for %%a in (*.tif) do set "F[%%~Na]=1"

rem Process files in Camera2 and *accumulate* files to "F" array
cd ..\Camera2
for %%a in (*.tif) do set /A "F[%%~Na]+=1"

rem Process files in Camera3 and accumulate files to "F" array
rem    if counter == 3 then file is OK: remove "F" element
rem    else: delete file
rem          if counter == 1: remove "F" element
cd ..\Camera3
for %%a in (*.tif) do (
   set /A "F[%%~Na]+=1"
   if !F[%%~Na]! equ 3 (
      set "F[%%~Na]="
   ) else (
      del %%a
      if !F[%%~Na]! equ 1 set "F[%%~Na]="
   )
)

rem Remove files of "F" array in both Camera1 and Camera2 folders, ignoring error messages
cd ..
(for /F "tokens=2 delims=[]" %%a in ('set F[') do (
   del Camera1\%%a.tif
   del Camera2\%%a.tif
)) 2>nul
Please, report the result...
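For anyone who does not read Batch, the counting idea in the script above can be sketched in Python roughly as follows. This is an illustration under assumptions, not a translation of the script: the function name `names_missing_somewhere` and the `pattern` parameter are hypothetical, and the sketch only reports names instead of deleting anything.

```python
from collections import Counter
from pathlib import Path


def names_missing_somewhere(dirs, pattern="*.tif"):
    """Return file names that appear in fewer than len(dirs) of the folders."""
    counts = Counter()
    for d in dirs:
        # Each folder contributes at most 1 to a name's count,
        # like the "F" array counter in the Batch script.
        counts.update({p.name for p in Path(d).glob(pattern)})
    # A name counted in every folder is fine; anything lower is incomplete.
    return {name for name, n in counts.items() if n < len(dirs)}
```

With the example folders from the question, the returned names would be the ones to delete from whichever folders contain them.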