find complains if -exec executes tar with --remove-files

Time:04-16

Problem

I am trying to archive and compress some directories (and their contents) on a GNU/Linux machine and have the original directories (and their contents) removed afterwards.

Minimal reproducible example

Here is some code to recreate the situation on a GNU/Linux machine:

cd /tmp
mkdir find_and_tar
cd find_and_tar
mkdir files
mkdir texts
touch files/file1 files/file2 files/file3
touch texts/text1

tree should now give the following:

.
├── files
│   ├── file1
│   ├── file2
│   └── file3
└── texts
    └── text1

What I've tried so far

Now, my command to achieve the stated goal thus far is:

find . -mindepth 1 -type d -exec tar --remove-files -cJf {}.tar.xz {} \;

It does what it is supposed to do - tree now gives:

.
├── files.tar.xz
└── texts.tar.xz

BUT the command throws the following warnings:

find: ‘./texts’: No such file or directory
find: ‘./files’: No such file or directory

If I remove the --remove-files option, the warnings disappear, but the original directories obviously stay around.

Question(s)

  1. Why do these find warnings appear?
  2. How do I avoid them?

Version info

$ tar --version
tar (GNU tar) 1.30
$ find --version
find (GNU findutils) 4.6.0.225-235f

CodePudding user response:

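Your problem is that find is still traversing the tree while tar is running. By default, find runs -exec on a directory before descending into it, so by the time it tries to read the directory's contents, tar --remove-files has already deleted it.
If you only need to process the top-level directories, adding -maxdepth 1 will work. Two alternatives: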

Use find's -depth option to process a directory's contents before the directory itself
This can be useful when you need to find directories at different levels:

find . -mindepth 1 -depth -type d -exec tar --remove-files -cJf {}.tar.xz {} \;
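
To check the -depth variant on the example from the question, a minimal sketch, assuming GNU find and GNU tar with xz support (the scratch directory from mktemp and the variable names are only for illustration):

```shell
# Recreate the example tree in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir files texts
touch files/file1 files/file2 files/file3 texts/text1

# -depth makes find act on a directory only after it has finished with
# the directory's contents, so tar can delete the directory safely.
warnings=$(find . -mindepth 1 -depth -type d \
  -exec tar --remove-files -cJf {}.tar.xz {} \; 2>&1)

ls    # the original directories should be gone, the archives present
printf 'messages from find/tar: [%s]\n' "$warnings"
```

If the reasoning above holds, the captured messages should be empty.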

Avoid find

for d in */; do
  tar --remove-files -cJf "${d%/}".tar.xz "${d%/}"
done

CodePudding user response:

Analysis

Adding the -D all option to the find call to get more insight into what is going on produced, among other output, the following:

Optimized command line:
 ( -mindepth 1 [est success rate 1] [real success rate 0/0=_] -a [est success rate 0.0922] [real success rate 0/0=_] [need type] -type d [est success rate 0.0922] [real success rate 0/0=_]  ) -a [est success rate 0.0922] [real success rate 0/0=_] -exec tar [est success rate 1] [real success rate 0/0=_]
consider_visiting (early): ‘.’: fts_info=FTS_D , fts_level= 0, prev_depth=-2147483648 fts_path=‘.’, fts_accpath=‘.’
consider_visiting (late): ‘.’: fts_info=FTS_D , isdir=1 ignore=1 have_stat=1 have_type=1
consider_visiting (early): ‘./texts’: fts_info=FTS_D , fts_level= 1, prev_depth=0 fts_path=‘./texts’, fts_accpath=‘texts’
consider_visiting (late): ‘./texts’: fts_info=FTS_D , isdir=1 ignore=0 have_stat=1 have_type=1
consider_visiting (early): ‘./texts’: fts_info=FTS_DNR, fts_level= 1, prev_depth=1 fts_path=‘./texts’, fts_accpath=‘texts’
find: ‘./texts’: No such file or directory
consider_visiting (early): ‘./files’: fts_info=FTS_D , fts_level= 1, prev_depth=1 fts_path=‘./files’, fts_accpath=‘files’
consider_visiting (late): ‘./files’: fts_info=FTS_D , isdir=1 ignore=0 have_stat=1 have_type=1
consider_visiting (early): ‘./files’: fts_info=FTS_DNR, fts_level= 1, prev_depth=1 fts_path=‘./files’, fts_accpath=‘files’
find: ‘./files’: No such file or directory
consider_visiting (early): ‘.’: fts_info=FTS_DP, fts_level= 0, prev_depth=1 fts_path=‘.’, fts_accpath=‘.’
consider_visiting (late): ‘.’: fts_info=FTS_DP, isdir=1 ignore=1 have_stat=1 have_type=1

Note that the second visit to e.g. ./texts differs from the first in two fields: fts_info is now FTS_DNR (a directory that could not be read) instead of FTS_D, and prev_depth is 1 instead of 0. I concluded that find tries to descend into each directory even after tar has already removed it.
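
The mechanism does not seem to depend on tar: any -exec action that deletes the directory find is about to descend into should trigger the same message. A minimal sketch, assuming GNU find and using rm -r as a stand-in for tar --remove-files (the scratch directory is arbitrary):

```shell
# Scratch directory with the same two example directories.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir files texts
touch files/file1 texts/text1

# rm -r stands in for tar --remove-files: it deletes each directory
# right before find tries to descend into it.
# LC_ALL=C keeps the message text untranslated.
warnings=$(LC_ALL=C find . -mindepth 1 -type d -exec rm -r {} \; 2>&1)

printf '%s\n' "$warnings"
```

The directories are removed as requested, and find reports one "No such file or directory" message per directory, just as in the tar case.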

One can see the impact of this recursion by slightly adapting the scenario:

  • adding a subfiles dir to files
  • and running the find call without --remove-files and -maxdepth

This will lead to the subfiles dir being archived separately within the unarchived files dir.
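
The adapted scenario can be reproduced with a sketch like the following, assuming GNU find and GNU tar with xz support (the file name sub1 and the scratch directory are made up for illustration):

```shell
# Example tree with an extra nested directory files/subfiles.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p files/subfiles texts
touch files/file1 files/subfiles/sub1 texts/text1

# No --remove-files and no -maxdepth: after archiving ./files, find
# descends into it and archives ./files/subfiles a second time.
find . -mindepth 1 -type d -exec tar -cJf {}.tar.xz {} \;

# List every archive that was created, at any depth.
archives=$(find . -name '*.tar.xz' | LC_ALL=C sort)
printf '%s\n' "$archives"
```

Besides the two top-level archives, the listing should include files/subfiles.tar.xz, although subfiles is already contained in files.tar.xz.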

Solution

Since I know that I want to archive the top level directories, adding the -maxdepth 1 option to my find call solved the problem for me.
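
For completeness, the adjusted call applied to the example tree might look like the sketch below, assuming GNU find and GNU tar with xz support (scratch directory and variable names are only for illustration):

```shell
# Recreate the example tree in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir files texts
touch files/file1 files/file2 files/file3 texts/text1

# -maxdepth 1 keeps find from descending into the directories,
# so it never notices that tar has removed them.
warnings=$(find . -mindepth 1 -maxdepth 1 -type d \
  -exec tar --remove-files -cJf {}.tar.xz {} \; 2>&1)

ls
printf 'messages from find/tar: [%s]\n' "$warnings"
```

Nested directories are not archived separately with this call; they simply end up inside the archive of their top-level parent.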
