How do I copy/list files with a numbered name pattern from a much larger file list in a bash for loop?


I'm doing basic bash for loops to copy data from one directory to my current directory.

My files are named in this pattern:

stuff.002.morestuff.nc

It's 14 files total, and the names are the same except for the number, but there are hundreds of files in the directory I'm copying from. It goes from 002 to 015. I'm just trying to copy 002,003,004,...,014,015 but it's proving harder than expected; right now I'm just doing echo to make sure I'm getting the names right before I copy tons of gigs of data to my computer.

The best thing I've tried:

files=/path/to/dir/stuff.0*[02...15...03456789].morestuff.nc;
for f in $files; do echo $f; done

And that gives me way too many files with numbers ranging from 002 to 035, which is not what I need.

I appreciate any answers for my basic question, I was really surprised that there was nothing very similar to this. I'll worry about the copying later; the names are driving me crazy right now. Sorry if the format and lingo is off, this is my first question here and I'm still really new to this.

CodePudding user response:

Recent versions of bash (4.0+) support zero-padded brace-expansion sequences like {002..015}, so you can do something like:

cp some_dir/stuff.{002..015}.morestuff.nc some_other_dir/

And of course, it's always good practice to echo first if you're not sure:

echo some_dir/stuff.{002..015}.morestuff.nc

Also, if stuff and morestuff are semi-unpredictable, you could use globs to accept a wider range of file names:

echo some_dir/*.{002..015}.*.nc
cp some_dir/*.{002..015}.*.nc some_other_dir/

Note that cp may report errors about missing files if not every name from 002 through 015 is actually present in some_dir (brace expansion generates all 14 names whether or not the files exist), but all the ones that do exist will still be copied, which is the goal.
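If some numbers in the range might be absent and you want to avoid those errors entirely, one option is to test each expanded name before copying. A minimal sketch, using made-up directory names (some_dir, some_other_dir) and deliberately creating only 002 through 010:

```shell
#!/usr/bin/env bash
# Demo setup with hypothetical names: only 002-010 actually exist
mkdir -p some_dir some_other_dir
touch some_dir/stuff.{002..010}.morestuff.nc

# Brace expansion yields all 14 names; copy only the ones that exist
for f in some_dir/stuff.{002..015}.morestuff.nc; do
  if [ -e "$f" ]; then
    cp "$f" some_other_dir/
  fi
done

ls some_other_dir | wc -l    # 9 files copied, no "No such file" errors
```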

CodePudding user response:

The best thing I've tried: files=/path/to/dir/stuff.0*[02...15...03456789].morestuff.nc; for f in $files; do echo $f; done

Don't attempt to store multiple filenames in a single variable. That never works well once filenames contain spaces or other odd characters that are valid in filenames. See BashFAQ #50.

Instead, you can store the names in an array (though that isn't needed here).
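As a quick sketch of the array alternative (directory and file names below are invented for the demo): let the glob expand inside the parentheses, and each filename, spaces and all, becomes one element:

```shell
#!/usr/bin/env bash
# Demo setup with hypothetical names, including one with spaces
mkdir -p demo_dir
touch demo_dir/"stuff with spaces.002.nc" demo_dir/stuff.003.nc

files=( demo_dir/*.nc )        # one array element per file, spaces intact
printf '%s\n' "${files[@]}"    # prints one name per line, safely quoted
echo "${#files[@]} files"      # prints the element count: "2 files"
```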

I'm doing basic bash for loops to copy data from one directory to my current directory.

You don't really need a for loop here as @PiMarillon points out, but if that is part of your exercise, then use the file glob (wildcard pattern) in the for loop itself.

Using your list format, you can minimally copy:

cp -a *[.]00[23456789][.]* *[.]01[012345][.]* newdir

(which requires a '.' on either end of the number sequence -- add additional unique characters as needed if the [.]002[.] to [.]015[.] pattern is not unique in the directory)

Or if copying from directory that isn't the $PWD, you can do

from="../some/relative/or/absolute/path/to"
cp -a "$from"/*[.]00[23456789][.]* "$from"/*[.]01[012345][.]* newdir

(note: the file glob is NOT quoted while the "$from" path is. Quoting the entire thing would suppress pathname expansion)

NOTE: for both invocations of cp with the file glob, whitespace in filenames and other odd characters such as a '\n' can cause problems. The preferred solution for handling odd characters in filenames is to use find with -name or -regex.
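A minimal sketch of that find approach, with invented paths (find_src, newdir2); note that -maxdepth is a widespread GNU/BSD extension rather than strict POSIX, while -name and -exec are POSIX:

```shell
#!/usr/bin/env bash
# Demo setup with hypothetical names, including one awkward filename
mkdir -p find_src newdir2
touch find_src/stuff.{002..015}.morestuff.nc "find_src/odd name.007.morestuff.nc"

# find matches the two numeric sub-ranges; -exec hands each path to cp
# with no word-splitting, so spaces and newlines in names are safe
find find_src -maxdepth 1 \( -name '*.00[2-9].*' -o -name '*.01[0-5].*' \) \
    -exec cp -a {} newdir2 \;

ls newdir2 | wc -l    # 15 (the 14 numbered files plus the odd-named one)
```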

To use a for loop for the same purpose, just use the same pattern in the for loop definition, e.g.

for fn in *[.]00[23456789][.]* *[.]01[012345][.]*; do 
  cp -a "$fn" newdir
done

(NOTE: using the for loop will be much LESS efficient than a single cp. In the loop, cp is invoked 14 times instead of just once. Avoid calling external utilities within a loop when possible.)

All that said, if using bash, then the brace-expansion sequence expression shown by @PiMarillon is preferred. Though, the cp with *[.]00[23456789][.]* *[.]01[012345][.]* will be portable to any POSIX shell.

Tags: bash