How to create new names for files with problematic characters for use in an existing bash scripted e

Time:09-12

The goal is to get rid of filenames that cause headaches for scripting by renaming them to something safer. The environment is a nearly 30-year-old Unix / Linux setup with a lot of existing scripts that may not be "written correctly", and a new, large and important cache of files has arrived that has to be managed. A colleague has therefore asked me to write a script that translates "problematic filenames". They have one list of characters to turn into dots (the comma, for example) and another list to turn into underscores (whitespace, for example). I ran into problems doing this, which I asked about in a separate question.

I was using tr to do it, but commenters on that question suggested I ask about the underlying goal instead of how to get tr to work. So, I have!

CodePudding user response:

Parameter expansion can do this for you.

Note that unlike when using tr (as requested on your other question), when using parameter expansion you don't need to use backslashes inside your character class definitions: put the expansion in double quotes and bash will treat the results of that expansion as literal.
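As a minimal sketch of the mechanism on its own (the names specials and sample are made up for illustration and are not part of the script below), a double-quoted variable inside the brackets contributes each of its characters as a literal member of the set:

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration only.
specials=',;: '                    # characters to replace, stored as plain data
sample='report, final: v2;draft'

# ${var//pattern/replacement} replaces every match of the glob pattern.
# Quoting "$specials" inside [...] makes each character a literal set member.
cleaned=${sample//["$specials"]/.}

printf '%s\n' "$cleaned"           # prints: report..final..v2.draft
```

Each comma, colon, semicolon and space becomes its own dot, so adjacent specials produce adjacent dots; squeezing repeats would need an extra step.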

#!/usr/bin/env bash
 
toDots='\,;:| @#$%^&*~'
toUnderscores='}{]['"'"'="()`!'

# requires bash 4.4 or newer (for ${*@Q}): if debug=1, then print what we would do instead of doing it
runOrDebug() {
  if (( debug )); then
    printf '%s\n' "${*@Q}"
  else
    "$@"
  fi
}
 
renameFiles() {
  local name subDots subBoth
 
  for name; do
    subDots=${name//["$toDots"]/.}
    subBoth=${subDots//["$toUnderscores"]/_}
 
    if [[ $subBoth != "$name" ]]; then
      runOrDebug mv -- "$name" "$subBoth"
    fi
  done
}
 
debug=1 renameFiles '[/a],/;[p:r|o\b lem@a#t$i%c]/@(%$^!/(e^n&t*ry)~='

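Note that both variables are assigned in single quotes (toUnderscores leaves them only briefly, to add a literal single quote in the middle), so every character, including the backslash at the start of toDots, is part of the variable's data rather than shell syntax. Globs borrow their bracket syntax from POSIX regular expression bracket expressions, but because the expansion is double-quoted inside the brackets, bash treats each of those characters as a literal member of the set.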
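To check this for yourself, here is a small sketch that reuses the two variables from the script and runs them against a made-up name (name, step and result are hypothetical, chosen only to exercise the backslash, the closing bracket and the single quote):

```shell
#!/usr/bin/env bash
toDots='\,;:| @#$%^&*~'
toUnderscores='}{]['"'"'="()`!'

# Print exactly what each variable holds; nothing in them is shell syntax.
printf 'toDots:        %s\n' "$toDots"
printf 'toUnderscores: %s\n' "$toUnderscores"

# Hypothetical test name containing a backslash, a ] and a single quote.
name='a\b]c'"'"'d'

step=${name//["$toDots"]/.}              # backslash is in toDots -> dot
result=${step//["$toUnderscores"]/_}     # ] and ' are in toUnderscores -> underscore

printf '%s\n' "$result"                  # prints: a.b_c_d
```

If the expansions inside the brackets were left unquoted, characters such as ] or \ in the data could be read as pattern syntax instead, which is the situation the quoting avoids.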

See a demonstration of the technique running at https://ideone.com/kKE7IJ
