how to trim string variables in bash

Time:11-07

I have strings that initially contain different directory paths, where both the 2nd and 2nd last sub-directories can vary in length, like so

 /home/Leo/Work/CMI/ARCH/MWS/Disks
 /home/Cleo/Work/CMI/ARCH/BK/Disks

I want to trim the first 5 sub-directories and only show the last 2, like so

 echo "/MWS/Disks"
 echo "/BK/Disks"

One way to trim the first 5 sub-directories from the initial strings might be to left-shift each character until both strings start with the second last '/'.

The Bash Beginners Guide describes a shift built-in that left-shifts positional parameters in a command and throws away unused arguments. But it is not immediately obvious whether this could be used to trim the first 5 sub-directories from the strings described above.
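For illustration, shift can be applied here if the path is first split on '/' into positional parameters. The following is only a sketch: the function name last_two_with_shift is made up, and it assumes the path has at least two components and contains no newlines.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): keep the last two
# components of a path by shifting away all the earlier ones.
last_two_with_shift() {
    local -a parts
    # Split the path on "/" into an array; a leading "/" yields an empty first field.
    IFS=/ read -ra parts <<< "$1"
    # Load the fields as this function's own positional parameters.
    set -- "${parts[@]}"
    # At least two components are needed.
    (( $# >= 2 )) || return 1
    # Discard everything except the last two parameters.
    shift $(( $# - 2 ))
    printf '/%s/%s\n' "$1" "$2"
}

last_two_with_shift /home/Leo/Work/CMI/ARCH/MWS/Disks    # /MWS/Disks
last_two_with_shift /home/Cleo/Work/CMI/ARCH/BK/Disks    # /BK/Disks
```

Because set -- is called inside a function, the calling script's own positional parameters are left untouched. The read builtin still does the splitting, so this is not obviously simpler than parameter expansion.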

In Bash, how do I reduce these strings, preferably without using loops?


CLARIFICATION

Judging from comments a bit more context is needed. My Bash script recovers historic Mdos and Qdos files from 8-inch floppy disk images and saves files to directories on the hard drive.

For better or worse, I created a bespoke scheme that stores directory paths using 3-character variable names where each name is an acronym for the section of the path to the current directory.

For example, MWC is an acronym for $MY/Work/CMI in the following path

MY="$USER"
MWC="C:/cygwin64/home/$MY/Work/CMI"
cd "$MWC"
pwd
C:/cygwin64/home/$MY/Work/CMI

Similarly 3-character variables point to the next sub-directory further up the tree

WCA="$MWC/ARCH"

i.e. C:/cygwin64/home/$MY/Work/CMI/ARCH, path to a gallery of archive owners.

As directory paths lengthen the 3-character variables make paths easily identified by conserving white space in the listing. Nevertheless the full path appears whenever my script references a path. Hence the need to trim parts of the string that have no interest for the end user.
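Given this scheme, one possibility is to strip the known prefix held in one of the 3-character variables rather than counting sub-directories. A minimal sketch, assuming every path of interest lives under $WCA; the values of MY and full below are placeholders for illustration only.

```bash
#!/usr/bin/env bash
MY="Leo"                               # the script above uses "$USER"; fixed here for illustration
MWC="C:/cygwin64/home/$MY/Work/CMI"
WCA="$MWC/ARCH"

full="$WCA/MWS/Disks"                  # hypothetical full path to a disk directory

# Remove the leading "$WCA" from the path. Quoting "$WCA" makes Bash treat
# its contents literally instead of as a glob pattern.
short=${full#"$WCA"}

echo "$short"    # /MWS/Disks
```

If a path does not start with $WCA, the expansion leaves it unchanged, so the full path would still be shown in that case.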

CodePudding user response:

If the number of subdirectories is always the same, you can use parameter expansion to remove the first 5 subdirectories:

s=/home/Leo/Work/CMI/ARCH/MWS/Disks
s=/${s#/*/*/*/*/*/}
echo "$s"  # /MWS/Disks

Or, if you know you need the last two parts whatever the depth of the path is:

s=/home/Leo/Work/CMI/ARCH/MWS/Disks
last=/${s##*/}
last_but1=${s%"$last"}
last_but1=/${last_but1##*/}
echo "$last_but1$last"  # /MWS/Disks
  • ${s#PATTERN} removes PATTERN from the beginning of $s.
  • ${s%PATTERN} removes PATTERN from the end of $s.
  • With a single # or %, the shortest match of PATTERN is removed. Doubling them (## or %%) removes the longest possible match.
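To make the difference between the four operators concrete, here is a small sketch applied to one of the sample paths. The variable names are invented for the example.

```bash
#!/usr/bin/env bash
s=/home/Leo/Work/CMI/ARCH/MWS/Disks

after_short_prefix=${s#*/}     # home/Leo/Work/CMI/ARCH/MWS/Disks
after_long_prefix=${s##*/}     # Disks
after_short_suffix=${s%/*}     # /home/Leo/Work/CMI/ARCH/MWS
after_long_suffix=${s%%/*}     # empty, because "/*" matches the whole string

printf '%s\n' "$after_short_prefix" "$after_long_prefix" \
              "$after_short_suffix" "[$after_long_suffix]"
```

The brackets around the last value are only there so that the empty result is visible in the output.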

CodePudding user response:

As an alternative to the parameter expansion, you can use the =~ operator:

dir='/home/Leo/Work/CMI/ARCH/MWS/Disks'
[[ $dir =~ /[^/]*/[^/]*$ ]] && echo "${BASH_REMATCH[0]}"
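The regular expression matches a '/', a run of non-slash characters, another '/', and a final run of non-slash characters anchored at the end of the string. As a sketch, the same test can be wrapped in a function (the name last_two_regex is hypothetical) and applied to both sample paths:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the last two path components using =~.
last_two_regex() {
    # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
    local re='/[^/]*/[^/]*$'
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[0]}"
}

last_two_regex /home/Leo/Work/CMI/ARCH/MWS/Disks     # /MWS/Disks
last_two_regex /home/Cleo/Work/CMI/ARCH/BK/Disks     # /BK/Disks
```

The function returns a non-zero status and prints nothing when the argument contains fewer than two slashes.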

CodePudding user response:

Assuming the inputs are coming from a file (or streamed/piped from another OS process) ...

Sample input:

$ cat dir.file
 /home/Leo/Work/CMI/ARCH/MWS/Disks
 /home/Cleo/Work/CMI/ARCH/BK/Disks

One awk idea:

awk 'BEGIN {FS=OFS="/"} {print OFS $(NF-1),$(NF)}' dir.file

This generates:

/MWS/Disks
/BK/Disks

If the results need to be stored for later use it shouldn't be too hard to add some code as needed (eg, redirect to a file, pipe to another process, pass as input to a while/read loop, load into an array, etc).
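As one example of the "load into an array" option, here is a rough sketch. It feeds the two sample paths in with printf instead of reading dir.file so that it runs on its own, and the array name trimmed is an assumption made for the example.

```bash
#!/usr/bin/env bash
trimmed=()

# Read awk's output line by line. Process substitution keeps the loop in the
# current shell, so the array is still populated after the loop ends.
while IFS= read -r line; do
    trimmed+=("$line")
done < <(
    printf '%s\n' /home/Leo/Work/CMI/ARCH/MWS/Disks \
                  /home/Cleo/Work/CMI/ARCH/BK/Disks |
    awk 'BEGIN {FS=OFS="/"} {print OFS $(NF-1),$(NF)}'
)

printf '%s\n' "${trimmed[@]}"
```

With a real input file, the printf pipeline would be replaced by passing the file name to awk, as in the command shown earlier.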

If OP is processing these strings one at a time (eg, as a variable in a loop), I'd probably stick with a parameter substitution solution (see choroba's answer) which doesn't require any overhead to spawn subprocesses.
