How to single out the first part and retain the last part of a filename using grep (find)

Greetings,

I am writing a bash script that converts a decimal number in a file name to binary (e.g. 023-124.grf). I only need to convert the last 3 digits of the name, without touching the first part (the piece I want to work on looks like this: 124.grf). I have already tried cut, but it only seems to work on the contents of a text file, and I am still figuring out how to use grep for this, since I am relatively new to bash. Is there a way to separate the first part of the filename from the rest?

CodePudding user response：

Well, I'm not sure you completely specified your problem, but luckily, even a very general variation of it can be solved fairly easily, considering that grep allows you to match both digit and non-digit characters.

So to match "the last 3 consecutive digits that are not followed by a digit" in any text (even if it looks like "234_blablabla_lololol_343123_blablabla_abc.ext" or "blabla_987123", rather than "555-123.ext"), you could literally translate the quoted definition to a regular expression, and get "123", by using [0-9] to match a digit and [^0-9] to match a non-digit. The latter narrows the match down to the last digits present in the text, by stating that only non-digits may (optionally) follow them.

E.g. (the -o flag makes grep print only the matched text instead of the whole line):

echo 234_blablabla_lololol_343999_blablabla_abc.txt | grep -o '[0-9][0-9][0-9][^0-9]*$' | grep -o '^...'

999

Of course, there are many other ways to do this. For instance, GNU grep has a -P flag to enable the most powerful kind of regular expression syntax it supports, namely Perl regex. With this, you can avoid a lot of redundant code.

E.g. with Perl regex (and also with extended regex, -E), you can shorten repeats of the same regex unit ("atom"):

[0-9][0-9][0-9] -> [0-9]{3}

It even provides shorthand "character classes" for common concepts. One of these is "decimal digit", a shorthand for [0-9], denoted as \d:

[0-9]{3} -> \d{3}

You could also use lookaheads and lookbehinds to fetch your 3 digits in one pass, removing the need to grep for the first 3 characters afterwards (the grep '^...' part), but I haven't looked up the exact syntax for that in grep.
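For what it's worth, a one-pass version might look roughly like the sketch below. It assumes GNU grep built with PCRE support (the -P flag), and last_three is just a hypothetical helper name, not something from the question.

```shell
# Sketch only: assumes GNU grep with PCRE support (-P).
# last_three is a hypothetical helper name.
last_three() {
    # -o prints only the matched text.
    # (?=\D*$) is a lookahead: it requires that nothing but non-digits
    # remains between the match and the end of the line, without
    # including those characters in the match itself.
    printf '%s\n' "$1" | grep -oP '\d{3}(?=\D*$)'
}

last_three 234_blablabla_lololol_343999_blablabla_abc.txt   # prints 999
last_three 023-124.grf                                      # prints 124
```

Because the lookahead is not part of the match, no second grep is needed to trim the result.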

Sadly, I would have to think a lot about how to generalize the above definition of "the last 3 consecutive digits that are not followed by a digit" into simply "the last 3 consecutive digits". As it stands, the regular expression above does not match file names where the last run of 3 digits is followed by a digit anywhere later in the name, such as "blabla_12_blabla_123_blabla_56.ext". I am optimistic that your naming convention does not allow that.
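If that more general case ever does come up, one possible approach is to lean on greedy matching rather than on "no digit afterwards". The sketch below uses sed instead of grep, and last_three_anywhere is a hypothetical name. Note one assumption about what "last 3 consecutive digits" means here: if the final run is longer than 3 digits, this returns the last three of that run.

```shell
# Sketch only: last_three_anywhere is a hypothetical helper name.
last_three_anywhere() {
    # The leading .* is greedy, so it swallows as much as it can while
    # still leaving three consecutive digits for the group to capture.
    # That puts the capture on the last such triple in the line.
    # -n plus the p flag means nothing is printed when there is no match.
    printf '%s\n' "$1" | sed -n 's/.*\([0-9]\{3\}\).*/\1/p'
}

last_three_anywhere blabla_12_blabla_123_blabla_56.ext   # prints 123
last_three_anywhere 023-124.grf                          # prints 124
```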

CodePudding user response：

You can use bash primitives to separate out the desired portion of the name. There's probably a slicker way to get the binary conversion of the decimal number, but I like dc:

$ name=023-124.grf
$ base=${name%.*}
$ echo "$base"
023-124
$ suffix=${base##*-}
$ echo "$suffix"
124
$ echo "$suffix" 2 o p | dc
1111100
$ new_name="${base%%-*}-$(echo "$suffix" 2 o p | dc).${name##*.}"
$ echo "$new_name"
023-1111100.grf
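
If dc is not available, the conversion can probably be done with shell arithmetic alone. The following is a minimal sketch, not a definitive implementation: to_binary and rename_suffix_to_binary are hypothetical names, and it assumes every file name has the form PREFIX-DIGITS.EXT, as in the question.

```shell
# Sketch only: both function names are hypothetical.
to_binary() {
    local n bits
    # Strip leading zeros first, because shell arithmetic would
    # otherwise read a value such as 023 as octal.
    n=${1#"${1%%[!0]*}"}
    n=${n:-0}
    bits=
    # Repeated division by 2; each remainder is the next bit,
    # collected from least significant to most significant.
    while [ "$n" -gt 0 ]; do
        bits=$(( n % 2 ))$bits
        n=$(( n / 2 ))
    done
    printf '%s\n' "${bits:-0}"
}

rename_suffix_to_binary() {
    local name base ext
    name=$1
    base=${name%.*}      # drop the extension:  023-124
    ext=${name##*.}      # keep the extension:  grf
    # ${base%%-*} is the part before the dash, ${base##*-} the part after.
    printf '%s-%s.%s\n' "${base%%-*}" "$(to_binary "${base##*-}")" "$ext"
}

rename_suffix_to_binary 023-124.grf   # prints 023-1111100.grf
```

The function only prints the new name; passing its output to mv would be the step that actually renames a file.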