I'm writing a bash script in which I use grep with a regular expression to extract an id that I will store in a variable.
The goal is to extract all characters after the last /, but the characters ' and } should be ignored.
file.txt:
{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}
command:
cat file.txt | grep -oP "[/] ^"
The current command isn't working.
desired output:
B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=
CodePudding user response:
The regex you gave was: [/] ^
It has a few mistakes:
- Your use of ^ at the end seems to imply you think you can ask the software to search backwards. You can't; ^ only anchors a match to the start of the line.
- [/] matches only the slash character.
Your sample shows what appears to be a malformed JSON object containing a key-value pair, each enclosed in single-quotes. JSON requires double-quotes so perhaps it is not JSON.
If several assumptions are made, it is possible to extract the section of the input that you seem to want:
- file contains a single line; and
- key and value are strings surrounded by single-quote; and
- either:
  - the value part is immediately followed by }; or
  - the name part cannot contain /
You are using the -P option to grep, so lookaround operators are available.
(?<=/)[^/]+(?=')
- lookbehind declares match is preceded by /
- one or more non-slash (the match)
- lookahead declares match is followed by '
[^/]+(?='})
- one or more non-slash (the match)
- lookahead declares match is followed by ' then }
Note that the match begins as early in the line as possible and, with greedy +, it is as long as possible.
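As a minimal sketch of how either pattern could fill a variable: this assumes GNU grep built with PCRE support (-P) and a single-line file.txt like the sample. The variable names id1 and id2 and the scratch directory are illustrative only.

```shell
#!/usr/bin/env bash
# Recreate the sample input in a scratch directory (illustrative setup).
tmpdir="$(mktemp -d)"
cat > "$tmpdir/file.txt" <<'EOF'
{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}
EOF

# Pattern 1: preceded by a slash, followed by a single quote.
id1="$(grep -oP "(?<=/)[^/]+(?=')" "$tmpdir/file.txt")"

# Pattern 2: followed by a single quote and then a closing brace.
id2="$(grep -oP "[^/]+(?='})" "$tmpdir/file.txt")"

printf '%s\n' "$id1" "$id2"
rm -r "$tmpdir"
```

Passing the file name to grep directly also avoids the extra cat process from the question.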
CodePudding user response:
Using any awk:
$ awk -F"[/']" '{print $(NF-1)}' file.txt
B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=
CodePudding user response:
Basic parameter expansion.
$: x="$(<file.txt)" # file contents in x
$: x="${x##*/}" # strip everything through the last / to get rid of 'name' and the path
$: x="${x//[^[:alnum:]_=]}" # strip anything that is not alphanumeric, _ or = to clean the end
$: echo "$x"
B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=
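The same approach could be wrapped in a small function. This is only a sketch: the name extract_id is hypothetical, and it assumes the id is everything between the last / and the next single quote.

```shell
#!/usr/bin/env bash
# extract_id is a hypothetical helper name, not part of any library.
# Assumption: the id is everything after the last "/" and before the next "'".
extract_id() {
  local s=$1
  s=${s##*/}      # drop everything through the last /
  s=${s%%"'"*}    # drop the first single quote and everything after it
  printf '%s\n' "$s"
}

line="{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}"
id="$(extract_id "$line")"
echo "$id"
```

Cutting at the quote rather than filtering by an allowed character set keeps any other characters that might appear in an id, such as - or +.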