How to grab specific text in a grep/awk/sed command

Time:06-17

I have a command that outputs a dump of information and need to get specific parts of that dump. The output varies in length and format (location of specific text varies in the dump) so I'm unsure what the best way to get it is, guessing some usage of grep/awk/sed but my foo isn't the strongest. Appreciate any help!

Example output might be like this

{
"ExecutionStartDateTime": "2022-06-17T00:56:02.079Z",
"ExecutionElapsedTime": "PT1.58S",
"ExecutionEndDateTime": "2022-06-17T00:56:03.079Z",
"Status": "Success",
"StatusDetails": "Success",
"StandardOutputContent": "UPDATE cfgen SET Value='Some Road, Ina Suburb' WHERE \"Key\"='Location';\nUPDATE cfgen SET Value='1234' WHERE \"Key\"='LocCode';\n● pk.service - Pkg\n   Loaded: loaded (/etc/systemd/system/pk.service; enabled; vendor preset: enabled)\n  Drop-In: /etc/systemd/system/pk.service.d\n           └─override.conf\n   Active: active (running) since Fri 2022-06-17 10:56:03 AEST; 14ms ago\n Main PID: 30738 (pkstart.s)\n    Tasks: 2 (limit: 4915)\n   CGroup: /system.slice/pk.service\n           ├─30738 /bin/bash /hdd/Pkg/pkstart.sh\n           └─30742 ./pkg\n\nJun 17 10:56:03 T_06 systemd[1]: Started Pkg.\nT_06\n",
"StandardOutputUrl": "",
"StandardErrorContent": "",
"StandardErrorUrl": "",
}

All the info comes out after StandardOutputContent where it's updating db tables. Specifically I'd like to grab 'Some Road, Ina Suburb' in this example. Also getting the '1234' LocCode would be a bonus.

I know how to grab an entire line etc but how do I just get the text I need and not everything else?

CodePudding user response:

my-cmd |
jq -r .StandardOutputContent |
grep -Po "(?<=\\bSET Value=')[^']+"

Output:

Some Road, Ina Suburb
1234

Requires GNU grep (for the -P option). It probably also works without jq in the pipeline; check whether the command has a flag for plain-text output instead of JSON.
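Where grep lacks -P, the same extraction can be approximated with a plain sed substitution. This is a minimal sketch, assuming each value is single-quoted and appears at most once per line; the `content` variable is a hypothetical stand-in for the decoded StandardOutputContent text.

```shell
# Sample text standing in for the decoded StandardOutputContent
# (assumption: real input would come from `my-cmd | jq -r .StandardOutputContent`).
content="UPDATE cfgen SET Value='Some Road, Ina Suburb' WHERE \"Key\"='Location';
UPDATE cfgen SET Value='1234' WHERE \"Key\"='LocCode';
Jun 17 10:56:03 T_06 systemd[1]: Started Pkg."

# -n suppresses default output; the p flag prints only lines where the
# substitution matched, reduced to the text captured between the quotes.
values=$(printf '%s\n' "$content" | sed -n "s/.*SET Value='\([^']*\)'.*/\1/p")
printf '%s\n' "$values"
```

Unlike grep -o, this prints one value per matching line, so it would miss a second assignment on the same line.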

There's no super robust way to parse this without using a real SQL query parser. If the format of the query varies more than your example, a more robust command is:

my-cmd |
jq -r '.StandardOutputContent,.StandardOutputContent2' |
grep -Eo '(^|[^[:alnum:]_])SET[[:space:]]+Value[[:space:]]*=[[:space:]]*(["'\''][^"'\'']+["'\'']|[^[:space:]]+)' |
sed -E 's/[^=]+=[[:space:]]*["'\'']?//; s/["'\'']$//'

This handles whitespace between SET, Value and =, and single, double or no quotes around the value.
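A quick way to check a pipeline like this is to feed it a few lines covering each variant. The sketch below uses made-up sample statements (only the first comes from the question) and a hypothetical `sample` variable in place of the jq output. The last line uses RESET to show that the word-boundary guard before SET rejects it.

```shell
# Hypothetical sample lines: single quotes, extra spaces with double quotes,
# no quotes, and a non-matching RESET that should be skipped.
sample=$(cat <<'EOF'
UPDATE cfgen SET Value='Some Road, Ina Suburb' WHERE "Key"='Location';
UPDATE cfgen SET  Value = "1234" WHERE "Key"='LocCode';
UPDATE cfgen SET Value=42 WHERE "Key"='Count';
UPDATE cfgen RESET Value='ignored' WHERE "Key"='Other';
EOF
)

# grep -Eo keeps only the "SET Value = ..." fragment; sed then strips
# everything up to the opening quote and removes the closing quote.
values=$(printf '%s\n' "$sample" |
  grep -Eo '(^|[^[:alnum:]_])SET[[:space:]]+Value[[:space:]]*=[[:space:]]*(["'\''][^"'\'']+["'\'']|[^[:space:]]+)' |
  sed -E 's/[^=]+=[[:space:]]*["'\'']?//; s/["'\'']$//')
printf '%s\n' "$values"
```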

CodePudding user response:

Since your content is JSON, you can use jq.

I used hjson combined with jq because your output has an extra comma after the last entry, which strict JSON does not allow:

"StandardErrorUrl": "",

Here is an example:

nabil@DESKTOP-8ECTID4:~$ hjson -j stackov.json  | jq .StandardOutputContent
"UPDATE cfgen SET Value='Some Road, Ina Suburb' WHERE \"Key\"='Location';\nUPDATE cfgen SET Value='1234' WHERE \"Key\"='LocCode';\n● pk.service - Pkg\n   Loaded: loaded (/etc/systemd/system/pk.service; enabled; vendor preset: enabled)\n  Drop-In: /etc/systemd/system/pk.service.d\n           └─override.conf\n   Active: active (running) since Fri 2022-06-17 10:56:03 AEST; 14ms ago\n Main PID: 30738 (pkstart.s)\n    Tasks: 2 (limit: 4915)\n   CGroup: /system.slice/pk.service\n           ├─30738 /bin/bash /hdd/Pkg/pkstart.sh\n           └─30742 ./pkg\n\nJun 17 10:56:03 T_06 systemd[1]: Started Pkg.\nT_06\n"
nabil@DESKTOP-8ECTID4:~$
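The jq output above is still a JSON-quoted string. With jq -r the embedded \n sequences become real newlines, and a specific value can then be selected by its key name. This is a minimal sketch, assuming the statements keep the exact shape shown in the question and the key name contains no regex metacharacters or slashes. `get_value` is a hypothetical helper, and `content` stands in for the `jq -r` output.

```shell
# Stand-in for: hjson -j stackov.json | jq -r .StandardOutputContent
content="UPDATE cfgen SET Value='Some Road, Ina Suburb' WHERE \"Key\"='Location';
UPDATE cfgen SET Value='1234' WHERE \"Key\"='LocCode';
T_06"

# get_value KEY: read text on stdin and print the Value assigned in the
# UPDATE statement whose WHERE clause names KEY (hypothetical helper).
get_value() {
  sed -n "s/.*SET Value='\([^']*\)' WHERE \"Key\"='$1';.*/\1/p"
}

location=$(printf '%s\n' "$content" | get_value Location)
loccode=$(printf '%s\n' "$content" | get_value LocCode)
printf 'Location: %s\nLocCode: %s\n' "$location" "$loccode"
```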