Bash to Grep and Exclude At the Same Time Using Regex

Time:03-17

I want to grep a string from command output and remove parts of it at the same time. For instance:

String = Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago

What I want = active (running) since 20:02:20 PKT

Parts removed:

  1. Active:
  2. Sat 2022-03-12
  3. ; 1h 31min ago

To do that, I initially tried this regular expression:

sudo service sshd status |grep -Po '(?<=Active: )(.*) since (.*);'

active (running) since Mon 2022-03-14 01:06:43 PKT;

How can I drop the date and the trailing semicolon (;) while keeping only the time, so that the output is exactly:

active (running) since 01:06:43 PKT

Thanks

CodePudding user response:

You can use

sed -nE 's/^Active: +(.* since ).*([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*).*/\1\2/p'

Details:

  • -nE - n suppresses default line output and E enables the POSIX ERE regex syntax
  • ^Active: +(.* since ).*([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*).* - finds lines matching
    • ^Active: + - start of string, Active: and one or more spaces
    • (.* since ) - Group 1 (\1): any text and then space since space
    • .* - any text
    • ([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*) - Group 2 (\2): two digits, :, two digits, :, two digits, and then any zero or more chars other than ;
    • .* - the rest of the string
  • \1\2 - concatenated Group 1 and 2 values
  • p - prints the result of the substitution.

See the online demo:

#!/bin/bash
s='Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago'
sed -nE 's/^Active: +(.* since ).*([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*).*/\1\2/p' <<< "$s"

Output:

active (running) since 20:02:20 PKT
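
One caveat: the ^Active: anchor only matches when the line starts at column one. Status output from systemctl usually indents the Active: line, so a slightly more tolerant variant may be needed in practice. The following is a minimal sketch under that assumption; the function name active_since and the indented sample line are illustrative, not part of the original answer.

```shell
#!/bin/bash
# Hypothetical helper: reads status text on stdin and prints
# "<state> since HH:MM:SS TZ" for the Active: line, if there is one.
active_since() {
  # ^[[:space:]]* tolerates leading indentation before "Active:".
  sed -nE 's/^[[:space:]]*Active:[[:space:]]+(.* since ).*([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*).*/\1\2/p'
}

# Sample line, indented the way systemctl typically prints it (assumed).
sample='     Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago'
result=$(active_since <<< "$sample")
echo "$result"
```

With real output this would be used as sudo service sshd status | active_since.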

CodePudding user response:

Append this to your command to use space and ; as field separators (this assumes the line starts with Active:, as in your sample string):

| awk 'BEGIN{ FS="[ ;]" } { print $2,$3,$4,$7,$8 }'

Output:

active (running) since 20:02:20 PKT
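
To see why fields 2, 3, 4, 7 and 8 are the ones printed, it can help to list every field with its index. This is only an inspection sketch using the sample string from the question; the variable names s and fields are illustrative.

```shell
#!/bin/bash
s='Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago'

# With FS="[ ;]" every single space or semicolon ends a field,
# so the "; " after PKT produces one empty field.
fields=$(awk 'BEGIN{ FS="[ ;]" } { for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }' <<< "$s")
echo "$fields"
```

Because each separator character counts individually, any leading indentation or extra spaces in the input shift the field numbers, so the indices should be re-checked against the actual command output.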

CodePudding user response:

With the samples shown, please try the following awk code, written and tested in GNU awk. In short: a shell variable named val holds the line and is piped to awk, where the match function applies a regex to capture the required part, and two sub calls then strip the Active: prefix and the day and date.

val="Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago"
echo "$val"  | 
awk '
match($0,/^Active:[[:space:]]+active \(running\)[[:space:]]+.*[0-9]{4}(-[0-9]{2}){2}[[:space:]]+([0-9]{2}:){2}[0-9]{2}[^;]*/){
  val=substr($0,RSTART,RLENGTH)
  sub(/^Active:[[:space:]]+/,"",val)
  sub(/since[[:space:]]+\S+\s+\S+/,"since",val)
  print val
}
'

Explanation of regex:

^Active:[[:space:]]+         ##Matches Active: at the start of the line, followed by one or more whitespace characters.
active \(running\)           ##Matches active, a space, then the literal (running).
[[:space:]]+.*[0-9]{4}       ##Matches one or more whitespace characters, then a greedy match up to 4 digits (the year).
(-[0-9]{2}){2}               ##Matches - followed by 2 digits, twice (month and day).
[[:space:]]+([0-9]{2}:){2}   ##Matches whitespace, then 2 digits followed by a colon, twice (hours and minutes).
[0-9]{2}[^;]*                ##Matches 2 digits (seconds) and everything after them up to, but not including, the next semicolon.
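
The same match-and-capture idea can also be sketched without awk, using bash's built-in [[ =~ ]] operator and the BASH_REMATCH array. This is a different technique from the answer above, shown only as an alternative under the assumption that the line has the shape of the sample; the variable names re and out are illustrative.

```shell
#!/bin/bash
val='Active: active (running) since Sat 2022-03-12 20:02:20 PKT; 1h 31min ago'

# Group 1: the state text up to and including "since".
# Group 2: HH:MM:SS plus everything up to (not including) the semicolon.
re='^[[:space:]]*Active:[[:space:]]+(.*[[:space:]]since)[[:space:]]+.*[[:space:]]([0-9]{2}:[0-9]{2}:[0-9]{2}[^;]*)'

if [[ $val =~ $re ]]; then
  # Join the two captured groups, skipping the day and date between them.
  out="${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
  echo "$out"
fi
```

Keeping the pattern in a variable (re) avoids the quoting pitfalls of writing a regex inline on the right-hand side of =~.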