Grep only the numbers from Linux shell

I have curl output as shown below, and I need to grep only the numbers from it.

Curl Output

<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">

Grep Command

 grep -o -i 'Max Memory:.*'  | awk  '{ print $3 }'

Output

MB</p><table

Expected Output : 3072.00

Similarly for Free Memory and Total Memory.

Please help

CodePudding user response：

1st solution: With the samples you have shown, try the following with GNU grep.

grep -oP 'Total Memory:\K\S+' Input_file

Or, if you want to match only the digits of the memory value (with an optional decimal part), try the following:

grep -oP 'Total Memory:\K\d+(?:\.\d+)?(?=\s)' Input_file

Explanation: GNU grep's -o option prints only the matched text, and -P enables the PCRE regex flavor. The regex first matches the literal text Total Memory:, then \K discards everything matched so far, so the label is left out of the output. Finally, \S+ matches one or more non-space characters and stops at the first space, which captures the value for Total Memory.
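
The \K technique described above can be wrapped in a small helper so that one pattern serves all three labels. This is only a minimal sketch, assuming GNU grep built with PCRE support (-P); the function name mem_value and the variable sample are made up for illustration and are not part of the original answer.

```shell
# Sample line copied from the question.
sample='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# mem_value LABEL: read text on stdin and print the number that follows
# "LABEL Memory:". Hypothetical helper; needs GNU grep with -P.
mem_value() {
  grep -oP "$1 Memory:\\K\\d+(?:\\.\\d+)?"
}

free=$(printf '%s\n' "$sample" | mem_value Free)
total=$(printf '%s\n' "$sample" | mem_value Total)
max=$(printf '%s\n' "$sample" | mem_value Max)
echo "free=$free total=$total max=$max"
```

The label is pasted into the regex as-is, so this assumes it contains no regex metacharacters.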

2nd solution: If you want all three values in the output (Free, Total and Max memory), try the following awk code. Written and tested in GNU awk.

awk -v RS='(Free Memory:|Total Memory:|Max Memory:)[^[:space:]]+' 'RT{sub(/.*:/,"",RT);print RT}' Input_file

NOTE: If your output is not in a file (Input_file), pipe the output of your previous command into this one and drop the file name.

CodePudding user response：

Here is another GNU grep command that gets all the memory numbers in one command:

s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'
grep -oP '\w+ Memory:\K[\d.]+' <<< "$s"

2144.78
3072.00
3072.00
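
The command above prints bare numbers. If the labels are needed as well, one option is to leave out \K so that each match keeps its label, and then split on the colon. A minimal sketch, assuming GNU grep with -P; the variable names sample and report are made up for illustration.

```shell
# Sample line copied from the question.
sample='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# Without \K each match keeps its label, e.g. "Free Memory:2144.78".
# IFS=: then splits every match into a label part and a value part.
report=$(printf '%s\n' "$sample" |
  grep -oP '\w+ Memory:[\d.]+' |
  while IFS=: read -r label value; do
    echo "$label = $value"
  done)
echo "$report"
```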

CodePudding user response：

You can also use sed to extract all the necessary details from the string:

fm=$(sed -n 's/.*Free Memory:\([^ ]*\).*/\1/p' file)
tm=$(sed -n 's/.*Total Memory:\([^ ]*\).*/\1/p' file)
mm=$(sed -n 's/.*Max Memory:\([^ ]*\).*/\1/p' file)

See the online demo:

#!/bin/bash
s='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'
fm=$(sed -n 's/.*Free Memory:\([^ ]*\).*/\1/p' <<< "$s")
tm=$(sed -n 's/.*Total Memory:\([^ ]*\).*/\1/p' <<< "$s")
mm=$(sed -n 's/.*Max Memory:\([^ ]*\).*/\1/p' <<< "$s")
echo "fm=$fm, tm=$tm, mm=$mm"
# => fm=2144.78, tm=3072.00, mm=3072.00

Details:

-n suppresses default line output
.*Free Memory:\([^ ]*\).* - matches the whole line, and consists of:
- .* - any zero or more chars
- Free Memory: - a fixed string
- \([^ ]*\) - Group 1 (\1): any zero or more non-space chars
- .* - any zero or more chars
/\1/ - replaces the line matched with Group 1 value
p - prints the result of the successful substitution.
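
Since the three sed commands differ only in the label, the pattern broken down above can be sketched as one small function. The name mem_field is hypothetical, and the sketch assumes the label contains no / or regex metacharacters, because it is inserted into the sed expression unescaped.

```shell
# Sample line copied from the question.
sample='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# mem_field LABEL: read text on stdin and print the non-space text that
# follows "LABEL Memory:" (hypothetical helper around the sed command above).
mem_field() {
  sed -n "s/.*$1 Memory:\\([^ ]*\\).*/\\1/p"
}

fm=$(printf '%s\n' "$sample" | mem_field Free)
tm=$(printf '%s\n' "$sample" | mem_field Total)
mm=$(printf '%s\n' "$sample" | mem_field Max)
echo "fm=$fm, tm=$tm, mm=$mm"
```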

CodePudding user response：

Try this:

grep -o -i 'Max Memory:.*'  | cut -d ':' -f 2 |awk  '{ print $1 }'
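
To see why this pipeline works where print $3 did not, it can help to look at each stage separately. A small sketch using the sample line from the question; the variable names stage1, stage2 and stage3 are made up for illustration.

```shell
# Sample line copied from the question.
sample='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# Stage 1: keep everything from "Max Memory:" to the end of the line.
stage1=$(printf '%s\n' "$sample" | grep -o -i 'Max Memory:.*')
# Stage 2: keep only the text after the first colon.
stage2=$(printf '%s\n' "$stage1" | cut -d ':' -f 2)
# Stage 3: keep the first whitespace-separated word, which is the number.
stage3=$(printf '%s\n' "$stage2" | awk '{ print $1 }')

printf 'after grep: %s\nafter cut:  %s\nafter awk:  %s\n' "$stage1" "$stage2" "$stage3"
```

In the command from the question, awk splits the stage 1 text on whitespace, so $3 is MB</p><table and the number is stuck to the label in $2 (Memory:3072.00).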

CodePudding user response：

I would use GNU AWK for this task in the following way. Let the content of file.txt be

<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">

then

awk 'BEGIN{FPAT="[0-9]+[.][0-9]+"}{print $1,$2,$3}' file.txt

output

2144.78 3072.00 3072.00

Explanation: I told GNU AWK, via FPAT, that a field is one or more digits, followed by a literal ., followed by one or more digits. I then print the 1st, 2nd and 3rd fields. Disclaimer: I assume that you are interested only in numbers that contain a single . character. Note that the 0 in <table border="0"> is not detected. Feel free to adjust to your needs.

(tested in gawk 4.2.1)
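
FPAT is specific to GNU AWK. Where gawk is not installed, a similar result can be sketched with the match() function that POSIX awk provides. This swaps FPAT for an explicit loop, so it is a different technique from the one in the answer above, and the variable names sample, numbers, rest and sep are made up for illustration.

```shell
# Sample line copied from the question.
sample='<h1>JVM</h1><p>Free Memory:2144.78 MB Total Memory:3072.00 MB Max Memory:3072.00 MB</p><table border="0">'

# For each input line, repeatedly find the next "digits.digits" run,
# print it, and continue searching in the text after it.
numbers=$(printf '%s\n' "$sample" | awk '{
  rest = $0
  sep = ""
  while (match(rest, /[0-9]+[.][0-9]+/)) {
    printf "%s%s", sep, substr(rest, RSTART, RLENGTH)
    sep = " "
    rest = substr(rest, RSTART + RLENGTH)
  }
  print ""
}')
echo "$numbers"
```

As with the FPAT version, the 0 in border="0" is skipped because the pattern requires a decimal point.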