Extract text from HTML file in bash

Time:09-22

I've got a large HTML page that contains this code line:

<span id="product542" price-amount="78.49" class="classname"><span class="price">78,49&nbsp;€</span></span>

I would like to extract the value "78.49" in bash. This is what I'm trying:

filename='./page.html'
curl link > $filename
number=$(xpath -q -e '//span[@id="product542"][1]/text()' $filename)
echo $number

but this returns a syntax error on the HTML file. How can I do this? Is there an alternative way?

CodePudding user response:

You can do it with grep (this needs a grep with PCRE support for -P, such as GNU grep): grep -Po "(?<=price-amount=\")[^\"]+" page.html
Bash:

filename='./page.html'
curl link > "$filename"
number=$(grep -Po "(?<=price-amount=\")[^\"]+" "$filename")
echo "$number"
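
On a large page there may be more than one price-amount attribute, and the grep above prints all of them. A minimal sketch of one way to narrow it down to a single product, assuming GNU grep with -P and assuming the id attribute comes before price-amount inside the tag (as in the question). The sample file and the function name extract_price are made up for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical sample standing in for the downloaded page.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html><body>
<span id="product541" price-amount="12.00" class="classname"><span class="price">12,00&nbsp;€</span></span>
<span id="product542" price-amount="78.49" class="classname"><span class="price">78,49&nbsp;€</span></span>
</body></html>
EOF

# extract_price ID FILE: print the price-amount of the tag carrying that id.
# [^>]* keeps the match inside one tag; \K drops everything matched so far,
# so only the attribute value is printed. head keeps the first hit only.
extract_price() {
  grep -oP "id=\"$1\"[^>]*\bprice-amount=\"\K[^\"]+" "$2" | head -n 1
}

number=$(extract_price product542 "$sample")
echo "$number"
```

For the sample above this prints 78.49. It is still a regex over HTML, so it will break if the attribute order or quoting on the real page differs.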

CodePudding user response:

Using bash

IFS="= " read -ra price < input_file
echo "${price[4]//\"}"

Using sed

sed 's/"//g;s/.*amount=\([0-9.]*\).*/\1/' input_file

Using awk

awk -F"[= ]" '{gsub(/"/,""); print $5}' input_file

Output

78.49
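
Note that the three commands above assume input_file contains only the span line: the read version looks at the first line only, and the sed and awk versions print something for every line. For a whole page, here is a rough sketch in pure bash that scans line by line and stops at the first match. The sample file and the variable names are hypothetical, and it assumes id comes before price-amount in the tag:

```bash
#!/usr/bin/env bash
# Hypothetical sample standing in for the full page.
page=$(mktemp)
cat > "$page" <<'EOF'
<html><head><title>Shop</title></head><body>
<span id="product542" price-amount="78.49" class="classname"><span class="price">78,49&nbsp;€</span></span>
</body></html>
EOF

# Keep the regex in a variable so the double quotes inside it
# need no escaping on the right-hand side of =~.
re='id="product542"[^>]*price-amount="([0-9.]+)"'

price=""
while IFS= read -r line; do
  if [[ $line =~ $re ]]; then
    price=${BASH_REMATCH[1]}   # first capture group: the number
    break
  fi
done < "$page"

echo "$price"
```

This needs no external tools, at the cost of being slower than grep or sed on very large files.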