How to use sed to extract the specific substring?

Time:04-14

<div id="current-conditions-body">
    <!-- Graphic and temperatures -->
    <div id="current_conditions-summary"  >
                    <img src="newimages/large/sct.png" alt=""  />
                    <p >Partly Cloudy</p>
        <p >64&deg;F</p>
        <p >18&deg;C</p>

I am trying to extract the "64" on line 6. I thought of using awk '/<p >/{print}', but that only gave me the full line. I think I need to use sed, but I don't know how to use it.

CodePudding user response:

Assumptions:

  • input is nicely formatted as per the sample provided by OP so we can use some 'simple' pattern matching

Modifying OP's current awk code:

# use split() function to break line using dual delimiters ">" and "&"; print 2nd array entry

awk '/<p >/{ n=split($0,arr,"[>&]");print arr[2]}'

# define dual input field delimiter as ">" and "&"; print 2nd field in line that matches search string

awk -F'[>&]' ' /<p >/{print $2}'

For the 64&deg;F line, both of these generate:

64
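
Against the sample exactly as shown, /<p >/ also matches the "Partly Cloudy" and "18&deg;C" lines, so the match may need tightening. A minimal sketch, assuming the Fahrenheit value is always digits followed by &deg;F (the variable names sample and fahrenheit are only for illustration, and the leading "<" on the first tag is assumed):

```shell
#!/bin/sh
# Sample input reproduced from the question (leading "<" on the first tag assumed).
sample='<div id="current-conditions-body">
    <!-- Graphic and temperatures -->
    <div id="current_conditions-summary"  >
                    <img src="newimages/large/sct.png" alt=""  />
                    <p >Partly Cloudy</p>
        <p >64&deg;F</p>
        <p >18&deg;C</p>'

# Select only lines of the form <p >DIGITS&deg;F, then split on ">" and "&"
# and print the 2nd field, which is the number itself.
fahrenheit=$(printf '%s\n' "$sample" | awk -F'[>&]' '/<p >[0-9]+&deg;F/ {print $2}')

echo "$fahrenheit"
```

Swapping the trailing F for C in the pattern would select the Celsius line instead.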

One sed idea:

sed -En 's/.*<p >([^&]+)&deg.*/\1/p'

For the 64&deg;F line, this generates:

64
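
Because the sed pattern only requires &deg after the captured text, it matches every temperature line in the sample, not just the Fahrenheit one. A small sketch of the difference, assuming a sed that supports -E (GNU and BSD sed both do); the variable names lines, both and fahrenheit are illustrative only:

```shell
#!/bin/sh
# The three <p > lines from the question's sample.
lines='                    <p >Partly Cloudy</p>
        <p >64&deg;F</p>
        <p >18&deg;C</p>'

# Unanchored: captures the text before "&deg" on every line that has it.
both=$(printf '%s\n' "$lines" | sed -En 's/.*<p >([^&]+)&deg.*/\1/p')

# Anchored on "&deg;F": only the Fahrenheit line matches.
fahrenheit=$(printf '%s\n' "$lines" | sed -En 's/.*<p >([^&]+)&deg;F.*/\1/p')

echo "both:"
echo "$both"
echo "fahrenheit: $fahrenheit"
```

The -n option suppresses sed's default output and the trailing p flag prints only the lines where the substitution succeeded, which is why the "Partly Cloudy" line produces nothing in either case.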