Extract two values from website source code

Time:10-25

I'm writing a bash script to extract two values, "vendor" and "product", from the source code of a CVEdetails page, and I want to store each one in its own bash variable, i.e. vendor=$(requested code) and product=$(requested code).

The code snippet that contains the information I need is:

                    <tr>
                        <th>
                            Vendor
                        </th>
                        <th>
                            Product
                        </th>
                        <th>
                            Vulnerable Versions
                        </th>
                    </tr>
                                                <tr>
                                <td>
                                    <a href="/vendor/45/Apache.html" title="Details for Apache">Apache</a>                              </td>
                                <td><a href="/product/66/Apache-Http-Server.html?vendor_id=45" title="Product Details Apache Http Server">Http Server</a></td>
                                <td class="num">
                                     34                             </td>
                            </tr>

                                        </table>

From this, the information I need is Vendor=Apache and Product=Http Server, but the closest I was able to get by myself is:

wget https://www.cvedetails.com/cve/CVE-2017-3169 &>/dev/null; grep -C 6 "Vulnerable Versions" CVE-2017-3169

Any idea how to get this info? Thanks in advance!

CodePudding user response:

Here is an example of how simple this becomes when you use an API and an appropriate parser:

#!/usr/bin/env bash

API_URL='https://cve.circl.lu/api'

cve_id='CVE-2017-3169'

# Read parsed JSON data
IFS=: read -r _ _ _ vendor product _ < <(
  # Perform API request
  curl -s "$API_URL/cve/$cve_id" |

  # Parse JSON data returned by the API to get only what we need
  jq -r '.vulnerable_product[0]'
)

# Demo what we got
printf 'CVE ID: %s\n' "$cve_id"
printf 'Vendor: %s\n' "${vendor^}"
printf 'Product: %s\n' "${product}"

Sample output:

CVE ID: CVE-2017-3169
Vendor: Apache
Product: http_server
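
The splitting step above can be tried without any network access. The following is only a minimal sketch: the helper name parse_cpe is made up for illustration, and the CPE string is the one the API is shown returning for this CVE further down the thread. It assumes the vendor and product fields contain no escaped colons.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of any tool): split a CPE 2.3 string on ":"
# and keep fields 4 and 5 in the global variables `vendor` and `product`.
parse_cpe() {
  local cpe=$1
  # Layout: cpe:2.3:<part>:<vendor>:<product>:<version>:...
  # The IFS=: assignment only applies to this one `read` command.
  IFS=: read -r _ _ _ vendor product _ <<< "$cpe"
}

parse_cpe 'cpe:2.3:a:apache:http_server:2.2.2:*:*:*:*:*:*:*'

printf 'Vendor: %s\n' "$vendor"    # prints "Vendor: apache"
printf 'Product: %s\n' "$product"  # prints "Product: http_server"
```

Using "_" as a throwaway variable name keeps the unwanted fields out of the way, and the last "_" swallows everything after the product field.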

CodePudding user response:

To process structured data like HTML and JSON you should use an appropriate parser; sed, grep, awk and the like are not. As a command-line tool I can highly recommend xidel, which is both an HTML and a JSON parser!

HTML source

$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
  //table[@id="vulnversconuttable"]//td[position() lt 3]/a
'
Apache
Http Server

([position() = (1,2)] would also work to return the 1st and 2nd <td> nodes.)

$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
  //table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
'
vendor := Apache
product := Http Server

$ xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
  //table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
' --output-format=bash
vendor='Apache'
product='Http Server'

$ eval "$(
  xidel -s https://www.cvedetails.com/cve/CVE-2017-3169 -e '
    //table[@id="vulnversconuttable"]/(vendor:=.//td[1]/a,product:=.//td[2]/a)
  ' --output-format=bash
)"

$ printf '%s\n' "$vendor" "$product"
Apache
Http Server
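
If you would rather not use eval, the plain one-value-per-line output from the first HTML example can also be read straight into variables. This is a rough sketch under one big assumption: fetch_fields is a hypothetical stand-in that prints the same two lines the first xidel command prints, so the snippet runs offline. In a real script you would call xidel there instead.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the xidel call; it prints one value per line,
# in the same order as the first HTML example (vendor, then product).
fetch_fields() {
  printf '%s\n' 'Apache' 'Http Server'
}

# Read line 1 into vendor and line 2 into product. IFS= keeps any
# leading/trailing whitespace, -r keeps backslashes literal.
{
  IFS= read -r vendor
  IFS= read -r product
} < <(fetch_fields)

printf 'vendor=%s\nproduct=%s\n' "$vendor" "$product"
```

This relies on the order of the output lines, so it is more fragile than the named assignments from --output-format=bash, but nothing from the web page is ever executed as shell code.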

JSON API

$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
  $json/(vulnerable_product)(1)
'
cpe:2.3:a:apache:http_server:2.2.2:*:*:*:*:*:*:*

$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
  tokenize($json/(vulnerable_product)(1),":")
'
cpe
2.3
a
apache
http_server
2.2.2
*
*
*
*
*
*
*

$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
  tokenize($json/(vulnerable_product)(1),":")[position() = (4,5)]
'
apache
http_server

$ xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
  let $a:=tokenize($json/(vulnerable_product)(1),":") return (
    vendor:=$a[4],product:=$a[5]
  )
'
vendor := apache
product := http_server

$ eval "$(
  xidel -s "https://cve.circl.lu/api/cve/CVE-2017-3169" -e '
    let $a:=tokenize($json/(vulnerable_product)(1),":") return (
      vendor:=$a[4],product:=$a[5]
    )
  ' --output-format=bash
)"

$ printf '%s\n' "$vendor" "$product"
apache
http_server
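
The API returns lowercase, underscore-separated names, while the question asks for "Apache" and "Http Server" as shown on the HTML page. Assuming that replacing underscores with spaces and capitalising each word is close enough, a small bash helper could bridge the gap. The function name prettify is invented for this sketch, and the ${word^} expansion needs bash 4 or newer.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn "http_server" into "Http Server".
prettify() {
  local words word out=()
  # Split the argument on underscores into an array of words.
  IFS=_ read -r -a words <<< "$1"
  for word in "${words[@]}"; do
    out+=("${word^}")  # ${var^} uppercases the first character (bash 4+)
  done
  # "${out[*]}" joins the words with a space (the first character of IFS).
  printf '%s\n' "${out[*]}"
}

vendor=$(prettify 'apache')
product=$(prettify 'http_server')

printf '%s\n' "$vendor" "$product"
```

Note that this only approximates the display names on CVEdetails; the two sources are not guaranteed to agree for every product.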