How to search for a specific term within XML and extract a value from the matching tag

Time:06-12

I am trying to figure out how to first search for a term within a specific tag (the article tag) and then retrieve a value from another tag inside that same article tag.

I can already retrieve the value from a specific tag (see the xmllint command below the sample XML):

<article>
    <author>
        <name>Example Name 1</name>
        <title>example title 2</title>
    </author>
    <title>article title 1</title>
    <publicationDate>2022-02-12</publicationDate>
    <text>blah1 blah1 blah1</text>
    <reference>10000</reference>
</article>
<article>
    <author>
        <name>Example Name 2</name>
        <title>example title 2</title>
    </author>
    <title>article title 1</title>
    <publicationDate>2022-02-13</publicationDate>
    <text>blah1 blah1 blah1</text>
    <reference>10001</reference>
</article>

xmllint --xpath "string(//title)" file.xml

But how can I search first and then retrieve the value from within the matching article tag? The reference number will be different each time, and I need to extract the value from the article with that specific reference.

Thank you for your help

CodePudding user response:

If I understand your intention correctly, you should be able to parameterize your XPath search string with a bash variable that holds the reference number you are interested in. Note that I modified your example XML so that it is wrapped in <articles> tags, so you will need to adjust the XPath to match your own XML structure.

Script contents:

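#!/bin/bash

# Reference number to look up and XML file to read; both have defaults.
ref_no=${1:-10001}
src_xml=${2:-/tmp/foo/s.xml}

# Select the <title> child of the <article> whose <reference> equals ref_no.
title=$(xmllint --xpath "string(/articles/article[reference=${ref_no}]/title)" "${src_xml}")
printf "Reference: %s, Title: %s\n" "${ref_no}" "${title}"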

Output:

$ ./script 10000
Reference: 10000, Title: article title 1

$ ./script 10001
Reference: 10001, Title: article title 2
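
If you need more than the title, the same idea can be generalized. The following is only a sketch: the function name article_field and its argument order are made up for illustration, and it assumes the same <articles> wrapper as the script above. Because the arguments are pasted into the XPath string, it also checks that the reference is purely numeric and that the field looks like a plain element name.

```shell
#!/bin/bash

# Hypothetical helper: print one child element of the article whose
# <reference> equals the given number. Assumes an <articles> root element.
article_field() {
    local ref_no=$1 field=$2 src_xml=$3

    # Reject anything that is not a plain number or a plain element name,
    # so the arguments cannot change the meaning of the XPath expression.
    [[ $ref_no =~ ^[0-9]+$ ]] || { echo "bad reference: $ref_no" >&2; return 2; }
    [[ $field =~ ^[A-Za-z_][A-Za-z0-9_.-]*$ ]] || { echo "bad field: $field" >&2; return 2; }

    xmllint --xpath "string(/articles/article[reference=${ref_no}]/${field})" "${src_xml}"
}
```

For example, article_field 10001 publicationDate /tmp/foo/s.xml should print the publication date of the second article in the test XML.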

For clarity, here is the test XML that I used (note the added <articles> root element):

<articles>
<article>
    <author>
        <name>Example Name 1</name>
        <title>example title 2</title>
    </author>
    <title>article title 1</title>
    <publicationDate>2022-02-12</publicationDate>
    <text>blah1 blah1 blah1</text>
    <reference>10000</reference>
</article>
<article>
    <author>
        <name>Example Name 2</name>
        <title>example title 2</title>
    </author>
    <title>article title 2</title>
    <publicationDate>2022-02-13</publicationDate>
    <text>blah1 blah1 blah1</text>
    <reference>10001</reference>
</article>
</articles>
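
If your real file has no enclosing root element (as in the XML you posted), xmllint will most likely refuse to parse it, because a well-formed document needs a single root. One possible workaround, sketched here under the assumption that the file contains only bare <article> elements with no XML declaration or DOCTYPE at the top, is to add a temporary wrapper on the fly and let xmllint read from standard input. The function name title_from_fragments is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: look up a title in a file that holds bare <article>
# elements with no enclosing root element.
title_from_fragments() {
    local ref_no=$1 src_xml=$2

    # Add a temporary <articles> root and parse the result from stdin ("-").
    # Assumes the file has no XML declaration or DOCTYPE at the top.
    { echo '<articles>'; cat "${src_xml}"; echo '</articles>'; } |
        xmllint --xpath "string(/articles/article[reference=${ref_no}]/title)" -
}
```

This leaves the original file untouched; the XPath stays the same as in the script above because the wrapper uses the same root name.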