I am trying to figure out, first search the term within the specific tag (article tag) and then retrieve the value from that specific tag within the article tag.
I can retrieve the value from a specific tag,
<article>
<author>
<name>Example Name 1</name>
<title>example title 2</title>
</author>
<title>article title 1</title>
<publicationDate>2022-02-12</publicationDate>
<text>blah1 blah1 blah1</text>
<reference>10000</reference>
</article>
<article>
<author>
<name>Example Name 2</name>
<title>example title 2</title>
</author>
<title>article title 1</title>
<publicationDate>2022-02-13</publicationDate>
<text>blah1 blah1 blah1</text>
<reference>10001</reference>
</article>
xmllint --xpath "string(//title)" file.xml
But how can I search and then retrieve the value within the article tags? It will be each time a different reference number, then I need to extract the value from that specific reference.
Thank you for your help
CodePudding user response:
If I understand your intention correctly, you should be able to parameterize your xpath search string using a bash variable containing the reference number that you are interested in. Note, that I modified your example XML to be wrapped in tags, so you will need to modify the xpath per your XML structure.
Script contents:
#!/bin/bash
ref_no=${1:-10001}
src_xml=${2:-/tmp/foo/s.xml}
title=$(xmllint --xpath "string(/articles/article[reference=${ref_no}]/title)" "${src_xml}")
printf "Reference: %s, Title: %s\n" "${ref_no}" "${title}"
Output:
$ ./script 10000
Reference: 10000, Title: article title 1
$ ./script 10001
Reference: 10001, Title: article title 2
For clarity, here is the test XML that I utilized:
<articles>
<article>
<author>
<name>Example Name 1</name>
<title>example title 2</title>
</author>
<title>article title 1</title>
<publicationDate>2022-02-12</publicationDate>
<text>blah1 blah1 blah1</text>
<reference>10000</reference>
</article>
<article>
<author>
<name>Example Name 2</name>
<title>example title 2</title>
</author>
<title>article title 2</title>
<publicationDate>2022-02-13</publicationDate>
<text>blah1 blah1 blah1</text>
<reference>10001</reference>
</article>
</articles>