I'm working with a huge XML file and I need to get a sample of 500 nodes that are direct children of the root node. I know they are all of the same type. I also need all the children of those 500 nodes.
Is there a way to do this in xmlstarlet?
I'd prefer using this specific package because I'm already using it to do other manipulations of the same file.
I looked through the package's help page but couldn't find a way.
CodePudding user response:
You could try:
xmlstarlet sel -t -c "/root/child[position() <= 500]" file.xml
sel is the standard subcommand for querying XML.
-t is always needed when using sel (it starts the output template).
-c copies whatever the XPath that follows it selects, including each selected node's children.
/root/child is the XPath (replace it with your actual element names, obviously).
[position() <= 500] selects all child nodes whose position (within the root element) is 500 or smaller.
Sometimes, I find that enclosing the path in parentheses makes the selection work:
xmlstarlet sel -t -c "(/root/child)[position() <= 500]" file.xml
but generally, the first method should be enough.
So, given an input of:
<root>
<child>...</child>
<child>...</child>
...
</root>
you would get:
<child>...</child><child>...</child>...
Mind you, this is not well-formed XML, because the output no longer has a single root element.
To separate with newlines, try a variation like:
xmlstarlet sel -t -m "/root/child[position() <= 500]" -c "." -n file.xml
-m just matches the XPath (it doesn't produce output by itself).
-c "." copies the matched node.
-n appends a newline after each matched/copied node.