Read untagged text from XML (or HTML)

Time:12-24

I have an XML file containing many entries like this:

<query id='LoadRights'>
        <description>Load all user-rights</description>
        SELECT CODE FROM RIGHTS
</query> 

Using the id, I want to read just the untagged line 'SELECT CODE FROM RIGHTS'. Is there an elegant way to do so using jQuery?

I am using Cheerio in a Node.js application, whose API is modeled on jQuery. Thanks in advance.

CodePudding user response:

Here's one method: use a hidden utility HTML element and a few jQuery methods to weed out any 'tagged' content.

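// Copy the element's markup into the hidden #copy element, blank out every
// child element, and return the text that is left.
const getText = selector => {
  $('#copy').html($(selector).html());
  $('#copy *').each(function() {
    $(this).text('');
  });
  return $('#copy').text().trim();
};
console.log(getText('#LoadRights'));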
#copy {
  display: none;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<query id='LoadRights'>
  <description>Load all user-rights</description>
  SELECT CODE FROM RIGHTS
</query>

<div id="copy"></div>
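Since the question mentions Cheerio, where there is no page to hold a hidden helper element, another possible approach is to keep only the text nodes among the element's direct children. The sketch below is an illustration, not part of the original answer: directText is a hypothetical helper, and the sample children are hand-built to mimic the node shape (type 'text' with a data property) that Cheerio's .contents() is assumed to return, so the snippet runs without any dependency.

```javascript
// Hypothetical helper: keep only the text nodes among an element's direct
// children, join their contents, and trim surrounding whitespace.
// Assumption: nodes have the htmlparser2/domhandler shape that Cheerio
// exposes, i.e. { type: 'text', data: '...' } for text nodes.
function directText(childNodes) {
  return childNodes
    .filter(node => node.type === 'text')
    .map(node => node.data)
    .join('')
    .trim();
}

// Hand-built stand-in for the children of <query id='LoadRights'>.
const children = [
  { type: 'text', data: '\n        ' },
  {
    type: 'tag',
    name: 'description',
    children: [{ type: 'text', data: 'Load all user-rights' }],
  },
  { type: 'text', data: '\n        SELECT CODE FROM RIGHTS\n' },
];

console.log(directText(children)); // SELECT CODE FROM RIGHTS
```

With Cheerio itself, the equivalent call would presumably be directText($('#LoadRights').contents().get()) after loading the file with cheerio.load(xml, { xmlMode: true }). The text inside description is skipped because only direct children are inspected.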

CodePudding user response:

You can use camaro for this. Note that I use normalize-space() here to trim the newlines and whitespace before and after the text; remove it if you want the raw text.

const { transform } = require('camaro')

async function main() {
    const xml = `
    <query id='LoadRights'>
        <description>Load all user-rights</description>
        SELECT CODE FROM RIGHTS
</query>`

    const template = {
        text: 'normalize-space(query/text())'
    }
    const output = await transform(xml, template)
    console.log(output);
}

main()

Output:

{ text: 'SELECT CODE FROM RIGHTS' }

If you have multiple queries, it looks like this:

const { transform } = require('camaro')

async function main() {
    const xml = `
    <queries>
        <query id='LoadRights'>
            <description>Load all user-rights</description>
            SELECT CODE FROM RIGHTS
        </query>
        <query id='LoadRights'>
            <description>Load all user-rights</description>
            SELECT CODE FROM RIGHTS 2
        </query>
        <query id='LoadRights'>
            <description>Load all user-rights</description>
            SELECT CODE FROM RIGHTS 3
        </query>
    </queries>`

    const template = {
        queries: ['queries/query', 'normalize-space(text())']
    }
    const output = await transform(xml, template)
    console.log(output);
}

main()

and the output will be:

{
  queries: [
    'SELECT CODE FROM RIGHTS',
    'SELECT CODE FROM RIGHTS 2',
    'SELECT CODE FROM RIGHTS 3'
  ]
}