Home > Enterprise >  Google Sheets importxml xpath to parse marcxml
Google Sheets importxml xpath to parse marcxml

Time:01-28

I'm trying to parse XML (MarcXML) in Google Sheets.

For example I try to get the value in the subfield with code="a" in the datafield with tag="245"

MarcXML example I'm trying to parse: https://www.loc.gov/standards/marcxml/Sandburg/sandburg.xml

Google Sheets formula I tried:

=importxml(A1;"//datafield[@tag='245']/subfield[@code='a']")

However with the above formula I get the dreaded error Imported content is empty.

When I use this:

=importxml(A1;"/*")

it does output something (all the values together...)

Since Google Sheet's importxml is outputting something with xpath "/*" I think what I try to do should in theory work? Could someone make a suggestion?

Thank you!

CodePudding user response:

I'm not sure what is the exact problem with your xpath query, but I guess it might be due to the differences of XML and MarcXML.

Anyway this xpath works fine based on the structure of your sample data:

=importxml(A1,"//*[@tag=245]/*[@code='a']")

It searches for any node that has a tag attribute with the value of 245; and then looks for any child node that has a code attribute with the value of 'a'.

  • Related