I am trying to use mlcp.bat to export the document with the following URI: /category/[2014] xxx.xml
This is the mlcp command used with parameters:
mlcp.bat export -host localhost -port 8000 -username admin -password admin -mode local -database database-content -output_file_path C:/mlcp/bin/xmlexport -document_selector '/CaseReport/Metadata[id="16594-SSP-M"]' -indented true
After executing the above command, no documents are extracted :( Below is the mlcp output:
INFO contentpump.ContentPump: Job name: local_320491878_1
INFO mapreduce.MarkLogicInputFormat: Fetched 1 forest splits.
INFO mapreduce.MarkLogicInputFormat: Made 2 split(s).
INFO contentpump.LocalJobRunner: completed 0%
INFO contentpump.LocalJobRunner: com.marklogic.mapreduce.MarkLogicCounter:
INFO contentpump.LocalJobRunner: ESTIMATED_INPUT_RECORDS: 35722
INFO contentpump.LocalJobRunner: INPUT_RECORDS: 0
INFO contentpump.LocalJobRunner: OUTPUT_RECORDS: 0
INFO contentpump.LocalJobRunner: Total execution time: 26 sec
== UPDATE == These are the first 3 lines of the XML document with URI /category/[2014] xxx.xml:
<?xml version="1.0" encoding="UTF-8"?>
<CaseReport xlink:type="extended" category="unreported" neutralcitation="[2014] xxx" year="" volume="" series="" pageno="" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:exslt="http://exslt.org/common">
<Metadata id="16594-SSP-M">
CodePudding user response:
The -document_selector option expects you to specify an XPath that would select documents from the database. You are providing the URI of a document.
Instead, use -query_filter and specify a query that uses cts:document-query() to select the document with that URI: cts:document-query("/category/[2014] xxx.xml")
This is an example of that query serialized as XML:
-query_filter
<cts:document-query xmlns:cts="http://marklogic.com/cts"><cts:uri>/category/[2014] xxx.xml</cts:uri></cts:document-query>
This is an example of that query serialized as JSON:
-query_filter
{"documentQuery":{"uris":["/category/[2014] xxx.xml"]}}
To avoid quoting and escaping issues with the query on the command line, you are better off putting this option into an options file and then passing the path to that file with the -options_file option.
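As a rough sketch of that approach, an options file for the export in the question might look like the following. The file name export-options.txt is hypothetical, the connection values are copied from the question's command, and the layout assumes the usual mlcp options-file convention of one option or one value per line, with # comments on lines of their own. It has not been run against the asker's database.

```
# export-options.txt (hypothetical file name)
-host
localhost
-port
8000
-username
admin
-password
admin
-mode
local
-database
database-content
-output_file_path
C:/mlcp/bin/xmlexport
-indented
true
-query_filter
{"documentQuery":{"uris":["/category/[2014] xxx.xml"]}}
```

The export would then be started with something like the line below; -options_file should come first, before any other options:

```
mlcp.bat export -options_file export-options.txt
```

Because the query sits on its own line in the file, the double quotes and the space in the URI never pass through the Windows command interpreter.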