search unique element pairs and transform sorted result to JSON

Time:12-20

I have the sample input below (each <performance> element is an individual document):

    <performance>
        <year>2016</year>
        <industry>Financials</industry>
        <benchmark>Healthcare</benchmark>
    </performance>

    <performance>
        <year>2017</year>
        <industry>Technology</industry>
        <benchmark>Financials</benchmark>
    </performance>

    <performance>
        <year>2018</year>
        <industry>Technology</industry>
        <benchmark>Financials</benchmark>
    </performance>

    <performance>
        <year>2019</year>
        <industry>Financials</industry>
        <benchmark>Materials</benchmark>
    </performance>

    <performance>
        <year>2020</year>
        <industry>Technology</industry>
        <benchmark>Materials</benchmark>
    </performance>

    <performance>
        <year>2021</year>
        <industry>Technology</industry>
        <benchmark>Healthcare</benchmark>
    </performance>

I need to find the unique industry and benchmark pairs, order each industry's benchmarks by year (most recent first), and finally transform the pairs to JSON. I would like to use MarkLogic’s indexes to speed up the search and transform. The expected output is:

    {
        "Financials": [
            "Materials",
            "Healthcare"
        ],
        "Technology": [
            "Healthcare",
            "Materials",
            "Financials"
        ]
    }

My XQuery code:

let $keys := ('Financials', 'Technology')
let $map := map:map()
let $_ :=
  for $key in $keys
  let $query :=  cts:path-range-query("/performance/industry", "=", $key)
  let $v :=  cts:values(cts:path-reference('/performance/benchmark'), (), (), $query)
  return map:put($map, $key, $v)
return xdmp:to-json($map)

Unexpected output:

{
"Financials":[
  "Healthcare", 
  "Materials"
], 
"Technology":[
  "Financials", 
  "Healthcare", 
  "Materials"
]
}

Am I using XQuery incorrectly, or do I misunderstand how the MarkLogic index works? How can I get the expected output? I am fine with JavaScript or XQuery.

CodePudding user response:

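It seems the values wind up sorted because cts:values returns the distinct values from the index in ascending value order by default, not in year order; putting that sequence into the map does not reorder it.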

With this solution, I fetch every benchmark and year for each industry, sort the values by year in descending order, de-duplicate them with distinct-values(), and then put them into an array-node{} constructor in order to maintain the sorted order.

let $map := map:map()
let $industries := ('Financials', 'Technology')
let $_ :=
  for $industry in $industries
  let $query :=  cts:path-range-query("/performance/industry", "=", $industry)
  let $tuples := cts:value-tuples((cts:path-reference('/performance/benchmark'), cts:path-reference('/performance/year')), (), $query)
  let $sorted-values :=
    for $tuple in $tuples
    let $benchmark := $tuple[1]
    let $year := $tuple[2]
    order by $year descending
    return $benchmark
  return map:put($map, $industry, array-node{ distinct-values($sorted-values) })
return xdmp:to-json($map)
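
Since the question also allows JavaScript, here is a minimal sketch of just the ordering and de-duplication step in plain JavaScript, with no MarkLogic calls. It assumes the (year, industry, benchmark) values have already been read from the indexes. The `rows` array is copied by hand from the six sample documents, and `orderedBenchmarks` is a hypothetical helper name, not a MarkLogic API. It de-duplicates explicitly by keeping the first occurrence, so it does not depend on how a distinct-values function orders its result.

```javascript
// Hypothetical stand-in for values read from the indexes;
// copied by hand from the six sample <performance> documents.
const rows = [
  { year: 2016, industry: "Financials", benchmark: "Healthcare" },
  { year: 2017, industry: "Technology", benchmark: "Financials" },
  { year: 2018, industry: "Technology", benchmark: "Financials" },
  { year: 2019, industry: "Financials", benchmark: "Materials" },
  { year: 2020, industry: "Technology", benchmark: "Materials" },
  { year: 2021, industry: "Technology", benchmark: "Healthcare" },
];

// Hypothetical helper: group benchmarks by industry, most recent year first,
// keeping only the first (most recent) occurrence of each benchmark.
function orderedBenchmarks(rows) {
  const result = {};
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...rows].sort((a, b) => b.year - a.year);
  for (const { industry, benchmark } of sorted) {
    const list = result[industry] || (result[industry] = []);
    if (!list.includes(benchmark)) {
      list.push(benchmark);
    }
  }
  return result;
}

console.log(JSON.stringify(orderedBenchmarks(rows), null, 2));
```

The order of the top-level keys in the printed object follows insertion order and may differ from the expected output shown in the question. The order inside each array is what the year sort controls.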