JQ: group input and generate output JSON

Time:07-29

Consider the input below:

{
  "name": "examplename1",
  "Date1": "value1",
  "Date2": "value2",
  "Date3": "value3"
}
{
  "name": "examplename1",
  "Date1": "value4",
  "Date2": "value5",
  "Date3": "value6"
}
{
  "name": "examplename2",
  "Date1": "value7",
  "Date2": "value8",
  "Date3": "value9"
}
{
  "name": "examplename2",
  "Date1": "value10",
  "Date2": "value11",
  "Date3": "value12"
}

The required output is:

{
 "names": "examplename1",
 "availabledates1":[
  "value1",
  "value4"
 ],
 "availabledates2":[
  "value2",
  "value5"
 ],
 "availabledates3":[
  "value3",
  "value6"
 ]
}
{
 "names": "examplename2",
 "availabledates1":[
  "value7",
  "value10"
 ],
 "availabledates2":[
  "value8",
  "value11"
 ],
 "availabledates3":[
  "value9",
  "value12"
 ]
}

I am using this jq filter:

[inputs] | group_by(.name)[] | [{names: .[].name, availabledates1: [.[].Date1], availabledates2: [.[].Date2], availabledates3: [.[].Date3]}] | unique_by(.names) | .[]

It produces this output:

{
  "names": "examplename1",
  "availabledates1": [
    "value4"
  ],
  "availabledates2": [
    "value5"
  ],
  "availabledates3": [
    "value6"
  ]
}
{
  "names": "examplename2",
  "availabledates1": [
    "value7",
    "value10"
  ],
  "availabledates2": [
    "value8",
    "value11"
  ],
  "availabledates3": [
    "value9",
    "value12"
  ]
}

Issue 1: This jq filter ignores the first object in the input.

Issue 2: If the input data set is very large, this filter uses too much memory and eventually fails, apparently because it makes multiple passes over the data.

See: https://jqplay.org/s/OOHAuv72GAL

I need a more efficient jq filter that does not fail on large data sets and also includes the first object in the input.

CodePudding user response:

Using group_by:

jq -n '
  [inputs] | group_by(.name)[] | {
    names: first.name,
    avaliableDates1: map(.Date1),
    avaliableDates2: map(.Date2),
    avaliableDates3: map(.Date3)
  }
'
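
The -n flag is what addresses Issue 1. Without it, jq reads the first object into `.` before the filter starts, so `inputs` only sees the remaining objects. A minimal sketch of the difference, assuming jq 1.5 or later; sample.json is a hypothetical file created only for this illustration:

```shell
# sample.json is a hypothetical three-object stream used only for this illustration.
printf '%s\n' '{"name":"a"}' '{"name":"a"}' '{"name":"b"}' > sample.json

# Without -n, jq consumes the first object as `.`; `inputs` yields only the other two.
without_n=$(jq '[inputs] | length' sample.json)

# With -n, `.` is null and `inputs` yields every object in the stream.
with_n=$(jq -n '[inputs] | length' sample.json)

echo "without -n: $without_n, with -n: $with_n"
```

The first count should come out one lower than the second, which matches the missing first row in the question.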


Using reduce:

jq -n '
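  # += appends to the per-name arrays; on the first object for a name the
  # field is null, and null + [x] yields [x].
  (reduce inputs as $i ({}; .[$i.name] |= (
    .names = $i.name
    | .avaliableDates1 += [$i.Date1]
    | .avaliableDates2 += [$i.Date2]
    | .avaliableDates3 += [$i.Date3]
  )))[]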
' 


Output:

{
  "names": "examplename1",
  "avaliableDates1": [
    "value1",
    "value4"
  ],
  "avaliableDates2": [
    "value2",
    "value5"
  ],
  "avaliableDates3": [
    "value3",
    "value6"
  ]
}
{
  "names": "examplename2",
  "avaliableDates1": [
    "value7",
    "value10"
  ],
  "avaliableDates2": [
    "value8",
    "value11"
  ],
  "avaliableDates3": [
    "value9",
    "value12"
  ]
}
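
On Issue 2: both filters above still hold every value in memory until the last object has been read. If the input is already sorted by name (an assumption; the sample input is, but the real data may not be), each group can be emitted as soon as the name changes, so only one group is held at a time. A sketch under that assumption using `foreach`, with the key names taken from the required output in the question; sorted.json is a hypothetical file name:

```shell
# sorted.json is a hypothetical file; objects with the same name must be adjacent.
cat > sorted.json <<'EOF'
{"name":"examplename1","Date1":"value1","Date2":"value2","Date3":"value3"}
{"name":"examplename1","Date1":"value4","Date2":"value5","Date3":"value6"}
{"name":"examplename2","Date1":"value7","Date2":"value8","Date3":"value9"}
{"name":"examplename2","Date1":"value10","Date2":"value11","Date3":"value12"}
EOF

result=$(jq -nc '
  # The trailing null is a sentinel that flushes the last group;
  # this assumes no input value is itself null.
  foreach (inputs, null) as $i (
    {cur: null, out: null};
    if $i == null then
      {cur: null, out: .cur}
    elif .cur != null and .cur.names == $i.name then
      # Same name as the current group: append, emit nothing yet.
      .out = null
      | .cur.availabledates1 += [$i.Date1]
      | .cur.availabledates2 += [$i.Date2]
      | .cur.availabledates3 += [$i.Date3]
    else
      # Name changed: hand the finished group to the output, start a new one.
      {
        out: .cur,
        cur: {
          names: $i.name,
          availabledates1: [$i.Date1],
          availabledates2: [$i.Date2],
          availabledates3: [$i.Date3]
        }
      }
    end;
    .out | select(. != null)
  )
' sorted.json)

printf '%s\n' "$result"
```

If the data is not sorted, it would need sorting first (outside jq for very large files), or the reduce version remains the simpler choice.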