Home > Blockchain >  MongoDB: sort nested arrays on each document separately
MongoDB: sort nested arrays on each document separately

Time:10-10

I have a structure like this:

[
  {
    "_id": "abc",
    "field1": [], // array of subdocuments, may be empty
    "field2": "somVal",
    "field3": [  // array of subdocuments, may be empty
      {
        "fieldToSort": "bcdef"
      },
      {
        "fieldToSort": "abcde"
      }
    ]
  },
  {
    "_id": "def",
    "field1": [  // array of subdocuments, may be empty
      {
        "fieldToSort": "bcdef"
      },
      {
        "fieldToSort": "abcde"
      }
    ],
    "field3": []  // array of subdocuments, may be empty

     // field2 is missing
  }
]

My documents have many fields in common, but may not all have the same fields.

The arrays to be sorted may be empty.

I just want to sort the values inside field1 and field3 alphabetically by some field, without affecting the sort order of all of the main documents.

I know that usually one uses unwind on arrays, then sorts, then groups. However, during grouping, I dont want to restore the original document field by field.

Is there another way?

Whatever the solution, the structure of the main documents cannot be altered.

The expected result here would be:

[
  {
    "_id": "abc",
    "field1": [],
    "field2": "somVal",
    "field3": [
      {
        "fieldToSort": "abcde"
      },
      {
        "fieldToSort": "bcdef"
      }
    ]
  },
  {
    "_id": "def",
    "field1": [
      {
        "fieldToSort": "abcde"
      },
      {
        "fieldToSort": "bcdef"
      }
    ],
    "field3": []
  }
]

CodePudding user response:

The below code is for dynamically you can add fields since you mentioned you have multiple fields. As you mentioned, we also do unwind, and group. But not every fields.

  • Make all "key value pair" object to array using $objectToArray
  • Filter only the fields that you need using $filter
  • Deconstruct the array using $unwind
  • Sort it using $sort
  • Reconstruct the array using $group
  • Make again key value pair using $arrayToObject
  • We already have the root document. Override using $replaceRoot with newly sorted object

Here is the code

db.collection.aggregate([
  {
    "$addFields": {
      newArr: { "$objectToArray": "$$ROOT" }
    }
  },
  {
    "$addFields": {
      newArr: {
        "$filter": {
          "input": "$newArr",
          "cond": {
            $in: [ "$$this.k", [ "field1", "field3" ] ]
          }
        }
      },
      root: "$$ROOT"
    }
  },
  { "$unwind": "$newArr" },
  { "$unwind": "$newArr.v" },
  { "$sort": { "newArr.v.fieldToSort": 1} },
  {
    "$group": {
      "_id": { _id: "$_id", field: "$newArr.k" },
      "data": { "$push": "$newArr.v" },
      root: { $first: "$root" }
    }
  },
  {
    "$group": {
      "_id": "$_id._id",
      "data": {
        "$push": {
          k: "$_id.field",
          v: "$data"
        }
      },
      root: { $first: "$root" }
    }
  },
  {
    $project: {
      data: { "$arrayToObject": "$data" },
      root: 1
    }
  },
  {
    "$replaceRoot": {
      "newRoot": {
        "$mergeObjects": [ "$root", "$data" ]
      }
    }
  },
  { $project: { newArr: 0 } }
])

Working Mongo playground

CodePudding user response:

Query

  • does a local array sort, without affecting the sort of the entire collection
  • we dont have sort aggregate operator that works in arrays
  • 2 solutions, use $sort stage , use $function and javascript (javascript is slower in general and here we can avoid it)
  • problem is that $sort stage is used to documents, and we need it for array, so we will make the array, many documents but locally
  • with a "trick" to do a "local" unwind
  • we do $lookup with a dummy collection of 1 document, just to allow us to use pipeline stage operators inside the document
  • keeps only the array in the pipeline, that unwinds sort groups without effecting the sort order of the global collection or any other field
  • this trick can be used, when ever we need a pipeline stage operator, to be used in document-level
  • Its 2 times the same code, one for field1 and one for field3
  • dummy = [{}] (collection with 1 empty document)

Test code here

aggregate(
[{"$lookup": 
    {"from": "dummy",
      "let": {"field1": "$field1"},
      "pipeline": 
      [{"$set": {"field1": "$$field1"}},
        {"$unwind": {"path": "$field1"}},
        {"$sort": {"field1.fieldToSort": 1}},
        {"$group": {"_id": null, "field1": {"$push": "$field1"}}}],
      "as": "field1"}},
  {"$set": 
    {"field1": 
      {"$cond": 
        [{"$eq": ["$field1", []]}, [],
          {"$arrayElemAt": ["$field1.field1", 0]}]}}},
  {"$lookup": 
    {"from": "dummy",
      "let": {"field3": "$field3"},
      "pipeline": 
      [{"$set": {"field3": "$$field3"}},
        {"$unwind": {"path": "$field3"}},
        {"$sort": {"field3.fieldToSort": 1}},
        {"$group": {"_id": null, "field3": {"$push": "$field3"}}}],
      "as": "field3"}},
  {"$set": 
    {"field3": 
      {"$cond": 
        [{"$eq": ["$field3", []]}, [],
          {"$arrayElemAt": ["$field3.field3", 0]}]}}}])

CodePudding user response:

@varman and @Takis have posted good answers

This solution is a bit simpler but will work only for MongoDB version >= 4.4 plus this will have a minor impact on performance compared to builtin MongoDB operators because of $function the operator used

db.collection.aggregate([
    {
      "$project": {
        "field1": {
          "$function": {
            "body": "function(arr) {return arr.sort((a,b) => (a.fieldToSort > b.fieldToSort) ? 1 : ((b.fieldToSort > a.fieldToSort) ? -1 : 0));}",
            "args": ["$field1"],
            "lang": "js"
          }
        },
        "field3": {
          "$function": {
            "body": "function(arr) {return arr.sort((a,b) => (a.fieldToSort > b.fieldToSort) ? 1 : ((b.fieldToSort > a.fieldToSort) ? -1 : 0));}",
            "args": ["$field3"],
            "lang": "js"
          }
        }
      }
    },
    
  ])

MongoDB Playground Sample Execution

  • Related