Home > Back-end >  Create Key list from a MongoDB collection - Mongoosh / MongoDB scripting
Create Key list from a MongoDB collection - Mongoosh / MongoDB scripting

Time:09-17

I am currently trying to tackle a problem using MongoDB scripting. Let me try to explain what i want to achieve: Given a MongoDB Document(s), i should be able to generate an array of keys in the document, even if the document has nested documents that too should be shown in the array using dot operator. I was successful doing this using JS. Below is the piece of code that i used in JS:

function getDeepKeys(obj) {
  var keys = [];
  const idRegex = new RegExp('_id.*');
  for (var key in obj) {
    if (!(idRegex.test(key))) { //exclude the id fields from the final keys array
      keys.push(key);
      if (typeof obj[key] === "object") {
        var subkeys = getDeepKeys(obj[key]);
        keys = keys.concat(subkeys.map(function (subkey) {
          return key   "."   subkey;
        }));
      }
    }
  }
  return keys;
}

if i give the below sample input:

{
  "_id": {
    "$oid": "613714789a01802b620c2379"
  },
  "name": "Example 2",
  "address": "Highway 2",
  "documentType": "details",
  "personalDetails": {
    "gender": "male",
    "age": 23,
    "position": null
  }
}

I get the below output:

[
  'name',
  'address',
  'documentType',
  'personalDetails',       
  'personalDetails.gender',
  'personalDetails.age',
  'personalDetails.position'
]

Can i achieve the same in MongoDB scripting. So far i have found an aggregate function that returns only the top level keys and fails to return the inner document keys:

db.customers.aggregate([
    { "$project": {
       "hashmaps": { "$objectToArray": "$$ROOT" } 
    } }, 
    { "$project": {
       "fields": "$hashmaps.k"
    } },
    { "$group": {
        "_id": null,
        "fields": { "$addToSet": "$fields" }
    } },
    { "$project": {
            "keys": {
                "$setDifference": [
                    {
                        "$reduce": {
                            "input": "$fields",
                            "initialValue": [],
                            "in": { "$setUnion" : ["$$value", "$$this"] }
                        }
                    },
                    ["_id"]
                ]
            }
        }
    }
]).toArray()[0]["keys"];

which returns this:

[
  "address",
  "documentType",
  "name",
  "personalDetails"
]

Which is missing the inner document keys. Any help here would be appreciated. This is part of a bigger problem that i solved using JS, however need to solve using MongoDB scripting. My JS solution is here

FYI i am using free tier of Atlas, hence cannot use functions like map-reduce.

Thank you.

CodePudding user response:

Possible, yes. Pretty, no.

Aggregation in MongoDB does not have constructs like looping or recursion.

For the case of documents nested directly in the top-level document, you can filter out any non-object, prepend the top-level field name, and merge the lists of fields:

db.collection.aggregate([
  {$project: {data: {$objectToArray: "$$ROOT"}}},
  {$addFields: {
      fields: "$data.k",
      data: {
        $map: {
          input: {$filter: {
              input: "$data",
              cond: {$eq: ["object",{$type: "$$this.v"}]}
          }},
          in: {
            v: {
              $map: {
                input: {$objectToArray: "$$this.v"},
                as: "sub",
                in: {
                  k: {$concat: ["$$this.k",".","$$sub.k"]},
                  v: "$$sub.v"
                }
              }
            }
          }
        }
      }
  }},
  {$addFields: {
      data: {$reduce: {
          input: "$data",
          initialValue: [],
          in: {$concatArrays: ["$$value", "$$this.v"]}
      }}
  }},
  {$project: {
     fields: {$concatArrays: ["$fields","$data.k"]},
  }}
])

Then if you want to go 2 levels deep, you would need to repeat the middle stages, like:

db.collection.aggregate([
  {$project: {data: {$objectToArray: "$$ROOT"}}},
  {$addFields: {
      fields: "$data.k",
      data: {
        $map: {
          input: {$filter: {
              input: "$data",
              cond: {$eq: ["object",{$type: "$$this.v"}]}
          }},
          in: {
            v: {
              $map: {
                input: {$objectToArray: "$$this.v"},
                as: "sub",
                in: {
                  k: {$concat: ["$$this.k",".","$$sub.k"]},
                  v: "$$sub.v"
                }
              }
            }
          }
        }
      }
  }},
  {$addFields: {
      data: {$reduce: {
          input: "$data",
          initialValue: [],
          in: {$concatArrays: ["$$value", "$$this.v"]}
      }}
  }},
  {$addFields: {
      fields: {$concatArrays: ["$fields","$data.k"]},
      data: {
        $map: {
          input: {$filter: {
              input: "$data",
              cond: {$eq: ["object",{$type: "$$this.v"}]}
          }},
          in: {
            v: {
              $map: {
                input: {$objectToArray: "$$this.v"},
                as: "sub",
                in: {
                  k: {$concat: ["$$this.k",".","$$sub.k"]},
                  v: "$$sub.v"
                }
              }
            }
          }
        }
      }
  }},
  {$addFields: {
      data: {$reduce: {
          input: "$data",
          initialValue: [],
          in: {$concatArrays: ["$$value", "$$this.v"]}
      }}
  }},
  {$project: {
     fields: {$concatArrays: ["$fields","$data.k"]}
  }}
])

While this is somewhat functional (Playground), this only works for directly embedded documents (not for arrays of documents), and it will be necessary to repeat those two stages for every level of nesting you want to handle.

  • Related