I have a collection that is shaped like this:
[
{
_id: ObjectId("5d8e8c9b8f8b9b7b7a8b4567"),
tags: {
language: [ 'en' ],
industries: [ 'agency', 'travel' ],
countries: [ 'ca', 'us' ],
regions: [ 'north-america' ],
}
},
{
_id: ObjectId("5d8e8c9b8f8b9b7b7a8b4568"),
tags: {
language: [ 'en', 'fr' ],
industries: [ 'travel' ],
countries: [ 'ca' ]
}
},
{
_id: ObjectId("5d8e8c9b8f8b9b7b7a8b4569"),
tags: {
language: [ 'en' ],
industries: [ 'agency', 'travel' ],
countries: [ 'ca', 'us' ],
regions: [ 'south-america' ]
}
},
]
and I would like to generate this as a result...
{
//* count of all documents
"count": 3,
//* count of all documents that contain any slug within the given category
"countWithCategorySlug": {
"language": 3,
"industries": 3,
"countries": 3,
"regions": 2
},
//* per category: count of documents that contain that slug in the givin category
"language" {
"en": 3,
"fr": 1
},
"industries" {
"agency": 2,
"travel": 3,
},
"countries" {
"ca": 3,
"us": 2
},
"regions" {
"north-america": 1,
"south-america": 1
}
}
super stuck so any help would be appreciated. :)
The number of categories is unknown and I have a code solution that queries the list of disctint categories and slugs then for each one generates a $group stage... The resultant query is excessively big and there needs to be a better way... problem is that I have absolutely no idea on how to optimize it...
CodePudding user response:
Query
- the first part before the facet is done to seperate them and make for each value 1 document like
[{
"type": "language",
"value": "en",
"_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
"type": "industries",
"value": "agency",
"_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
"type": "industries",
"value": "travel",
"_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
},
{
"type": "countries",
"value": "ca",
"_id": ObjectId("5d8e8c9b8f8b9b7b7a8b4567")
}]
- and then facet with 3 fields and count the documents
- and after than transformations to have data on keys like the expected output
ggregate(
[{"$set": {"tags": {"$objectToArray": "$tags"}}},
{"$set":
{"tags":
{"$map":
{"input": "$tags",
"in": {"type": "$$this.k", "value": "$$this.v", "_id": "$_id"}}}}},
{"$unwind": "$tags"},
{"$replaceRoot": {"newRoot": "$tags"}},
{"$unwind": "$value"},
{"$facet":
{"count":
[{"$group": {"_id": null, "count": {"$addToSet": "$_id"}}},
{"$set": {"count": {"$size": "$count"}}}],
"category":
[{"$group": {"_id": "$type", "count": {"$addToSet": "$_id"}}},
{"$set": {"count": {"$size": "$count"}}}],
"values":
[{"$group":
{"_id": "$value",
"type": {"$first": "$type"},
"values": {"$addToSet": "$_id"}}},
{"$set": {"values": {"$size": "$values"}}},
{"$group":
{"_id": "$type",
"values":
{"$push":
{"type": "$type", "value": "$_id", "count": "$values"}}}}]}},
{"$set":
{"count":
{"$getField":
{"field": "count", "input": {"$arrayElemAt": ["$count", 0]}}},
"category":
{"$arrayToObject":
[{"$map":
{"input": "$category",
"in": {"k": "$$this._id", "v": "$$this.count"}}}]},
"values":
{"$arrayToObject":
[{"$map":
{"input": "$values",
"in":
{"k": "$$this._id",
"v":
{"$arrayToObject":
[{"$map":
{"input": "$$this.values",
"in": {"k": "$$this.value", "v": "$$this.count"}}}]}}}}]}}}])
Outputs
[{
"count": 3,
"category": {
"countries": 3,
"industries": 3,
"regions": 2,
"language": 3
},
"values": {
"regions": {
"south-america": 1,
"north-america": 1
},
"countries": {
"us": 2,
"ca": 3
},
"language": {
"fr": 1,
"en": 3
},
"industries": {
"agency": 2,
"travel": 3
}
}
}]