Get a list of unique values from MongoDB Atlas Search before match filters are applied

I use MongoDB Atlas Search to search through a list of resources in my database (I'm using Mongoose, hence the slightly different syntax):

const allApprovedSearchResults = await Resource.aggregate([{
    $search: {
        compound: {
            should: [
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["title", "link", "creatorName"],
                        allowAnalyzedField: true,
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["topics"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 2 } },
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["description"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 0.2 } },
                    }
                }
            ]
        }
    }
}])
    .match(matchFilter)
    .exec();

const uniqueLanguagesInSearchResults = [...new Set(allApprovedSearchResults.map(resource => resource.language))];

The last line collects all unique languages in the result set. However, I want a list of all the languages before .match(matchFilter) is applied. Is there a way to do this without running a second search that omits the filters?

CodePudding user response:

You can use a $facet after the $search:

.aggregate([
  {
    $search: {
        compound: {
            should: [
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["title", "link", "creatorName"],
                        allowAnalyzedField: true,
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["topics"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 2 } },
                    }
                },
                {
                    wildcard: {
                        query: queryStringSegmented,
                        path: ["description"],
                        allowAnalyzedField: true,
                        score: { boost: { value: 0.2 } },
                    }
                }
            ]
        }
    }
},
  {
    "$facet": {
      "filter": [
        {$match: matchFilter}
      ],
      "allLanguages": [
        {$group: {_id: null, all: {$addToSet: '$language'}}}, //<- replace '$language' with real field name
      ]
    }
  }
])

You did not provide a document structure, so I'm assuming 'language' is the field name. The $facet stage forks the pipeline: the branch called 'filter' contains only the filtered results, while the branch called 'allLanguages' contains the set of all languages in the search results, regardless of the filter. The aggregation then returns a single document with one array per branch. You can add $project stages inside each $facet sub-pipeline to format the data.
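
Since $facet emits one document rather than a flat list of resources, the result has to be unpacked before it can replace the original allApprovedSearchResults array. Here is a minimal sketch of that step. The helper name unpackFacetResult is hypothetical (not part of Mongoose or MongoDB), and it assumes the facet keys are exactly 'filter' and 'allLanguages', with no stray whitespace inside the quotes:

```javascript
// Hypothetical helper (not a Mongoose/MongoDB API): unpack the output of the
// $facet stage, assuming the facet keys are 'filter' and 'allLanguages'.
function unpackFacetResult(aggregateResult) {
  // $facet emits a single document, so the result array has one element.
  const [facets = {}] = aggregateResult;

  // Documents that passed the $match inside the 'filter' branch.
  const filteredResults = facets.filter ?? [];

  // The $group stage yields at most one document; it is missing entirely
  // when $search matched nothing, so fall back to an empty list.
  const allLanguages = facets.allLanguages?.[0]?.all ?? [];

  return { filteredResults, allLanguages };
}
```

Passing the awaited result of the aggregation to this function gives back the filtered resources plus the unfiltered language list in one round trip. Note that $addToSet does not guarantee any particular order, so sort the languages in application code if the order matters.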

According to the docs, it should work :)