Home > Software design >  How to make double $lookup aggregation with nested data on MongoDB?
How to make double $lookup aggregation with nested data on MongoDB?

Time:01-02

I have 3 models:

  1. Study
  2. WordSet
  3. Category

Study model has reference into WordSet, then WordSet has reference into Category.

I understand that for display data normally, I use populate. But in this situation, I need a query with many $lookup.

How I can 'populate' Category from WordSet and display only that category which was repeated the most?

I would achieve a response like this:

"stats": [
    {
        "_id": null,
        "numberOfStudies": 4,
        "averageStudyTime": 82.5,
        "allStudyTime": 330,
        "longestStudy": 120,
        "allLearnedWords": 8
        "hardestCategory": "Work" // only this field is missing
    }
]

I've tried to do it like this:

   const stats = await Study.aggregate([
  {
    // join User table 
    $lookup: {
      from: 'User',
      let: { userId: '$user' },
      pipeline: [
        {
          $match: { $expr: { $eq: ['$_id', '$$userId'] } },
        },
      ],
      as: 'currentUser',
    },
  },
  {
   // join WordSet table
    $lookup: {
      from: 'WordSet',
      let: { wordSetId: '$learnedWordSet' },
      pipeline: [
        {
          $match: { $expr: { $eq: ['$_id', '$$wordSetId'] } },
        },
        {
         // from this moment i'm not sure how to make it work
          $lookup: {
            from: 'Category',
            let: { categoryId: '$category' },
            pipeline: [
              {
                $match: { $expr: { $in: ['$_id', '$$categoryId'] } },
              },
            ],
            as: 'category',
          },
        },
      ],
      as: 'wordSet',
    },
  },
  { // add wordset with category? this is not working
    $addFields: {
      wordSet: {
        $arrayElemAt: ['$wordSet', 0],
      },
    },
  },
  { // search by logged user
    $match: { user: new ObjectID(currentUserId) },
  },
  { 
    $group: {
      // display statistics about user's studying
      _id: null,
      numberOfStudies: { $sum: 1 },
      averageStudyTime: { $avg: '$studyTime' },
      allStudyTime: { $sum: '$studyTime' },
      longestStudy: { $max: '$studyTime' },
      allLearnedWords: { $sum: { $size: '$learnedWords' } },
      // category: check which category is repeated the most and display it
    },
  },
]);

Study

     const studySchema = new mongoose.Schema({
  name: {
    type: String,
  },
  studyTime: {
    type: Number,
  },
  learnedWords: [String],
  notLearnedWords: [String],
  learnedWordSet: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'WordSet',
  },
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User',
  },
});

WordSet

const wordSetSchema = new mongoose.Schema({
      name: {
        type: String,
      },
      category: {
        type: [
          {
            type: mongoose.Schema.Types.ObjectId,
            ref: 'Category',
            required: true,
          },
        ],
      },
    });

Category

const categorySchema = new mongoose.Schema({
  name: {
    type: String,
  },
});

CodePudding user response:

I am not sure if i understand correctly, you can try the query and i have improved the usage of stages,

  • $match always try to use stage in the first stage
  • $lookup with User collection, there is no need to pipeline version, you can use localField and foreignField properties

I don't think is there any use of user document, and lookup stage because you want only statistics as per last $group stage. so you can skip this lookup stage

  • inside WordSet lookup,
    • $match your condition
    • $project to show required fields
    • $unwind deconstruct the category array
    • $group by category and get the total count
    • $sort by count in descending order
    • $limit to get only the first and single element that is most used
    • $llokup with Category collection
    • $project to show required fields, get first category name
  • $group stage, hardestCategory get $first category name
const stats = await Study.aggregate([
  { $match: { user: new ObjectID(currentUserId) } },
  {
    $lookup: {
      from: "User",
      localField: "user",
      foreignField: "_id",
      as: "currentUser"
    }
  },
  {
    $lookup: {
      from: "WordSet",
      let: { wordSetId: "$learnedWordSet" },
      pipeline: [
        { $match: { $expr: { $eq: ["$_id", "$$wordSetId"] } } },
        {
          $project: {
            _id: 0,
            category: 1
          }
        },
        { $unwind: "$category" },
        {
          $group: {
            _id: "$category",
            count: { $sum: 1 }
          }
        },
        { $sort: { count: -1 } },
        { $limit: 1 },
        {
          $lookup: {
            from: "Category",
            localField: "_id",
            foreignField: "_id",
            as: "category"
          }
        },
        {
          $project: {
            _id: 0,
            category: { $arrayElemAt: ["$category.name", 0] }
          }
        }
      ],
      as: "wordSet"
    }
  },
  {
    $group: {
      _id: null,
      numberOfStudies: { $sum: 1 },
      averageStudyTime: { $avg: "$studyTime" },
      allStudyTime: { $sum: "$studyTime" },
      longestStudy: { $max: "$studyTime" },
      allLearnedWords: {
        $sum: { $size: "$learnedWords" }
      },
      hardestCategory: {
        $first: {
          $first: "$wordSet.category"
        }
      }
    }
  }
])

Playground

  • Related