Mongodb pipeline on parse server document add pointer field with $lookup

Time:09-03

To be honest, I know SQL well, but I'm new to MongoDB and NoSQL, so I'm a bit lost. I have built a pipeline that works fine. The goal is to group by day and mindmapId, count the number of distinct users who viewed each mind map, sum the watch time, and save the result into a collection so I can query it later.

Here's a sample of the data:

MindMap

{
  "_id": "Yg5uGI3Iy0",
  "data": {
    "id": "root",
    "topic": "Main topic",
    "expanded": true
  },
  "theme": "orange",
  "_p_author": "_User$zqPzSKD7EM",
   "_created_at": {
    "$date": {
      "$numberLong": "1658497264836"
    }
  },
  "_updated_at": {
    "$date": {
      "$numberLong": "1661334292749"
    }
  }
}

MindmapView

{
  "_id": "qWR6HVIcvT",
  "startViewDate": {
    "$date": {
      "$numberLong": "1658669095261"
    }
  },
  "_p_user": "_User$VnrxG9gABO",
  "_p_mindmap": "MindMap$Yg5uGI3Iy0",
  "_created_at": {
    "$date": {
      "$numberLong": "1658669095274"
    }
  },
  "_updated_at": {
    "$date": {
      "$numberLong": "1658669095274"
    }
  }
}

Pipeline

[{
 $group: {
  _id: {
   day: {
    $dateToString: {
     format: '%Y-%m-%d',
     date: '$startViewDate'
    }
   },
   mindmapId: {
    $substr: [
     '$_p_mindmap',
     8,
     -1
    ]
   }
  },
  watchTime: {
   $sum: {
    $dateDiff: {
     startDate: '$_created_at',
     endDate: '$_updated_at',
     unit: 'second'
    }
   }
  },
  uniqueCount: {
   $addToSet: '$_p_user'
  }
 }
}, {
 $project: {
  _id: 1,
  total: {
   $size: '$uniqueCount'
  },
  watchTime: {
   $sum: '$watchTime'
  }
 }
}]
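
A note on the offset passed to $substr: Parse Server stores a pointer column as a string of the form <ClassName>$<objectId>, so the 8 above is simply the length of the "MindMap$" prefix. The plain-JavaScript sketch below illustrates that layout; the helper name splitPointer is hypothetical and only there to show where the offsets come from.

```javascript
// Hypothetical helper: split a Parse Server pointer string "<ClassName>$<objectId>".
function splitPointer(pointer) {
  const sep = pointer.indexOf("$");
  if (sep === -1) {
    throw new Error("not a pointer string: " + pointer);
  }
  return {
    className: pointer.slice(0, sep),
    objectId: pointer.slice(sep + 1),
  };
}

// The $substr start index is the length of the class-name prefix.
const mindmapOffset = "MindMap$".length; // prefix of _p_mindmap values
const userOffset = "_User$".length;      // prefix of _p_user / _p_author values

// Sample value taken from the MindmapView document above.
const parsed = splitPointer("MindMap$Yg5uGI3Iy0");
console.log(parsed, mindmapOffset, userOffset);
```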

Pipeline results

[{
  "_id": {
    "day": "2022-08-01",
    "mindmapId": "oGCQDQmaNK"
  },
  "total": 1,
  "watchTime": 7
},{
  "_id": {
    "day": "2022-08-11",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 21
},{
  "_id": {
    "day": "2022-08-15",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 13
},{
  "_id": {
    "day": "2022-07-25",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 3
},{
  "_id": {
    "day": "2022-08-01",
    "mindmapId": "YXa8omyChc"
  },
  "total": 2,
  "watchTime": 1306837
},{
  "_id": {
    "day": "2022-07-25",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 7
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60
},{
  "_id": {
    "day": "2022-08-06",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 0
},{
  "_id": {
    "day": "2022-08-11",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 69
},{
  "_id": {
    "day": "2022-08-10",
    "mindmapId": "oGCQDQmaNK"
  },
  "total": 1,
  "watchTime": 4
},{
  "_id": {
    "day": "2022-08-15",
    "mindmapId": "Yg5uGI3Iy0"
  },
  "total": 1,
  "watchTime": 9
},
...
]

However, to make this data faster to use, I need to include the mindmap author in the result collection. The goal then becomes: group by day and mindmapId, count the distinct users who viewed the mind map, sum the watch time, fetch the mindmap author, and save all of it into a collection.

To do that I need $lookup, but the result is messy: the lookup acts like a full join in SQL. I tried many combinations before posting.

Here's the main thing I have tried:

[{
 $group: {
  _id: {
   day: {
    $dateToString: {
     format: '%Y-%m-%d',
     date: '$startViewDate'
    }
   },
   mindmapId: {
    $substr: [
     '$_p_mindmap',
     8,
     -1
    ]
   }
  },
  watchTime: {
   $sum: {
    $dateDiff: {
     startDate: '$_created_at',
     endDate: '$_updated_at',
     unit: 'second'
    }
   }
  },
  uniqueCount: {
   $addToSet: '$_p_user'
  }
 }
}, {
 $lookup: {
  from: 'MindMap',
  localField: '_objectId',
  foreignField: '_id.mindmapId',
  as: 'tempMindmapPointer'
 }
}, {
 $unwind: '$tempMindmapPointer'
}, {
 $match: {
  'tempMindmapPointer._id': '_id.mindmapId'
 }
}, {
 $project: {
  _id: 1,
  total: {
   $size: '$uniqueCount'
  },
  watchTime: {
   $sum: '$watchTime'
  },
  author: {
   $substr: [
    '$tempMindmapPointer._p_author',
    6,
    -1
   ]
  }
 }
}]

The $match doesn't work here: it leaves me with no results. If I remove the $match, the pipeline acts like a full join of the user list with the mindmap id list, which I don't want:

[{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "VnrxG9gABO"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "x6kNvG2O0X"
},...
]

I have tried swapping the localField: '_objectId' and foreignField: '_id.mindmapId' values. I have also tried doing the lookup first and grouping by _id {day, mindmapId, authorId}, but I was never able to get that to work.

What can I do to make this query work? I'm sure it has something to do with $match and $lookup.

Answer:

If I understand you correctly (since you didn't add the requested result), the simple option is:

db.MindmapView.aggregate([
  {$group: {
      _id: {
        day: {$dateToString: {format: "%Y-%m-%d", date: "$startViewDate"}},
        mindmapId: {$substr: ["$_p_mindmap", 8, -1]}
      },
      watchTime: {
        $sum: {
          $dateDiff: {startDate: "$_created_at", endDate: "$_updated_at", unit: "second"}
        }
      },
      uniqueCount: {$addToSet: "$_p_user"}
    }
  },
  {$project: {_id: 1, total: {$size: "$uniqueCount"}, watchTime: 1}},
  {$lookup: {
      from: "MindMap",
      localField: "_id.mindmapId",
      foreignField: "_id",
      as: "author"
    }
  },
  {$set: {author: {$first: "$author._p_author"}}}
])

See how it works on the playground example.
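
As for why the attempt in the question behaved like a full join: after the $group stage the documents have no _objectId field, and MindMap documents have no _id.mindmapId field. $lookup treats a missing field as null on both sides, so every grouped document matched every MindMap document. The $match then compared tempMindmapPointer._id with the literal string '_id.mindmapId' rather than with the value of that field (comparing two fields needs $expr), so nothing passed. The sketch below is a simplified plain-JavaScript model of that equality rule, not MongoDB's actual implementation; the helper names getPath and lookup and the second MindMap document are made up for illustration.

```javascript
// Simplified model of $lookup equality matching: a missing field counts as null.
function getPath(doc, path) {
  let cur = doc;
  for (const key of path.split(".")) {
    if (cur === null || cur === undefined) return null;
    cur = cur[key];
  }
  return cur === undefined ? null : cur;
}

// Attach to each local document the foreign documents whose field value is equal.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter(
      (f) => getPath(f, foreignField) === getPath(doc, localField)
    ),
  }));
}

// One grouped row from the question, and two MindMap documents
// (the second one is invented for this example).
const grouped = [
  { _id: { day: "2022-08-15", mindmapId: "Yg5uGI3Iy0" }, total: 1, watchTime: 9 },
];
const mindMaps = [
  { _id: "Yg5uGI3Iy0", _p_author: "_User$zqPzSKD7EM" },
  { _id: "otherMapId", _p_author: "_User$otherUser" },
];

// Fields from the question: both paths are missing, so null === null everywhere.
const broken = lookup(grouped, mindMaps, "_objectId", "_id.mindmapId", "tempMindmapPointer");
// Fields from the answer: the grouped id is compared with the MindMap _id.
const fixed = lookup(grouped, mindMaps, "_id.mindmapId", "_id", "author");

console.log(broken[0].tempMindmapPointer.length, fixed[0].author.length);
```

With the fields from the question, the single grouped row is joined to both MindMap documents; with the fields from the answer it is joined only to its own mind map.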

There is another option that may be a little more efficient: using $lookup with a pipeline to bring back only the author from the MindMap collection, instead of fetching the entire document and then filtering it. In this case the $lookup stage will be:

  {
    $lookup: {
      from: "MindMap",
      let: {id: "$_id.mindmapId"},
      pipeline: [
        {$match: {$expr: {$eq: ["$$id", "$_id"]}}},
        {$project: {_p_author: 1, _id: 0}}
      ],
      as: "author"
    }
  }
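
Putting the pieces together, a minimal sketch of the whole pipeline with the pipeline-style $lookup might look like the following. The final $merge stage and the target collection name MindmapDailyStats are assumptions added here to cover the "save it into a collection" part of the question; they are not part of the answer above. The author is stripped of its "_User$" prefix the same way the question does it. The sketch assumes a MongoDB version that supports $dateDiff (5.0 or later).

```javascript
// Sketch only: stage contents come from the question and the answer above,
// except the $merge stage and the "MindmapDailyStats" name, which are assumptions.
const pipeline = [
  {
    $group: {
      _id: {
        day: { $dateToString: { format: "%Y-%m-%d", date: "$startViewDate" } },
        // "MindMap$" is 8 characters long; skip it to keep only the object id
        mindmapId: { $substr: ["$_p_mindmap", 8, -1] },
      },
      watchTime: {
        $sum: {
          $dateDiff: { startDate: "$_created_at", endDate: "$_updated_at", unit: "second" },
        },
      },
      uniqueCount: { $addToSet: "$_p_user" },
    },
  },
  { $project: { _id: 1, total: { $size: "$uniqueCount" }, watchTime: 1 } },
  {
    $lookup: {
      from: "MindMap",
      let: { id: "$_id.mindmapId" },
      pipeline: [
        { $match: { $expr: { $eq: ["$$id", "$_id"] } } },
        { $project: { _p_author: 1, _id: 0 } },
      ],
      as: "author",
    },
  },
  // Take the single joined author and drop the 6-character "_User$" prefix
  { $set: { author: { $substr: [{ $first: "$author._p_author" }, 6, -1] } } },
  // Assumption: write each (day, mindmapId) row into a stats collection
  {
    $merge: {
      into: "MindmapDailyStats",
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert",
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh this would be run as db.MindmapView.aggregate(pipeline). Because $merge matches on _id, re-running the pipeline replaces the existing row for a given day and mindmapId instead of duplicating it.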