I want to find which are the top 3 places(if it has) that a user comments most.
More specificle i have a collection review which has the user_id and the business_id and a collection business which has bussines_id and longitude and latitude. If the user has made a review from the location (38.551126, -110.880452) and (38.999999, -110.000000), we can say the user has 2 reviews at this location (38, -110).
collection.review
{
"review_id": "KU_O5udG6zpxOg-VcAEodg",
"user_id": "mh_-eMZ6K5RLWhZyISBhwA",
"business_id": "XQfwVwDr-v0ZS3_CbbE5Xw",
"stars": 3,
"useful": 0,
"funny": 0,
"cool": 0,
"text": "If you decide to eat here, ...",
"date": "2018-07-07 22:09:11"
}
collection.business
{
"business_id": "XQfwVwDr-v0ZS3_CbbE5Xw",
"name": "Turning Point of North Wales",
"address": "1460 Bethlehem Pike",
"city": "North Wales",
"state": "PA",
"postal_code": "19454",
"latitude": 40.2101961875,
"longitude": -75.2236385919,
"stars": 3,
"review_count": 169,
"is_open": 1,
"attributes": Object
"categories": "Restaurants, Breakfast & Brunch, Food",
"hours": Object
}
I am a beginner at mongodb, the only pipeline i have some result is this.
review.aggregate([{
$match: {
user_id: {
$in: [some_list]
}
}
}, {
$lookup: {
from: 'business',
localField: 'business_id',
foreignField: 'business_id',
as: 'business'
}
}, {
$unwind: {
path: '$business'
}
}, {
$group: {
_id: '$user_id',
coordinates: {
$push: {
latitude: {
$toInt: '$business.latitude'
},
longitude: {
$toInt: '$business.longitude'
}
}
},
places: {
$sum: 1
}
}
}])
output
_id: "xoZvMJPDW6Q9pDAXI0e_Ww"
coordinates: Array
0: Object
latitude: 39
longitude: -119
1: Object
latitude: 39
longitude: -119
2: Object
latitude: 39
longitude: -119
3: Object
latitude: 39
longitude: -119
4: Object
latitude: 39
longitude: -119
places: 5
After that i process the result in python. Is there someway to do it from the pipeline immediately and have the result something like this (and limited to top 3)
_id: "xoZvMJPDW6Q9pDAXI0e_Ww"
top_places: Array
0: Object
latitude: 39
longitude: -119
1: Object
latitude: 23
longitude: 56
CodePudding user response:
You just need to add an additional grouping stage, first group by location and user and only then group by user, like so:
db.review.aggregate([
{
$lookup: {
from: "business",
localField: "business_id",
foreignField: "business_id",
as: "business"
}
},
{
$unwind: {
path: "$business"
}
},
{
$group: {
_id: {
user: "$user_id",
latitude: {
$toInt: "$business.latitude"
},
longitude: {
$toInt: "$business.longitude"
}
},
places: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.user",
coordinates: {
$push: {
latitude: "$_id.latitude",
longitude: "$_id.longitude",
}
},
places: {
$sum: "$places"
}
}
}
])