I have a collection which includes documents with this structure:
{
car: carId,
seller: sellerId,
buyer: buyerId,
timestamp: timestamp
}
I want to extract unique new(first time owned a car) car owners count by monthly.
So expected result would be like:
[
{
date: 2022-01-01T00:00:00.000 00:00,
newCarOwners: 28
},
{
date: 2022-02-01T00:00:00.000 00:00,
newCarOwners: 18
},
{
date: 2022-03-01T00:00:00.000 00:00,
newCarOwners: 10
}
...
]
I think I can achieve it by grouping documents by month and carId, then find owners (last transaction's buyerId for each carId) by month. Then exclude earlier months'(cumulative) owners.
$setWindowFields
might be useful but I couldn't find a way to exclude earlier months' owners.
Right now my pipeline looks like this:
[
{
$set: {
month: {
$dateTrunc: {
date: {
$toDate: '$timestamp'
},
unit: 'month',
binSize: 1,
timezone: 'GMT',
}
}
}
}, {
$group: {
_id: {
month: '$month',
carId: '$carId'
},
doc: {
$max: {
ts: '$timestamp',
buyer: '$buyer'
}
}
}
}, {
$group: {
_id: {
month: '$_id.month'
},
owners: {
$addToSet: '$doc.buyer'
}
}
}
]
I'm open for any idea/solution, it could be some other way than mine too.
Thanks.
CodePudding user response:
db.collection.aggregate([
{
$group: {
_id: "$buyer",
firstTime: {
$min: "$timestamp"
},
count: { $sum: 1 }
}
},
{
$match: { count: 1 }
},
{
$group: {
_id: {
$dateTrunc: {
date: { $toDate: "$firstTime" },
unit: "month",
binSize: 1,
timezone: "GMT"
}
},
newCarOwners: { $sum: 1 }
}
}
])