There is a parser for sports. It works in a circle. By some logic, sports events are added to the database. In each parsing cycle, a sports event can be added to the database, but it may not be added)
I need to get all sports events from the last two parsing cycles. But, if there is a sporting event in both cycles, then only from the last one. This is the problem. Sample documents:
{
"command1": "A",
"command2": "B",
"parseCount": 0
},
{
"command1": "A",
"command2": "B",
"parseCount": 1
},
{
"command1": "A",
"command2": "B",
"parseCount": 2
},
{
"command1": "C",
"command2": "D",
"parseCount": 1
},
{
"command1": "E",
"command2": "F",
"parseCount": 2
},
As a result, I should get the last 3 documents from the list above. The document also has fields: match time and ObjectId
https://mongoplayground.net/p/-9gz4zOnudW
CodePudding user response:
If I've understood correctly you can first $sort
and then get $first
object into $group
like this:
I've used $first: $$ROOT
but you can use $first: value
for each value from the object if you want.
This query:
- First sort by
parseCount
to get the higher value in first position. - Then
$group
by two conditions, getting the first object (as is ordered is the higher one) - And use
$project
to get output values you want.
db.collection.aggregate([
{
"$sort": {
"parseCount": -1
}
},
{
"$group": {
"_id": {
"command1": "$command1",
"command2": "$command2"
},
"object": {
"$first": "$$ROOT"
}
}
},
{
"$project": {
"_id": "$object._id",
"command1": "$object.command1",
"command2": "$object.command2",
"parseCount": "$object.parseCount"
}
}
])
Example here
CodePudding user response:
Query
- its like 2 queries but can become 1 with lookup
- facet could be used and do 2 groups but would be limited to max 16mb data in collection, the bellow solution hasn't this limitation
- finds the max parse for the collection using the lookup (mongodb auto-optimizes it, so the pipeline in lookup will run only 1 time not for each document of collection, at least this happened when i tested it in the past)
- we keep only the last 2 parses, for example if
max=3
we keepparseCount=3
andparseCount=2
, we also keep only parseCount>0, you had that filter on Playground, if you dont need it remove it. - group by
command1,command2
and keep only the maxparseCount
you said that we keep only the latest if we have more than 1 - project to restore the document structure,
matchTime
and_id
are kept also because you said you have those also
db.collection.aggregate([
{
"$lookup": {
"from": "collection",
"pipeline": [
{
"$group": {
"_id": null,
"maxParse": {
"$max": "$parseCount"
}
}
}
],
"as": "result"
}
},
{
"$set": {
"maxParses": {
"$let": {
"vars": {
"v0": {
"$arrayElemAt": [
"$result",
0
]
}
},
"in": "$$v0.maxParse"
}
}
}
},
{
"$unset": [
"result"
]
},
{
"$match": {
"$expr": {
"$and": [
{
"$gt": [
"$parseCount",
0
]
},
{
"$gte": [
"$parseCount",
{
"$subtract": [
"$maxParses",
1
]
}
]
}
]
}
}
},
{
"$group": {
"_id": {
"command1": "$command1",
"command2": "$command2"
},
"maxParseCount": {
"$max": {
"parseCount": "$parseCount",
"matchTime": "$matchTime",
"id": "$_id"
}
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$_id",
"$maxParseCount",
"$$ROOT"
]
}
}
},
{
"$project": {
"command1": 1,
"command2": 1,
"parseCount": 1,
"matchTime": 1,
"_id": "$id"
}
}
])
To explain what query does based on your data
- will find maxParse=2 (lookup does this)
- first document will filtered out, because 2,1 parses are only kept and it has 0
- the rest one will be grouped by command1,command2
"A","B" has 2 documents(
_id=2,_id=3
) but only_id=3
will pass because it has the max parseCount
[
{
"_id": 1,
"command1": "A",
"command2": "B",
"parseCount": 0,
"matchTime": 1
},
{
"_id": 2,
"command1": "A",
"command2": "B",
"parseCount": 1,
"matchTime": 2
},
{
"_id": 3,
"command1": "A",
"command2": "B",
"parseCount": 2,
"matchTime": 3
},
{
"_id": 4,
"command1": "C",
"command2": "D",
"parseCount": 1,
"matchTime": 4
},
{
"_id": 5,
"command1": "E",
"command2": "F",
"parseCount": 2,
"matchTime": 5
}
]
Results
[
{
"_id": 3,
"command1": "A",
"command2": "B",
"matchTime": 3,
"parseCount": 2
},
{
"_id": 4,
"command1": "C",
"command2": "D",
"matchTime": 4,
"parseCount": 1
},
{
"_id": 5,
"command1": "E",
"command2": "F",
"matchTime": 5,
"parseCount": 2
}
]