A document in my mongodb collection is as follows:
{
"transaction_id": "e10cc8d64204394cd35329a88dc4ab8f",
"timestamp": "2021-05-11 17:47:50",
"client_id": "[email protected]",
"ip": "172.69.34.31",
"level": "OK",
"request": {
"method": "POST",
"status_code": 200,
"status_name": "Ok",
"url": "/ocr/driver"
},
"bytes": 297,
"message": "Status authenticated successfully"
},
{
"transaction_id": "e10cc8d64204394cd35329a88dc4ab8f",
"timestamp": "2021-05-11 17:47:50",
"client_id": "[email protected]",
"ip": "172.69.34.31",
"level": "OK",
"request": {
"method": "POST",
"status_code": 200,
"status_name": "Ok",
"url": "/status/driver"
},
"bytes": 297,
"message": "Status authenticated successfully"
}......
I can get the total of each request.url by timestamp(daily-based) with the following query.
db.getCollection('collection').aggregate([
{
$match: {
client_id: { $regex : '^chae@'},
'request.url': { $regex : '^\/ocr'}}
},
{ '$group' :
{
'_id': { $substr: [ "$timestamp", 0, 10 ] },
count: { $sum: 1 }
}
},
{
'$sort':
{ 'count': -1}
}
])
then I get the result of request.url in which contains /^ocr/
_id | count
2021-10-01 | 10
2021-09-30 | 15
2021-09-29 | 11
....
It seems ok. However if I want to get a different request.url, I have to send another query. Can I do get the result of all different request.url at a time with one single query?
My desired outcome is something like this.
_id | count(ocr) | count(status) | count(something else)
2021-10-01 | 10 | 20 | 56
2021-09-30 | 15 | 26 | 28
2021-09-29 | 11 | 87 | 466
I even don't know if it's possible in mongodb? Please advise me. Thanks.
CodePudding user response:
You can use $facet to get multiple results like this:
If you are using any programming language you can easily create the object for the query to not repeat code.
db.collection.aggregate({
"$facet": {
"ocr": [
// your query here
],
"status": [
// your query using $regex : '^\/status'
],
"whatever":[
// your query using $regex : '^\/whatever'
]
}
})
Example here
CodePudding user response:
Query
- group accepts expression also, so we can calculate the group value
- here we have 2 group keys
- those that their email starts with chae and url with /ocr
- those that their email starts with chae and url with /status
- we sum those seperatly
- if you add more cases on the
$switch
you can do this for many different searches - if you dont know those cases when you write the query, you need to contruct this
{"$switch" ...}
dynamically you can do it with a loop, its easy to do, loop on the variables and create it - if you have an index on those fields, you can add a
$match
on top, like allow only
(or (chae and /ocr) (chae and /status) ...)
There is also a $facet
solution (facet = 1 aggregation per field)
- if index is used on the $match , $facet or $group similar perfomance (you do regex match so test the index use), but $facet simpler query
- if not index,
$facet
will do 1 collection scan per field, so go for the$group
i think
aggregate(
[{"$group":
{"_id":
{"reg":
{"$switch":
{"branches":
[{"case":
{"$and":
[{"$regexMatch": {"input": "$client_id", "regex": "^chae@"}},
{"$regexMatch": {"input": "$request.url", "regex": "^/ocr"}}]},
"then": "chae-ocr"},
{"case":
{"$and":
[{"$regexMatch": {"input": "$client_id", "regex": "^chae@"}},
{"$regexMatch":
{"input": "$request.url", "regex": "^/status"}}]},
"then": "chae-status"}],
"default": "other"}},
"time": {"$substrCP": ["$timestamp", 0, 10]}},
"count": {"$sum": 1}}},
{"$project": {"_id": "$_id.time", "count": 1, "searched": "$_id.reg"}}])