I have a database with Products and each product has an Id, Name, ManufacturerId, CategoryId and UserScore.
I want to retrieve all Products by a given Category sorted by UserScore, but avoiding many products of same Manufacturer listed together.
With the following query they all stuck together:
SELECT
P.ProductId, P.Name, P.ManufacturerId, P.UserScore
FROM Products P
WHERE P.CategoryId = 1
ORDER BY P.UserScore
This is the result in T-SQL
In T-SQL I came up with a solution like the following, where Products are grouped in no more than 2 elements by Manufacturer, and it suits perfectly my needs:
SELECT T.*
FROM (
SELECT
P.ProductId, P.Name, P.ManufacturerId, P.UserScore,
ROW_NUMBER() OVER (PARTITION BY P.ManufacturerId ORDER BY P.UserScore DESC) RN
FROM Products P
WHERE P.CategoryId = 1
) T
ORDER BY T.UserScore / CEILING(RN/2.0) DESC
How could I implement a ElasticSearch Query to mimic this behaviour?
Any ideas?
The index in elasticsearch would be like this, this is just an abstract example:
{"ProductId": "157072", "Name": "Product 157072", "ManufacturerId": "7790", "UserScore": "100000", "CategoryId": "1"},
{"ProductId": "296881", "Name": "Product 296881", "ManufacturerId": "6921", "UserScore": "35400", "CategoryId": "1"},
{"ProductId": "353924", "Name": "Product 353924", "ManufacturerId": "54616", "UserScore": "25000", "CategoryId": "1"},
...
CodePudding user response:
You can use the collapse search function to group all the manufacturers:
https://www.elastic.co/guide/en/elasticsearch/reference/current/collapse-search-results.html
Visit "inner_hits" to control the collapsed results behavior.
# Indexing Documents
POST test_so/_bulk
{ "index" : {} }
{"ProductId": "157072", "Name": "Product 157072", "ManufacturerId": "7790", "UserScore": 100000, "CategoryId": "1"}
{ "index" : {} }
{"ProductId": "296881", "Name": "Product 296881", "ManufacturerId": "6921", "UserScore": 35400, "CategoryId": "1"}
{ "index" : {} }
{"ProductId": "353924", "Name": "Product 353924", "ManufacturerId": "54616", "UserScore": 25000, "CategoryId": "1"}
# Filtering by Category: 1, collapsing by Manufacturer and sorting by UserScore
POST test_so/_search
{
"query": {
"term": {
"CategoryId.keyword": {
"value": "1"
}
}
},
"collapse": {
"field": "ManufacturerId.keyword"
},
"sort": [
{
"UserScore": {
"order": "desc"
}
}
]
}
Results
{
"took": 22,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "test_so",
"_id": "0amBPYQBJRm5qR4vd6NE",
"_score": null,
"_source": {
"ProductId": "157072",
"Name": "Product 157072",
"ManufacturerId": "7790",
"UserScore": 100000,
"CategoryId": "1"
},
"fields": {
"ManufacturerId.keyword": [
"7790"
]
},
"sort": [
100000
]
},
{
"_index": "test_so",
"_id": "0qmBPYQBJRm5qR4vd6NE",
"_score": null,
"_source": {
"ProductId": "296881",
"Name": "Product 296881",
"ManufacturerId": "6921",
"UserScore": 35400,
"CategoryId": "1"
},
"fields": {
"ManufacturerId.keyword": [
"6921"
]
},
"sort": [
35400
]
},
{
"_index": "test_so",
"_id": "06mBPYQBJRm5qR4vd6NE",
"_score": null,
"_source": {
"ProductId": "353924",
"Name": "Product 353924",
"ManufacturerId": "54616",
"UserScore": 25000,
"CategoryId": "1"
},
"fields": {
"ManufacturerId.keyword": [
"54616"
]
},
"sort": [
25000
]
}
]
}
}
CodePudding user response:
Try following which is assuming all items in Group have same values. So I used First()
var results = products.Where(x => x.CategoryId == 1)
.OrderByDescending(x => x.UserScore)
.GroupBy(x => x.ManufacturerId)
.Select(x => new {ProductId = x.ProductId.First(), Name = x.Name.First(), ManufacturerId = x.Key, UserScore = x.UserScore.First()})