I would like to boost scores of documents based on how "recent" a document is. I am trying to do this using a function_score
. Here is an example of me doing this on a field called updated_at
:
{
"function_score": {
"boost_mode": "sum",
"functions": [
{
"exp": {
"updated_at": {
"origin": "now",
"scale": "1h",
"decay": 0.01,
},
},
"weight": 1,
}
],
"query": query
},
}
I would expect documents close to the datetime now
will have a score closer to 1, and documents closer to scale
will have a score closer to decay
(as described in the docs). Therefore, I'm using the boost_mode
sum, to keep the original document scores, and increase depending on how close to now
the updated_at
value is. (Also, the query score is useful so I would rather add than multiply, which is the default).
To test this scenario, I create a document (A) that returns a query score of about 2. I then duplicate it (B) and modify the new document's updated_at
timestamp to be an hour in the past.
In this scenario, I would expect (A) to have a higher score and (B) to have a lower score. However, when I run this scenario, I get the exact opposite. (B) ends up with a score of 3 and (A) ends up with a score of 2.
What am I misunderstanding here to cause this to happen? And how would I modify my function score to do what I would like?
CodePudding user response:
This turned out to be a a timezone issue.
I ended up using the explain API to look at what was contributing to the score. When doing that, I noticed that the origin
set to now
was actually in a different timezone to the one I was setting in the documents.
I fixed this by manually providing a UTC timestamp in the elasticsearch query rather than using now
as the value.
(If there is a better way to do this, please let me know)