How do I add a custom scoring script in Painless?

Time:02-11

This language is not painless at all: there are zero examples and the docs are lacking.

I am trying to build a custom distance function between embeddings. I have done it in Python:

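import numpy as np

def my_norm(x, y):
    # x: (N, m) array of embeddings, y: (m,) target vector
    norm_embeddings = np.sqrt((x * x).sum(axis=1))  # length of each row of x
    norm_target = np.sqrt((y * y).sum())            # length of y
    z = y - x                                       # y is broadcast across the rows of x
    norm_top = np.sqrt((z * z).sum(axis=1))         # Euclidean distance from each row to y
    # one normalized distance per row of x
    return norm_top / (norm_embeddings * norm_target)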

where x is an N-by-m array and y is a vector of length m.

Here is what I have in Painless; I don't even know if this will work:

def normalized_euclidean_dist(def x, def y){
  def norm_embeddings= new ArrayList();
  def norm_target = Math.pow((y*y).sum(), 0.5);
  def z = y-x;
  def norm_top=new ArrayList();
  for (a in x){
    norm_embeddings.add(Math.pow((a*a).sum(), 0.5))
  }
  for (a in z){
    norm_top.add(Math.pow((a*a).sum(), 0.5))
  }
  return norm_top/(norm_embeddings * norm_target)
}

How do I call this function in the script?

Answer:

It's hard to debug this without the actual documents and index mapping, but this is the general way to use script-based sorting:

GET some_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": """
        float normalized_euclidean_dist(def x, def y){
          def norm_embeddings= new ArrayList();
          def norm_target = Math.pow((y*y).sum(), 0.5);
          def z = y-x;
          def norm_top=new ArrayList();
          for (a in x){
            norm_embeddings.add(Math.pow((a*a).sum(), 0.5));
          }
          for (a in z){
            norm_top.add(Math.pow((a*a).sum(), 0.5));
          }
          return norm_top/(norm_embeddings * norm_target);
        }
        
        return normalized_euclidean_dist(doc['x_field'], doc['y_field']);
        """
      },
      "order": "asc"
    }
  }
}

Replace 'x_field' and 'y_field' with the actual field names. As far as I understand, both of them are arrays; if not, you need to use doc['x_field'].value instead.

Sorting scripts need to return a number as their result. I may be missing something, but it looks to me like your script is trying to divide an array by a number or by another array, which won't work, so there is still work to do there as well.

Script sorting docs: https://www.elastic.co/guide/en/elasticsearch/reference/current/sort-search-results.html#script-based-sorting
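To make the "must return a number" point concrete, here is a minimal Python sketch of the same distance computed for a single document vector, so that the result is one scalar instead of an array. It is not Painless and has not been run inside Elasticsearch; it only shows the element-by-element loop that a sort script would need to reproduce. The function name is taken from the question, while the zero-length handling is an assumption added for illustration.

```python
import math

def normalized_euclidean_dist(x, y):
    """Scalar distance between one document vector x and the target vector y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    sum_x = sum_y = sum_diff = 0.0
    # Accumulate the three sums of squares in a single pass over the elements.
    for a, b in zip(x, y):
        sum_x += a * a
        sum_y += b * b
        sum_diff += (b - a) * (b - a)
    denom = math.sqrt(sum_x) * math.sqrt(sum_y)
    if denom == 0.0:
        # Assumption: a zero-length vector is treated as infinitely far away.
        return math.inf
    return math.sqrt(sum_diff) / denom

# Example: |y - x| = 5, |x| = 5, |y| = 10, so the result is 5 / 50.
print(normalized_euclidean_dist([3.0, 4.0], [6.0, 8.0]))
```

Because each document is scored on its own, the per-row lists from the original function collapse into plain running sums, and the final division is between two numbers.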
