I'm trying to build a toolkit of Func classes built on the fuzzystrmatch Postgres extension.
For instance, I have this wrapper, which takes an Expression and a search term and returns the Levenshtein distance:
from django.db.models import Func


class Levenshtein(Func):
    """Calculate the Levenshtein distance between two strings."""

    # The search term is interpolated into the SQL template as a quoted literal.
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            **extras
        )
It is called like this, using an F expression:
Author.objects.annotate(lev_dist=Levenshtein(F('name'), 'JRR Tolkien')).filter(lev_dist__lte=2)
However, if the 'name' field here is longer than 255 characters, it throws an error:
Both source and target can be any non-null string, with a maximum of 255 characters.
I can truncate the name when I annotate using Substr:
Author.objects.annotate(clipped_name=Substr(F('name'),1,250))
But I can't figure out how to place that logic inside the Func itself. Here is my attempt, which wraps the Substr in an ExpressionWrapper and sets the output_field as per the docs:
from django.db.models import ExpressionWrapper, Func, TextField
from django.db.models.functions import Substr


class Levenshtein(Func):
    """Calculate the Levenshtein distance between two strings."""

    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression=ExpressionWrapper(Substr(expression, 1, 250), output_field=TextField()),
            search_term=search_term,
            **extras
        )
CodePudding user response:
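The docs don't make this very clear, but experimenting showed that the answer is to drop the expression= keyword and pass the ExpressionWrapper directly as the first positional argument. Func.__init__ takes its source expressions as positional arguments, so anything passed by keyword ends up among the extra template values instead of in %(expressions)s: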
from django.db.models import ExpressionWrapper, Func, TextField
from django.db.models.functions import Substr


class Levenshtein(Func):
    """Calculate the Levenshtein distance between two strings."""

    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            # Positional, so the truncated column becomes a source expression.
            ExpressionWrapper(Substr(expression, 1, 250), output_field=TextField()),
            search_term=search_term,
            **extras
        )
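
To see why the keyword form fails, here is a rough plain-Python sketch of the mechanism. ToyFunc and ToyLevenshtein are hypothetical stand-ins written for this illustration, not Django's actual classes, and the column is represented by an already-rendered SQL string. The only assumption carried over from Django is that positional arguments become the source expressions while keyword arguments are kept only as extra template values.

```python
class ToyFunc:
    """Toy stand-in (not Django code) for a Func-like base class."""

    template = "%(function)s(%(expressions)s)"
    function = None

    def __init__(self, *expressions, **extra):
        # Positional arguments become the source expressions...
        self.source_expressions = list(expressions)
        # ...while keyword arguments are only kept as template context.
        self.extra = extra

    def render(self):
        # Build the template context; unused keys in `extra` are ignored.
        context = {
            "function": self.function,
            "expressions": ", ".join(str(e) for e in self.source_expressions),
            **self.extra,
        }
        return self.template % context


class ToyLevenshtein(ToyFunc):
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"


# Keyword form: the column never reaches %(expressions)s.
broken = ToyLevenshtein(expression="SUBSTRING(name, 1, 250)", search_term="JRR Tolkien")
# Positional form: the column is rendered as the first argument.
fixed = ToyLevenshtein("SUBSTRING(name, 1, 250)", search_term="JRR Tolkien")

print(broken.render())  # levenshtein(, 'JRR Tolkien')
print(fixed.render())   # levenshtein(SUBSTRING(name, 1, 250), 'JRR Tolkien')
```

In the keyword form the first argument of levenshtein is left empty, which is consistent with the original attempt producing invalid SQL rather than a truncated name.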