How to use Substr with an F Expression

Time:12-21

I'm trying to build a toolkit of Django Func classes on top of PostgreSQL's fuzzystrmatch extension.

For instance, I have this wrapper, which takes an expression and a search term and returns the Levenshtein distance:

from django.db.models import Func


class Levenshtein(Func):
    """This function calculates the Levenshtein distance between two strings."""
    # search_term is formatted into the SQL as a quoted literal.
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression,
            search_term=search_term,
            **extras
        )

Called like this, using an F Expression:

Author.objects.annotate(lev_dist=Levenshtein(F('name'), 'JRR Tolkien')).filter(lev_dist__lte=2)
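
For reference, here is a rough pure-Python sketch of the distance being filtered on. It assumes the default unit costs for insertion, deletion, and substitution, and it is only an illustration of the metric, not the extension's actual implementation:

```python
def levenshtein(source: str, target: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    # previous[j] is the distance between the already processed prefix
    # of source and target[:j].
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # distance from source[:i] to the empty string
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                       # delete s_char
                current[j - 1] + 1,                    # insert t_char
                previous[j - 1] + (s_char != t_char),  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("JRR Tolkien", "JR Tolkien"))      # 1
print(levenshtein("JRR Tolkien", "J.R.R. Tolkien"))  # 3
```

So with lev_dist__lte=2, a name stored as "JR Tolkien" would match, while "J.R.R. Tolkien" would not.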

However, if the 'name' value is longer than 255 characters, it throws an error:

Both source and target can be any non-null string, with a maximum of 255 characters.

I can truncate the name when I annotate using Substr:

Author.objects.annotate(clipped_name=Substr(F('name'),1,250))
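
Note that Substr follows SQL's convention of 1-based positions, so Substr(F('name'), 1, 250) keeps the first 250 characters. A small plain-Python sketch of that behaviour (substr here is a hypothetical helper for illustration, not part of Django):

```python
def substr(text: str, pos: int, length: int) -> str:
    """Mimic SQL SUBSTR(text, pos, length) with a 1-based start position."""
    if pos < 1:
        raise ValueError("'pos' must be greater than 0")
    start = pos - 1  # convert the 1-based position to a Python index
    return text[start:start + length]


long_name = "x" * 400            # longer than the 255-character limit
clipped = substr(long_name, 1, 250)
print(len(clipped))              # 250, safely under the limit
```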

But I can't figure out how to place that logic inside the Func itself. Here is my attempt, which wraps the Substr in an ExpressionWrapper and sets the output_field as per the docs:

class Levenshtein(Func):
    """This function calculates the Levenshtein distance between two strings:"""
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            expression=ExpressionWrapper(Substr(expression, 1, 250), output_field=TextField()),
            search_term=search_term,
            **extras
        ) 

Answer:

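The docs don't make this very clear, but experimenting showed that the fix is to drop the expression= keyword and pass the ExpressionWrapper directly as the first positional argument. Func.__init__ takes its source expressions positionally, while other keyword arguments such as search_term are kept as extra context for the template: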

from django.db.models import ExpressionWrapper, Func, TextField
from django.db.models.functions import Substr


class Levenshtein(Func):
    """This function calculates the Levenshtein distance between two strings."""
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, expression, search_term, **extras):
        super(Levenshtein, self).__init__(
            # Clip to 250 characters to stay under the 255-character limit.
            ExpressionWrapper(Substr(expression, 1, 250), output_field=TextField()),
            search_term=search_term,
            **extras
        )
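
To see why the keyword form fails, here is a simplified stand-in for how Func handles its arguments. ToyFunc is a hypothetical class written for this illustration, not Django's real code, and plain strings stand in for compiled SQL fragments. It assumes, as in the Django versions I'm aware of, that positional arguments become the source expressions and keyword arguments only feed the template:

```python
class ToyFunc:
    """Simplified stand-in for django.db.models.Func (not Django's real code)."""
    template = "%(function)s(%(expressions)s, '%(search_term)s')"
    function = "levenshtein"

    def __init__(self, *expressions, **extra):
        self.source_expressions = list(expressions)  # positional arguments
        self.extra = extra                           # keyword arguments

    def render(self) -> str:
        # Keyword arguments are available to the template, but the
        # %(expressions)s slot is filled only from positional arguments.
        context = dict(self.extra)
        context["function"] = self.function
        context["expressions"] = ", ".join(self.source_expressions)
        return self.template % context


keyword_form = ToyFunc(expression="SUBSTRING(name, 1, 250)", search_term="JRR Tolkien")
positional_form = ToyFunc("SUBSTRING(name, 1, 250)", search_term="JRR Tolkien")

print(keyword_form.render())     # levenshtein(, 'JRR Tolkien')
print(positional_form.render())  # levenshtein(SUBSTRING(name, 1, 250), 'JRR Tolkien')
```

With expression=..., the value lands in the extra context under a name the template never uses, so the first argument of levenshtein comes out empty.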