Home > Enterprise >  Queryset for a sql query
Queryset for a sql query

Time:11-20

See Filtering unique values for the problem description, sample data and postgres query. I'd like to convert the SQL to a queryset. I feel like I'm close but not quite.

SELECT Column_A, Column_B, Column_C, 0 as RN
FROM TABLE
WHERE COLUMN_C is null and Column_B in (UserA, UserB, UserC)

UNION ALL

SELECT Column_A, Column_B, Column_C, RN
FROM (
  SELECT A.*, ROW_NUMBER() over (partition by A.column_C Order by case A.column_B when 'UserA' then 0 else 1 end, U.Time_Created) rn
  FROM Table A
  INNER JOIN user U
    on U.Column_B = A.Column_B
  WHERE A.Column_C is not null and ColumnB in (userA, userB, UserC)) B
WHERE RN = 1

This is what I have so far:

qs1 = Table.objects.filter(Column_C__isnull=True).annotate(rn=Value(0))

qs2 = Table.objects.annotate(rn=Window(
   expression=RowNumber(),
   partition_by=[Column_C],
   order_by=[Case(When(Column_B=UserA, then=0), default=1), 'Table_for_Column_B__time_created']
)).filter(Column_C__isnull=False, rn=1)

return qs2.union(qs1)

This doesn't quite work.

django.db.utils.NotSupportedError: Window is disallowed in the filter clause.

Next, I tried pulling the intermediate result in a subquery, to allow for filtering in the outer query, since I only really need rows with row number = 1.

qs1 = Table.objects.filter(Column_C__isnull=True).annotate(rn=Value(0))

qs2 = Table.objects.annotate(rn=Window(
   expression=RowNumber(),
   partition_by=[Column_C],
   order_by=[Case(When(Column_B=UserA, then=0), default=1), 'Table_for_Column_B__time_created']
)).filter(pk=OuterRef('pk'))

qs3 = Table.objects.annotate(rn=Subquery(qs2.values('rn'))).filter(Column_C__isnull=False, rn=1)

return qs3.union(q1)

No exceptions this time, but this doesn't work. Every row in the table gets row_number=1 annotated. From the original example, the queryset returns all 7 rows instead of filtering to 5.

  1. Is it possible to filter on window expressions?
  2. What's the best practices to keep in mind when converting window queries to subqueries?
  3. Is there a better way to structure the queryset?

CodePudding user response:

You should be able to do this without a window expression, using a SubQuery

First create a queryset for the subquery that orders by the Column_B=UserA match and then time_created

from django.db.models import Case, When, Q, Subquery, OuterRef

tables_ordered = Table.objects.filter(
    Column_C=OuterRef('Column_C')
).annotate(
    user_match=Case(When(Column_B=UserA, then=0), default=1)
).order_by('user_match', 'time_created')

Then this subquery returns the first pk for the matched Column_C from the OuterRef, similar to selecting the first row from your window function

first_pk_for_each_column_c = Subquery(tables_ordered.values('pk')[:1])

Then use two Q objects to create an OR that selects the row if Column_C is NULL or the pk matches the first pk from the subquery

Table.objects.filter(
    Q(Column_C__isnull=True) | Q(pk=first_pk_for_each_column_c)
)
  • Related