Counting many-to-many field returns wrong value (Django)-CodePudding

I have a model class Student:

class Student(models.Model):
   ...

and model class Course:

class Course(models.Model)
   students = models.ManyToManyField(Student)

I now want to filter Course based on number of Students associated with a course. I tried:

Course.objects.annotate(student_count = Count('students'))

But for some reason, student_count is always returning one.

Let's say I create a course and add two students to it:

s1 = Student.objects.create()
s2 = Student.objects.create()
m1 = Course.objects.create()
m1.students.add(s1)
m1.students.add(s2)

print(Course.objects.all().first().students.count())
print(Course.objects.annotate(student_count = Count('students')).first().student_count

Prints

2
1

Why are these two values different? How can I filter courses based on the number of Students?

CodePudding user response：

I tested your scenario and the result is the same for two approach:

class SOTestCase(TestCase):
    def setUp(self):
        s1 = Student.objects.create()
        s2 = Student.objects.create()
        m1 = Course.objects.create()
        m1.students.add(s1)
        m1.students.add(s2)

    @override_settings(DEBUG=True)
    def test_query(self):
        c1 = Course.objects.all().first()
        c2 = Course.objects.annotate(student_count = Count('students')).first()

        n1 = c1.students.count()
        n2 = c2.student_count

        self.assertEqual(n1, 2)
        self.assertEqual(n2, 2)

Result is OK:

Ran 1 test in 0.017s

OK

Maybe do you have ordering? Try removing ordering:

        c1 = (
            Course
            .objects
            .order_by() #<-- this one
            .first())
        c2 = (
            Course
            .objects
            .order_by() #<-- this one
            .annotate(student_count = Count('students'))
            .first())

CodePudding user response：

Your problem is coming from annotation part. Every time that you add annotation part to your queryset, you also add a group_by to your query and if you don't add specific group_by to your queryset it automatically adds it for you (i.e. group_by "id"). Then your results become smaller and it'll remove duplicates from your query results (because of group_by). For instance look at these querysets:

1- Course.objects.all()[3].students.count()

2- Course.objects.annotate(student_count=Count("students"))[3].student_count

Although I tried to get third result from both of my querysets, the final result is different, because length of these two querysets are different (Course.objects.all() and Course.objects.annotate(student_count=Count("students")). So if you look at these queries which are related to our querysets:

1- {'sql': 'SELECT *, COUNT("students"."id") AS "student_count" FROM "course" LEFT OUTER JOIN "students" ON ("course"."id" = "students"."course_id") GROUP BY "course"."id" ORDER BY "course"."id" ASC LIMIT 1 OFFSET 3'}


2- {'sql': 'SELECT COUNT(*) AS "__count" FROM "students" WHERE "students"."course_id" = 4'}

you can see the group by part in second query. And also note that you should use second one for proper result (except for some situations that you know the exact index of your desired result in first queryset). Otherwise there are always situations which you might get different or wrong results because your first or last record might change in annotations.