C# mongo paging with aggregates

Time:12-30

I have a MongoDB collection that holds many students, and each student has multiple records like this:

[
  {
    "studentid": "stu-1234",
    "dept": "geog",
    "Status": 1,
    "CardSwipeTimestamp": "2021-11-25T10:50:00.5230694Z"
  },
  {
    "studentid": "stu-1234",
    "dept": "geog",
    "Status": 2,
    "CardSwipeTimestamp": "2021-11-25T11:50:00.5230694Z"
  },
  {
    "studentid": "stu-abc",
    "dept": "geog",
    "Status": 11,
    "CardSwipeTimestamp": "2021-11-25T09:15:00.5230694Z"
  },
  {
    "studentid": "stu-abc",
    "dept": "geog",
    "Status": 21,
    "CardSwipeTimestamp": "2021-11-25T11:30:00.5230694Z"
  }
]

I have an aggregate query, written in C# on .NET Core 3.1, that fetches records like these. Given a list of student ids and a department name, the query returns the latest record for each student. In this case it returns one record for sid=stu-abc and one for sid=stu-1234.

string[] sids = { array of Student ids here };
string deptName = "math";
var pipeline = new BsonDocument[]
{
    // Keep only the requested students in the requested department.
    new BsonDocument("$match",
        new BsonDocument
        {
            { "studentid", new BsonDocument("$in", BsonArray.Create(sids)) },
            { "dept", deptName }
        }
    ),
    // Newest swipe first, so that $first below picks the latest record.
    new BsonDocument("$sort", new BsonDocument("CardSwipeTimestamp", -1)),
    // One output document per (studentid, dept) pair.
    new BsonDocument("$group",
        new BsonDocument
        {
            { "_id",
                new BsonDocument
                {
                    { "studentid", "$studentid" },
                    { "dept", "$dept" }
                }
            },
            { "Status", new BsonDocument("$first", "$Status") },
            { "CardSwipeTimestamp", new BsonDocument("$first", "$CardSwipeTimestamp") }
        }
    ),
    // Flatten the group key back into top-level fields.
    new BsonDocument("$project",
        new BsonDocument
        {
            { "_id", 0 },
            { "studentid", "$_id.studentid" },
            { "dept", "$_id.dept" },
            { "Status", "$Status" },
            { "CardSwipeTimestamp", "$CardSwipeTimestamp" }
        }
    ),
    new BsonDocument("$skip", 0),
    new BsonDocument("$limit", 3),
};

collectionName.Aggregate<BsonDocument>(pipeline).ToList();

Assuming my collection has millions of entries with thousands of student ids, how do I return a paged list? I don't want to fetch all records and then page through them with C# LINQ. Can I pass page parameters to the pipeline above so that I get, say, 20 records at a time and move to the next 20 with an offset from the first record?

EDIT

After applying skip and limit as above, I am getting the data paged, but it's not consistent. If I pass in skip as 0 and limit=1 it gets me 3 records, but when I page to the next page, I sometimes get a record I already got on the previous page.

CodePudding user response:

After $sort you need to add two stages: $skip and $limit. An example is below (you will of course need to parameterize the values).

new BsonDocument("$skip", 0),
new BsonDocument("$limit", 20),

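This enables server-side paging. $skip and $limit require a sort; otherwise your results aren't deterministic. Note that $group does not preserve the order of its input, so the sort that matters is the one applied to the documents that actually reach $skip.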
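As a minimal sketch of that idea, the stages below could replace the trailing $skip and $limit in the question's pipeline. The class and method names (StudentPaging, SkipFor, PagingStages) and the 1-based page number are assumptions made for illustration, not part of the original code. Sorting on studentid is also an assumption; it works here because each group produces exactly one document per student within a single department.

```csharp
using MongoDB.Bson;

public static class StudentPaging
{
    // Hypothetical helper: converts a 1-based page number into a $skip value.
    public static int SkipFor(int page, int pageSize) => (page - 1) * pageSize;

    // Stages meant to follow the $group and $project stages.
    public static BsonDocument[] PagingStages(int page, int pageSize) => new[]
    {
        // $group does not guarantee output order, so sort again on a key
        // that is unique per grouped document before skipping.
        new BsonDocument("$sort", new BsonDocument("studentid", 1)),
        new BsonDocument("$skip", SkipFor(page, pageSize)),
        new BsonDocument("$limit", pageSize),
    };
}
```

With this in place, the second page of 20 students would be requested by appending StudentPaging.PagingStages(2, 20) to the grouping stages. Because the sort key is unique per output document, the same student should not appear on two different pages as long as the underlying data does not change between requests.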

This approach works, but it is not fully optimal. $skip walks the sorted set sequentially from the first match, so its cost grows linearly with the offset. To avoid that, remember the last record of the previous page and start the next page from there; with a suitable index the server can then seek directly to that position instead of scanning, so the cost no longer depends on how deep the page is. This is outside the scope of your question, but here's a resource and another.
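A rough sketch of that "remember the last record" approach, under some assumptions: pages are ordered by studentid, the caller keeps the studentid of the final row of the previous page, and an index covering dept and studentid exists. The names StudentSeekPaging, MatchStage, PageStages and lastStudentId are hypothetical and chosen only for this example.

```csharp
using MongoDB.Bson;

public static class StudentSeekPaging
{
    // Builds the leading $match stage. lastStudentId is the studentid of the
    // final row on the previous page, or null when asking for the first page.
    public static BsonDocument MatchStage(string[] sids, string deptName, string lastStudentId)
    {
        var idFilter = new BsonDocument("$in", new BsonArray(sids));
        if (lastStudentId != null)
        {
            // Operators on the same field are combined with AND, so only
            // requested ids that sort after the last one seen are kept.
            idFilter.Add("$gt", lastStudentId);
        }
        return new BsonDocument("$match", new BsonDocument
        {
            { "studentid", idFilter },
            { "dept", deptName }
        });
    }

    // Stages meant to follow $group and $project; note there is no $skip.
    public static BsonDocument[] PageStages(int pageSize) => new[]
    {
        new BsonDocument("$sort", new BsonDocument("studentid", 1)),
        new BsonDocument("$limit", pageSize),
    };
}
```

The trade-off is that this style of paging only moves forward from a known position; jumping straight to an arbitrary page number still needs $skip.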
