Home > other >  c# find start date and end date based on a list of dates?
c# find start date and end date based on a list of dates?

Time:09-28

I have a database table with over 200K records and a column containing a Date (NOT NULL). I am struggling to do a GroupBy Date since the database is massive the query takes soooo long to process (like 1 minute or so).

My Theory:

  • Get the list of all records from that table
  • From that list find the end date and the start date (basically the oldest date and the newest)
  • Then taking say like 20 dates to do the GroupBy on so the query will be done in a shorter set of records..

Here is my Model that I have to get the list:

registration.Select(c => new RegistrationViewModel()
{
    DateReference = c.DateReference,
    MinuteWorked = c.MinuteWorked,             
});
  • The DateReferenceis the database column that I have to work with...

I am not pretty sure how to cycle through my list getting the dates start and end without taking too long.

Any idea on how to do that?

EDIT:

var registrationList = await context.Registration 
  .Where(c => c.Status == StatusRegistration.Active) // getting all active registrations
  .ToRegistrationViewModel() // this is simply a select method
  .OrderBy(d => d.DateReference.Date) // this takes long
  .ToListAsync();

The GroupBy:

 var grpList = registrationList.GroupBy(x => x.DateReference.Date).ToList();

var tempList = new List<List<RegistrationViewModel>>();
foreach (var item in grpList)
{
   var selList = item.Select(c => new RegistrationViewModel()
   {
    RegistrationId = c.RegistrationId,
    DateReference = c.DateReference, 
    MinuteWorked = c.MinuteWorked,
   }).ToList();

   tempList.Add(selList);
}

This is my SQL table: registration table T-SQL

This is the ToRegistrationViewModel() function:

 return registration.Select(c => new RegistrationViewModel()
 {
   RegistrationId = c.RegistrationId,
   PeopleId = c.PeopleId,
   DateReference = c.DateReference,
   DateChange = c.DateChange,
   UserRef = c.UserRef,
   CommissionId = c.CommissionId,
   ActivityId = c.ActivityId,
   MinuteWorked = c.MinuteWorked,
   Activity = new ActivityViewModel()
     {
       Code = c.Activity.Code,
       Description = c.Activity.Description,
     },
     Commission = new CommissionViewModel()
     {
       Code = c.Commission.Code,
       Description = c.Commission.Description
     },
     People = new PeopleViewModel()
     {
       UserId = c.People.UserId,
       Code = c.People.Code,
       Name = c.People.Name,
       Surname = c.People.Surname,
       Active = c.People.Active
     }
});

CodePudding user response:

There are multiple potential problems here

Lack of indexes

Your query uses the Status and DateReference, and neither looks to have an index. If there are only a few active statuses a index on that column might suffice, otherwise you need a index on the date to speedup sorting. You might also consider a composite index that includes both columns. An appropriate index should solve the sorting issue.

Materializing the query

ToListAsync will trigger the execution of the sql query, making every subsequent operation run on the client. I would also be highly suspicious of ToRegistrationViewModel, I would try changing this to an anonymous type, and only convert to an actual type after the query has been materialized. Running things like sorting and grouping on the client is generally considered a bad idea, but you need to consider where the actual bottleneck is, optimizing the grouping will not help if the transfer of data takes most time.

Transferring data

Fetching a large number of rows will be slow, no matter what. The goal is usually to do as much filtering in the database as possible so you do not need to fetch so many rows. If you have to fetch a large amount of records you might use Pagination, i.e. combine OrderBy with Skip and Take to fetch smaller chunks of data. This will not save time overall, but can allow for things like progress and showing data continuously.

  • Related