Saving via Entity Framework takes a lot of resources

Time:07-20

I'm using a SQL Server database connected to my ASP.NET MVC application via Entity Framework. The application is deployed on a virtual machine managed by a cluster. Currently max memory is set to 32 GB; I could raise it to 64 GB. One of the features available to users is loading CSV files into the database. Each file covers one day and consists of 80-90k records with 200 columns. For this I used the following setup:

public ActionResult AddRecordsFromCSVFile2(FileInput fileInputs)
{
    BWSERWISEntities bWDiagnosticEntities2 = new BWSERWISEntities();
    bWDiagnosticEntities2.Configuration.AutoDetectChangesEnabled = true;
    bWDiagnosticEntities2.Configuration.ValidateOnSaveEnabled = false;

    var Files = fileInputs.File;
    if (ModelValidate())
    {
        foreach (var fileInput in Files)
        {
            List<VOLTER_LOGS> LogsToAdd = new List<VOLTER_LOGS>();
            using (MemoryStream memoryStream = new MemoryStream())
            {
                fileInput.InputStream.CopyTo(memoryStream);
                memoryStream.Seek(0, SeekOrigin.Begin);
                using (var streamReader = new StreamReader(memoryStream))
                {
                    var csvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
                    {
                        Delimiter = ";",
                        HasHeaderRecord = false
                    };
                    using (var csvReader = new CsvReader(streamReader, csvConfig))
                    {
                        var records = csvReader.GetRecords<ReadAllRecords>().ToList();
                        var firstRecord = records.Where(x => x.Parametr4 != "0" && x.Parametr5 != "0" && x.Parametr6 != "0").FirstOrDefault();
                        var StartDateToCheck = Convert.ToDateTime(firstRecord.Parametr4 + "-" + firstRecord.Parametr5 + "-" + firstRecord.Parametr6 + " " + firstRecord.Parametr3 + ":" + firstRecord.Parametr2 + ":" + firstRecord.Parametr1, new CultureInfo("en-GB", true));
                        var lastRecord = records.Where(x => x.Parametr4 != "0" && x.Parametr5 != "0" && x.Parametr6 != "0").LastOrDefault();
                        var EndDateToCheck = Convert.ToDateTime(lastRecord.Parametr4 + "-" + lastRecord.Parametr5 + "-" + lastRecord.Parametr6 + " " + lastRecord.Parametr3 + ":" + lastRecord.Parametr2 + ":" + lastRecord.Parametr1, new CultureInfo("en-GB", true));
                        int matchingMachineID = int.Parse(firstRecord.Parametr7);
                        var matchingElements = bWDiagnosticEntities2.VOLTER_LOGS.Where(x => (x.Parametr7 == matchingMachineID) && (x.Parametr1 >= StartDateToCheck && x.Parametr1 <= EndDateToCheck)).ToList();
                        bWDiagnosticEntities2.VOLTER_LOGS.RemoveRange(matchingElements);
                        bWDiagnosticEntities2.SaveChanges();
                        foreach (var record in records)
                        {
                            if (record.Parametr4 != "0" && record.Parametr5 != "0" && record.Parametr6 != "0")
                            {
                                bWDiagnosticEntities2.Configuration.AutoDetectChangesEnabled = false;
                                string date = record.Parametr4 + "-" + record.Parametr5 + "-" + record.Parametr6 + " " + record.Parametr3 + ":" + record.Parametr2 + ":" + record.Parametr1;
                                DateTime recordDate = DateTime.Parse(date, new CultureInfo("en-GB", true));
                                // DateTime recordDate = DateTime.ParseExact(date, "dd-MM-yyyy HH:mm:ss", new CultureInfo("en-GB"));
                                VOLTER_LOGS vOLTER_LOGS = new VOLTER_LOGS();
                                vOLTER_LOGS.Parametr1 = recordDate;
                                vOLTER_LOGS.Parametr2 = 0;
                                vOLTER_LOGS.Parametr3 = 0;
                                vOLTER_LOGS.Parametr4 = 0;
                                vOLTER_LOGS.Parametr5 = 0;
                                vOLTER_LOGS.Parametr6 = 0;
                                vOLTER_LOGS.Parametr7 = int.Parse(record.Parametr7);

                                // ... Parametr8 through Parametr199 assigned the same way ...

                                vOLTER_LOGS.Parametr200 = int.Parse(record.Parametr200);

                                LogsToAdd.Add(vOLTER_LOGS);
                            }
                        }
                    }
                }
                bWDiagnosticEntities2.VOLTER_LOGS.AddRange(LogsToAdd);
                bWDiagnosticEntities2.SaveChanges();
            }
        }
    }
    return View();
}

This implementation takes 5-8 minutes and uses about 45-55% of the VM's memory, but during the save process other simple requests (e.g. getting the first and last date) sometimes fail with this error:

The request limited has expired or server is not responding.

The server has plenty of free memory, but the table already holds about 7 million records and is still growing. Eventually, files probably won't save at all while other users' requests are running. Now I'm wondering whether it is a good idea to use Entity Framework for this data volume. The main goal of the application is to detect errors in the data over a period of time. Does anyone have experience with processing this amount of data via Entity Framework? Is there a way to split resources between tasks so that every request gets a response?

CodePudding user response:

EF is an ORM, not an import tool. An ORM's job is to load graphs of related entities and give the impression of working with objects instead of tables and SQL. There are no entities or graphs in import jobs though; there are files, fields, tables and mappings.

The fastest way to import data into SQL Server tables in .NET is to use the SqlBulkCopy class. This uses the same mechanism as the bcp tool or the BULK INSERT command to import data as a stream, using minimal logging. The input can be a DataTable or an IDataReader instance.

Libraries like CsvHelper or ExcelDataReader provide IDataReader-derived classes out of the box. In other cases, FastMember's ObjectReader can be used to create an IDataReader wrapper over any IEnumerable.
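For example, a list of already-parsed objects could be streamed to SqlBulkCopy without ever building a DataTable. This is only a sketch under assumptions: `LogRecord` is a placeholder class, the `VOLTER_LOGS` table and column names are taken from the question, and the FastMember NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading.Tasks;
using FastMember;

public class LogRecord // placeholder type standing in for the parsed CSV row
{
    public DateTime Parametr1 { get; set; }
    public int Parametr7 { get; set; }
}

public static class BulkImporter
{
    public static async Task BulkInsertAsync(IEnumerable<LogRecord> records,
                                             string connectionString)
    {
        // ObjectReader exposes the IEnumerable as a forward-only IDataReader,
        // so SqlBulkCopy can stream rows without materializing a DataTable.
        using (var reader = ObjectReader.Create(records,
                   nameof(LogRecord.Parametr1), nameof(LogRecord.Parametr7)))
        using (var bcp = new SqlBulkCopy(connectionString))
        {
            bcp.DestinationTableName = "VOLTER_LOGS";
            // Explicit mappings, since the destination table has more columns
            // than this illustrative class.
            bcp.ColumnMappings.Add("Parametr1", "Parametr1");
            bcp.ColumnMappings.Add("Parametr7", "Parametr7");
            await bcp.WriteToServerAsync(reader);
        }
    }
}
```

Because the enumerable is only iterated as SqlBulkCopy consumes it, memory use stays flat regardless of file size.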

Loading a CSV into a table with CsvHelper and SqlBulkCopy could be as simple as this, provided the file and table columns match:

public async Task SimpleImport(string path, string table,
                               string connectionString)
{
    using var reader = new StreamReader(path);
    var csvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
    {
        Delimiter = ";"
    };
    using var csv = new CsvReader(reader, csvConfig);

    using var dr = new CsvDataReader(csv);
    using var bcp = new SqlBulkCopy(connectionString);
    bcp.DestinationTableName = table;

    await bcp.WriteToServerAsync(dr);
}

If there are no headers, the field names will have to be provided through CsvHelper's configuration, e.g. through GetDynamicPropertyName or a class map:

string[] fields = new[] {.....};
csvConfig.GetDynamicPropertyName = args =>
{
    if (args.FieldIndex < 10)
    {
        return fields[args.FieldIndex];
    }
    else
    {
        return $"Value_{args.FieldIndex}";
    }
};

Using a ClassMap allows specifying type conversions, including complex ones:

public class Foo
{
    ...
    public int Year { get; set; }
    public int Month { get; set; }
    public int Day { get; set; }
    public DateTime Date { get; set; }
}

public class FooMap : ClassMap<Foo>
{
    public FooMap()
    {
        AutoMap(CultureInfo.InvariantCulture);

        Map(m => m.Date).Convert(row => DateFromParts(row.GetField(5), row.GetField(6), row.GetField(7)));
    }

    DateTime DateFromParts(string year, string month, string day)
    {
        return DateTime.Parse($"{year}-{month}-{day}");
    }
}
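To tie the pieces together, the map has to be registered before reading. A minimal sketch, assuming the `Foo`/`FooMap` classes above and a CsvHelper version where registration happens on `Context` (older versions use `csv.Configuration.RegisterClassMap` instead); `path` is a placeholder for the CSV file location.

```csharp
var csvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = ";",
    HasHeaderRecord = false
};

using (var reader = new StreamReader(path))
using (var csv = new CsvReader(reader, csvConfig))
{
    // Without this, CsvHelper would fall back to auto-mapping and the
    // Date property would not be composed from fields 5, 6 and 7.
    csv.Context.RegisterClassMap<FooMap>();

    // GetRecords streams lazily; the parsed objects could then be fed
    // into SqlBulkCopy (e.g. via FastMember's ObjectReader) in batches.
    foreach (var foo in csv.GetRecords<Foo>())
    {
        // process or buffer foo
    }
}
```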