Remove Duplicates from Datatable with LINQ without keeping a duplicated entry at all

Time:07-06

I have a DataTable with several columns, and I want to remove all duplicates from it like this:

Dt1 = Dt1.AsEnumerable()
         .GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
         .Select(g => g.First())
         .CopyToDataTable();

However, the code above leaves one entry per group (the first one found) in the DataTable, because of the Select(g => g.First()) at the end of the LINQ query.

Is there a way to remove all duplicates and keep none?

Edit: An example of what the code does now and what it should do.

A DataTable with entries like this:

Name   Filesize  Filename
One    50        Fileone
Two    50        Fileone
Three  50        Filetwo
Four   50        Filethree

The LINQ query above removes line 2, because its Filename and Filesize are the same as those of line 1. However, line 1 stays, because the query keeps the first entry of each group of duplicates.

I want both line 1 and line 2 removed from the DataTable.

CodePudding user response:

Dt1 = Dt1.AsEnumerable()
         .GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
         .Where(g => g.Count() == 1)
         .Select(g => g.First())
         .CopyToDataTable();

That will discard any groups with more than one item, then get the first (and only) item from the rest.
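
To make the behaviour concrete, here is a minimal, self-contained sketch of the same query wrapped in a helper. The class and method names (DuplicateRemover, RemoveAllDuplicates, BuildSample) are hypothetical, and the sample column names "name", "filesizeinkb" and "filename1" are assumptions based on the question's code, with the file size stored as a string as that code implies. The sketch also guards against an empty result, since CopyToDataTable throws an InvalidOperationException when the sequence contains no rows.

```csharp
using System;
using System.Data;
using System.Linq;

public static class DuplicateRemover
{
    // Returns a new table holding only the rows whose
    // (filename1, filesizeinkb) pair occurs exactly once in the source.
    public static DataTable RemoveAllDuplicates(DataTable source)
    {
        var uniqueRows = source.AsEnumerable()
            .GroupBy(r => new
            {
                filename = r.Field<string>("filename1"),
                filesize = r.Field<string>("filesizeinkb")
            })
            .Where(g => g.Count() == 1)   // drop every group that has duplicates
            .Select(g => g.First())       // the single remaining row of each group
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to an
        // empty table with the same schema when nothing is left.
        return uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : source.Clone();
    }

    // Builds the sample data from the question (column names are assumed).
    public static DataTable BuildSample()
    {
        var dt = new DataTable();
        dt.Columns.Add("name", typeof(string));
        dt.Columns.Add("filesizeinkb", typeof(string));
        dt.Columns.Add("filename1", typeof(string));
        dt.Rows.Add("One", "50", "Fileone");
        dt.Rows.Add("Two", "50", "Fileone");
        dt.Rows.Add("Three", "50", "Filetwo");
        dt.Rows.Add("Four", "50", "Filethree");
        return dt;
    }
}
```

Calling DuplicateRemover.RemoveAllDuplicates(DuplicateRemover.BuildSample()) should return a table with only the rows "Three" and "Four", since "One" and "Two" share the same file name and size.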

CodePudding user response:

Note: This was blindly typed here, so there might be some typos in the code.

The idea is to get the number of rows in your DataTable, go through each of them, and do what you already did.

int numOfItems = Dt1.Rows.Count;

for (int i = 0; i < numOfItems; i++)
{
   // Each pass keeps the first row of every group, so one copy of each duplicate still remains.
   Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();
}
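
If a row-by-row loop is preferred over a single LINQ query, one possible sketch is to count how often each key occurs and then delete every row whose key occurs more than once, modifying the table in place. The names InPlaceDuplicateRemover and RemoveAllDuplicatesInPlace are hypothetical, the column names are taken from the question's code, and the value-tuple key assumes C# 7 or later.

```csharp
using System.Collections.Generic;
using System.Data;

public static class InPlaceDuplicateRemover
{
    // Removes, in place, every row whose (filename1, filesizeinkb) pair
    // appears more than once. No copy of a duplicated entry is kept.
    public static void RemoveAllDuplicatesInPlace(DataTable table)
    {
        // First pass: count how many rows share each key.
        var counts = new Dictionary<(string, string), int>();
        foreach (DataRow row in table.Rows)
        {
            var key = (row.Field<string>("filename1"), row.Field<string>("filesizeinkb"));
            counts[key] = counts.TryGetValue(key, out var n) ? n + 1 : 1;
        }

        // Second pass: walk backwards so that removing a row does not
        // shift the indexes of rows that have not been visited yet.
        for (int i = table.Rows.Count - 1; i >= 0; i--)
        {
            DataRow row = table.Rows[i];
            var key = (row.Field<string>("filename1"), row.Field<string>("filesizeinkb"));
            if (counts[key] > 1)
            {
                table.Rows.RemoveAt(i);
            }
        }
    }
}
```

Unlike the CopyToDataTable approach, this variant keeps the original DataTable instance and also works when no rows are left afterwards.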