I have a DataTable with several columns, and I want to remove all duplicates from it like this:
Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();
However, the above code leaves one entry (the first one found) in the DataTable, because of the Select(g => g.First()) at the end of the LINQ query.
Is there a way to remove all duplicates and keep none?
Edit: An example of what the code does now and what it should do.
A DataTable with entries like this:
Name | Filesize | Filename |
---|---|---|
One | 50 | Fileone |
Two | 50 | Fileone |
Three | 50 | Filetwo |
Four | 50 | Filethree |
The LINQ query above removes line 2 because its Filename and Filesize are the same as those of line 1. However, line 1 stays, because the query keeps the first entry of each group of duplicates.
I want both line 1 and line 2 removed from the DataTable.
CodePudding user response:
Dt1 = Dt1.AsEnumerable()
.GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
.Where(g => g.Count() == 1)
.Select(g => g.First())
.CopyToDataTable();
That will discard any groups with more than one item, then get the first (and only) item from the rest.
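To make this easier to try out, here is a minimal, self-contained sketch that applies the same query to the sample rows from the question. The class name DuplicateFilter and the method name KeepOnlyUnique are hypothetical, and the mapping of the sample table's Filesize and Filename headers to the "filesizeinkb" and "filename1" columns is an assumption. It also guards against one case the one-liner does not cover: CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, which happens if every row has a duplicate.

```csharp
using System;
using System.Data;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class DuplicateFilter
{
    // Returns a new table containing only the rows whose
    // (filename1, filesizeinkb) pair occurs exactly once.
    public static DataTable KeepOnlyUnique(DataTable source)
    {
        var uniqueRows = source.AsEnumerable()
            .GroupBy(r => new
            {
                filename = r.Field<string>("filename1"),
                filesize = r.Field<string>("filesizeinkb")
            })
            .Where(g => g.Count() == 1)   // drop every group that has duplicates
            .Select(g => g.First())       // the single remaining row of each group
            .ToList();

        // CopyToDataTable throws on an empty sequence, so fall back to
        // an empty table with the same schema in that case.
        return uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : source.Clone();
    }

    public static void Main()
    {
        // Sample data mirroring the table in the question (column names assumed).
        var dt = new DataTable();
        dt.Columns.Add("Name", typeof(string));
        dt.Columns.Add("filesizeinkb", typeof(string));
        dt.Columns.Add("filename1", typeof(string));
        dt.Rows.Add("One", "50", "Fileone");
        dt.Rows.Add("Two", "50", "Fileone");
        dt.Rows.Add("Three", "50", "Filetwo");
        dt.Rows.Add("Four", "50", "Filethree");

        DataTable result = KeepOnlyUnique(dt);
        foreach (DataRow row in result.Rows)
        {
            Console.WriteLine(row["Name"]);   // prints "Three", then "Four"
        }
    }
}
```

On .NET Framework, AsEnumerable and CopyToDataTable need a reference to System.Data.DataSetExtensions.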
CodePudding user response:
Note: This was typed here without testing, so there might be some typos in the code.
The idea is to get the number of rows in your DataTable, go through each of them, and do what you already did.
// Number of rows in the table before the loop starts.
int numOfItems = Dt1.Rows.Count;
for (int i = 0; i < numOfItems; i++)
{
    // Each pass groups by filename and file size and keeps the first row of every group.
    Dt1 = Dt1.AsEnumerable().GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") }).Select(g => g.First()).CopyToDataTable();
}