I am writing a few lines of code to read an XML file, get a collection of elements, add them to a list then check for duplicates. Should be simple but I can't get it working.
Here is an extract of the XML that I read, shortened for simplicity. Note that the first and third entries are identical; these are the ones I want to identify:
<pmEntry>
  <dmRef>
    <dmRefIdent>
      <dmCode modelCode="CRAFT123" systemCode="B" anotherCode="63" infoCode="010" />
      <issueInfo issueNumber="001" inWork="00" />
    </dmRefIdent>
    <dmRefAddressItems>
      <dmTitle>
        <techName>My data</techName>
        <infoName>General data</infoName>
      </dmTitle>
    </dmRefAddressItems>
  </dmRef>
  <dmRef>
    <dmRefIdent>
      <dmCode modelCode="CRAFT789" systemCode="B" anotherCode="50" infoCode="500" />
      <issueInfo issueNumber="001" inWork="00" />
    </dmRefIdent>
    <dmRefAddressItems>
      <dmTitle>
        <techName>Some other data</techName>
        <infoName>Technical data</infoName>
      </dmTitle>
    </dmRefAddressItems>
  </dmRef>
  <dmRef>
    <dmRefIdent>
      <dmCode modelCode="CRAFT123" systemCode="B" anotherCode="63" infoCode="010" />
      <issueInfo issueNumber="001" inWork="00" />
    </dmRefIdent>
    <dmRefAddressItems>
      <dmTitle>
        <techName>My data</techName>
        <infoName>General data</infoName>
      </dmTitle>
    </dmRefAddressItems>
  </dmRef>
</pmEntry>
Here is the method, which takes the path of the XML file:
private void CheckPMforDuplicates(string path)
{
    XDocument doc = XDocument.Load(path);
    List<XElement> DMList = new List<XElement>();
    var DMs = doc.Descendants("dmRefIdent");
    if (DMs != null)
    {
        foreach (var dm in DMs)
        {
            DMList.Add(dm);
        }
        var duplicates = DMList
            .GroupBy(i => i.Element("dmCode"))
            .Where(g => g.Elements("dmCode").Count() > 1)
            .Select(g => g.Key);
        if (duplicates != null)
        {
            string duplicateDMstring = "";
            foreach (var dup in duplicates)
            {
                duplicateDMstring = duplicateDMstring + ",\r\n " + dup;
            }
            if (duplicateDMstring == "")
            {
                MessageBox.Show("No duplicates");
            }
            else
            {
                MessageBox.Show("Duplicates are " + duplicateDMstring);
            }
        }
    }
}
If I change the LINQ query to look for a count equal to 1 (i.e. "== 1"), it presents me with a nice list of elements in the message box, as expected. But for some reason it will not find duplicates.
It's clearly a Linq problem, but I can't get it working to display the two duplicate entries.
CodePudding user response:
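It's not a LINQ problem, it's an equality problem. You group by Element("dmCode"), which is an XElement. That is a reference type, so by default GroupBy compares references, and every element ends up in its own group of one. To compare the contents of an element, group by i.Element("dmCode").ToString(), which serializes the element together with its attributes. (i.Element("dmCode").Value only helps when the element has text content; here dmCode carries its data in attributes, so Value is an empty string for every entry and they would all fall into the same group.) So use ToString() instead: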
var duplicates = DMList
    .GroupBy(i => i.Element("dmCode").ToString())
    .Where(g => g.Elements("dmCode").Count() > 1)
    .Select(g => g.Key);
Alternatively, pass your own IEqualityComparer as the second argument to GroupBy.
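As a rough sketch of the comparer route, something like the following should work. The names DeepXElementComparer, DuplicateFinder and FindDuplicateDmCodes are made up for illustration, and the sketch assumes that "duplicate" means two dmCode elements are deeply equal (same name, attributes and content), which it delegates to XNode.DeepEquals. The sample XML is a trimmed-down stand-in for the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: treats two XElements as equal when their names,
// attributes and content match, not when they are the same object.
public sealed class DeepXElementComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y) => XNode.DeepEquals(x, y);

    // Equal elements must produce equal hash codes, so reuse the deep hash
    // provided by the built-in XNodeEqualityComparer.
    public int GetHashCode(XElement e) => XNode.EqualityComparer.GetHashCode(e);
}

public static class DuplicateFinder
{
    // Returns one representative dmCode element for each group that occurs more than once.
    public static List<XElement> FindDuplicateDmCodes(XDocument doc)
    {
        return doc.Descendants("dmRefIdent")
            .Select(i => i.Element("dmCode"))
            .Where(code => code != null)               // skip entries without a dmCode
            .GroupBy(code => code, new DeepXElementComparer())
            .Where(g => g.Count() > 1)
            .Select(g => g.Key)
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Trimmed-down sample: the first and third dmCode elements are identical.
        var doc = XDocument.Parse(
            "<pmEntry>" +
            "<dmRef><dmRefIdent><dmCode modelCode='CRAFT123' infoCode='010' /></dmRefIdent></dmRef>" +
            "<dmRef><dmRefIdent><dmCode modelCode='CRAFT789' infoCode='500' /></dmRefIdent></dmRef>" +
            "<dmRef><dmRefIdent><dmCode modelCode='CRAFT123' infoCode='010' /></dmRefIdent></dmRef>" +
            "</pmEntry>");

        // Prints the CRAFT123 dmCode element once.
        foreach (var dup in DuplicateFinder.FindDuplicateDmCodes(doc))
        {
            Console.WriteLine(dup);
        }
    }
}
```

Compared with grouping by ToString(), this keeps the group key as an XElement, so you can still read individual attributes from each duplicate (for example dup.Attribute("modelCode")) when building the message.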