LINQ Except not returning string values correctly

Time:06-18

I've been trying to create a small program to compare two collections of strings and to output any items that are different or missing between collection1 and collection2.

As far as I have been able to determine LINQ's .Except method should provide my desired outcome but it seems to fall short.

The files which I'm trying to compare are the ACL files produced by using ICACLS to save all the permissions on a given directory and its subdirectories by running the following command from cmd:

icacls c:\TestHash /save c:\aclFile.txt /t

I run this command twice on a directory to produce two icacls files. I want to check that they match and, if they don't, output where they differ.

Here is some sample code for comparing these items:

//Load in user permissions files
var list1 = File.ReadAllLines(dbFilePath1, Encoding.Unicode);
var list2 = File.ReadAllLines(dbFilePath2, Encoding.Unicode);

Once the items are loaded, I run .Except in both directions:

//Identify all differences between the two collections and output to new collection
var list3 = list1.Except(list2);
var list4 = list2.Except(list1);

However, in the example below, file 1 contains 10 lines and file 2 is identical except that I removed one permissions record, yet .Except doesn't report the missing item.

First file:

TestHash
D:AI(A;OICIID;FA;;;BC)(A;OICIID;FD;;;SY)(A;OICIID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)(A;OICIIOID;SDGXGWGR;;;AU)
TestHash\testFile1.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)
TestHash\testFile2.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)
TestHash\testFile3.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)
TestHash\testFile4.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)

Second file:

TestHash
D:AI(A;OICIID;FA;;;BC)(A;OICIID;FD;;;SY)(A;OICIID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)(A;OICIIOID;SDGXGWGR;;;AU)
TestHash\testFile1.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)
TestHash\testFile2.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)
TestHash\testFile3.csv
TestHash\testFile4.csv
D:AI(A;ID;FA;;;BD)(A;ID;FR;;;SY)(A;ID;0x1200a9;;;BU)(A;ID;0x1307by;;;AU)

As described above, the second file is missing one permissions record, but .Except doesn't recognise this. The collections are processed as enumerations of strings, and as far as I understand the default equality comparer should be able to detect this difference. I'm aware you can override this with a custom comparer, but I'm not sure what the implementation would be.

An additional note: this only seems to affect the permissions strings themselves. .Except does detect a missing string when it's one of the folder/file names. So I suspect it is getting confused because the collection contains many identical permissions strings, and it treats the item as matched even though the record belonging to a specific file is missing.

I expect this will need some sort of custom override but I'm not sure what this implementation would be.

Any thoughts would be much appreciated, thanks for taking the time to read this through.

CodePudding user response:

I'm not sure if my assumptions are correct, but it looks like the format of the file generated is two lines per directory entry and that the first line is the full path to the file and hence must be unique.

If that is correct, then the second list cannot contain the line TestHash\testFile3.csv on its own, without a permissions line after it.

If so then you can group the directory entry with the permission and then use except to check for differences.

To do this, I first add a line number to each line, then group every two lines together, and then create an anonymous object from the first and second entries in each group,

e.g.

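// Pair each path line with the permissions line that follows it
// (assumes an even number of lines; groupedList2 is built the same way from list2)
var groupedList1 = list1
    .Select((val, index) => new { val, index })
    .GroupBy(g => g.index / 2)
    .Select(r => r.ToArray())
    .Select(r => new { DirectoryEntry = r[0].val, Permission = r[1].val });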

Because the DirectoryEntry name is unique, each entry in the grouped list is unique. Anonymous types compare by their property values, so Except will now operate as you want.
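
To see the difference in behaviour, here is a minimal, self-contained sketch. The file names and the ACL-X / ACL-Y strings are made up for illustration, Pair is a hypothetical helper name, and it uses value tuples instead of anonymous types so that the pairing can live in a helper. It assumes well-formed input with two lines per entry. Enumerable.Except is a set difference, so the flat comparison reports nothing when a changed permissions string still occurs elsewhere in the other file, while the paired comparison pinpoints the entry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: turns path/permission line pairs into (Path, Acl) tuples.
static IEnumerable<(string Path, string Acl)> Pair(IEnumerable<string> lines) =>
    lines
        .Select((val, index) => (val, index))
        .GroupBy(x => x.index / 2)
        .Select(g => g.Select(x => x.val).ToArray())
        // A trailing path without a permissions line gets an empty permission.
        .Select(r => (Path: r[0], Acl: r.Length > 1 ? r[1] : ""));

// Made-up sample data: b.csv changes from ACL-Y to ACL-X between the two files.
string[] list1 =
{
    @"TestHash\a.csv", "ACL-X",
    @"TestHash\b.csv", "ACL-Y",
    @"TestHash\c.csv", "ACL-Y",
};
string[] list2 =
{
    @"TestHash\a.csv", "ACL-X",
    @"TestHash\b.csv", "ACL-X",
    @"TestHash\c.csv", "ACL-Y",
};

// Flat comparison: both ACL strings still occur somewhere in each file,
// so the set difference is empty in both directions.
var flat1 = list1.Except(list2).ToList();
var flat2 = list2.Except(list1).ToList();
Console.WriteLine($"flat: {flat1.Count} / {flat2.Count}");   // flat: 0 / 0

// Paired comparison: tuples compare by value, so the changed entry shows up.
var onlyIn1 = Pair(list1).Except(Pair(list2)).ToList();
var onlyIn2 = Pair(list2).Except(Pair(list1)).ToList();
foreach (var e in onlyIn1) Console.WriteLine($"file 1 only: {e.Path} -> {e.Acl}");
foreach (var e in onlyIn2) Console.WriteLine($"file 2 only: {e.Path} -> {e.Acl}");
```

Note that if a single line really is missing from one file, every pair after it shifts by one line, so this approach only works when both files keep the two-lines-per-entry layout.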

Or you could combine groupedList1 and groupedList2 like this:

var allEntries = groupedList1.Select(a => a.DirectoryEntry)
    .Union(groupedList2.Select(a => a.DirectoryEntry));

var combined = (from r in allEntries
                select new
                {
                    DirectoryEntry = r,
                    OldPermission = groupedList1.SingleOrDefault(a => a.DirectoryEntry == r)?.Permission,
                    NewPermission = groupedList2.SingleOrDefault(a => a.DirectoryEntry == r)?.Permission
                })
    .Where(a => a.OldPermission != a.NewPermission);
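
The SingleOrDefault calls rescan both lists for every entry. If the files are large, one possible variation (a sketch only, assuming the paths are unique) is to build path-to-permission dictionaries first and look each path up. The dictionary contents and the names oldAcl, newAcl and changes below are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical path -> permission maps, as they might look after pairing up the lines.
var oldAcl = new Dictionary<string, string>
{
    [@"TestHash\a.csv"] = "ACL-X",
    [@"TestHash\b.csv"] = "ACL-Y",
    [@"TestHash\c.csv"] = "ACL-Y",
};
var newAcl = new Dictionary<string, string>
{
    [@"TestHash\a.csv"] = "ACL-X",
    [@"TestHash\b.csv"] = "ACL-X",   // permission changed
    [@"TestHash\d.csv"] = "ACL-Y",   // new entry; c.csv no longer present
};

// Union of all paths, with the old and new permission (null when absent),
// keeping only the entries where the two differ.
var changes = oldAcl.Keys.Union(newAcl.Keys)
    .Select(path => (
        Path: path,
        Old: oldAcl.TryGetValue(path, out var o) ? o : null,
        New: newAcl.TryGetValue(path, out var n) ? n : null))
    .Where(c => c.Old != c.New)
    .OrderBy(c => c.Path, StringComparer.Ordinal)
    .ToList();

foreach (var c in changes)
    Console.WriteLine($"{c.Path}: {c.Old ?? "(missing)"} -> {c.New ?? "(missing)"}");
// TestHash\b.csv: ACL-Y -> ACL-X
// TestHash\c.csv: ACL-Y -> (missing)
// TestHash\d.csv: (missing) -> ACL-Y
```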