I am trying to go through an IEnumerable and remove consecutive duplicates while preserving order. That is, I want a method that takes an IEnumerable, returns an IEnumerable, and collapses each run of repeated items into a single item, as in this example: AAABBAACCC becomes ABAC. To accomplish this, I need a way to query the IEnumerable and compare each item to the next one: either add the item to a new IEnumerable when the two differ and return that, or remove the first item when the two are the same. I came up with something like this:
public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> iterable)
{
    var query = from c in iterable
                where c != c.Next
                select c;
    return query;
}
The problem is that c.Next does not exist. Is there a way to do this, or is it not possible with LINQ?
CodePudding user response:
One way to "get the next thing" in LINQ is Zip. Specifically, you Zip the enumerable with itself, but with the first element skipped.
Your query would translate to:
public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> iterable)
    => iterable.Zip(iterable.Skip(1)).Where(x => !x.First.Equals(x.Second)).Select(x => x.First);
However, your logic here is flawed. The output for "AAABBAACCC" is "ABA", because the last "group" has no "next" thing that is different. If you limit T to reference types, you could append a null to the end of iterable.Skip(1):
public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> iterable)
    where T: class
    => iterable.Zip(iterable.Skip(1).Append(null))
        .Where(x => !x.First.Equals(x.Second)).Select(x => x.First);
This works because nothing Equals null, which guarantees that the last "group" is always kept (provided the sequence itself contains no null items, since calling Equals on one would throw).
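To make the difference between the two Zip versions concrete, here is a small sketch that runs both on the example from the question. The names ZipDemo, WithoutSentinel and WithSentinel are made up for this illustration, and it assumes a runtime that has the single-argument, tuple-returning Zip overload (.NET Core 3.0 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo
{
    // First version: the last element is never paired with anything,
    // so the final run of equal items is dropped.
    public static IEnumerable<T> WithoutSentinel<T>(IEnumerable<T> iterable)
        => iterable.Zip(iterable.Skip(1))
                   .Where(x => !x.First.Equals(x.Second))
                   .Select(x => x.First);

    // Second version: the trailing null pairs with the last element,
    // and a non-null item never equals null, so the final run is kept.
    public static IEnumerable<T> WithSentinel<T>(IEnumerable<T> iterable)
        where T : class
        => iterable.Zip(iterable.Skip(1).Append(null))
                   .Where(x => !x.First.Equals(x.Second))
                   .Select(x => x.First);

    public static void Main()
    {
        var letters = new[] { "A", "A", "A", "B", "B", "A", "A", "C", "C", "C" };
        Console.WriteLine(string.Join("", WithoutSentinel(letters))); // prints: ABA
        Console.WriteLine(string.Join("", WithSentinel(letters)));    // prints: ABAC
    }
}
```

Strings are used instead of chars here because the second version is constrained to reference types.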
I would do this as a loop, checking the previous item.
public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> iterable) {
    if (!iterable.Any()) {
        yield break;
    }
    T lastSeen = iterable.First();
    yield return lastSeen;
    foreach (var t in iterable) {
        if (!lastSeen.Equals(t)) {
            yield return t;
        }
        lastSeen = t;
    }
}
Side note: you can't use != with an unconstrained generic parameter T, so I changed it to !Equals, but bear in mind that this won't work with nulls.
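If null items do matter, one possible way around the limitation in the side note is to compare through EqualityComparer<T>.Default, whose Equals(x, y) accepts null on either side. The sketch below only demonstrates the comparison itself; NullComparisonDemo and SameItem are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;

public static class NullComparisonDemo
{
    // Hypothetical helper: a null-tolerant replacement for a.Equals(b).
    public static bool SameItem<T>(T a, T b)
        => EqualityComparer<T>.Default.Equals(a, b);

    public static void Main()
    {
        string missing = null;

        Console.WriteLine(SameItem(missing, null)); // prints: True
        Console.WriteLine(SameItem(missing, "A"));  // prints: False
        Console.WriteLine(SameItem("A", "A"));      // prints: True

        try
        {
            // The instance call fails because there is no object to call it on.
            missing.Equals("A");
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("instance Equals threw"); // this line is printed
        }
    }
}
```

Swapping lastSeen.Equals(t) for a call like SameItem(lastSeen, t) in the loop above should then let it handle sequences that contain nulls.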
CodePudding user response:
using System;
using System.Linq;
using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        var items = new List<string>(){"A","A","A","B","B","A","A","C","C","C"};
        foreach(var item in UniqueInOrder(items))
        {
            Console.Write("{0} ", item);
        }
        Console.WriteLine();
        // Or
        foreach(var item in UniqueInOrderLinq(items))
        {
            Console.Write("{0} ", item);
        }
        Console.WriteLine();
    }

    public static IEnumerable<T> UniqueInOrder<T>( IEnumerable<T> input ) where T: IEquatable<T>
    {
        if( input is null) throw new ArgumentNullException(nameof(input));
        // An empty input yields an empty result instead of throwing in First().
        if( !input.Any() ) yield break;
        T prev = input.First();
        yield return prev;
        foreach( T item in input.Skip(1) )
        {
            // Emit an item only when it differs from the previously emitted one.
            if(!item.Equals(prev))
            {
                yield return item;
                prev = item;
            }
        }
    }

    public static IEnumerable<T> UniqueInOrderLinq<T>( IEnumerable<T> input ) where T: IEquatable<T>
    {
        if( input is null) throw new ArgumentNullException(nameof(input));
        // Check Count explicitly: LastOrDefault() on an empty list of value types returns default(T),
        // which would wrongly swallow a leading item equal to default (e.g. a leading 0).
        return input.Aggregate( new List<T>(), (acc, next) => {if(acc.Count == 0 || !EqualityComparer<T>.Default.Equals(acc[acc.Count - 1], next)) {acc.Add(next);} return acc; });
    }
}
Answer from @Fildor. Thanks mate, works perfectly!
CodePudding user response:
To improve on your existing answer, we can do this without querying the source multiple times:
public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> input, IEqualityComparer<T> comparer = null)
{
    if(input is null) throw new ArgumentNullException(nameof(input));
    comparer = comparer ?? EqualityComparer<T>.Default;
    var isFirst = true;
    T prev = default;
    foreach( T item in input)
    {
        if(isFirst)
        {
            yield return item;
            prev = item;
            isFirst = false;
        }
        else if(!comparer.Equals(item, prev))
        {
            yield return item;
            prev = item;
        }
    }
}
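As a usage sketch, the optional comparer parameter lets the caller decide what counts as "the same" item. The example below repeats a lightly condensed copy of the method so that it compiles on its own; the ComparerDemo class name and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    // Condensed copy of the single-pass method above (the two branches are merged).
    public static IEnumerable<T> UniqueInOrder<T>(IEnumerable<T> input, IEqualityComparer<T> comparer = null)
    {
        if (input is null) throw new ArgumentNullException(nameof(input));
        comparer = comparer ?? EqualityComparer<T>.Default;
        var isFirst = true;
        T prev = default;
        foreach (T item in input)
        {
            // The first item is always emitted; later ones only if they differ from prev.
            if (isFirst || !comparer.Equals(item, prev))
            {
                yield return item;
                prev = item;
                isFirst = false;
            }
        }
    }

    public static void Main()
    {
        var items = new[] { "a", "A", "b", "B", "a" };

        // Default comparer: "a" and "A" are different strings.
        Console.WriteLine(string.Join(" ", UniqueInOrder(items)));
        // prints: a A b B a

        // Case-insensitive comparer: "A" is treated as a repeat of "a".
        Console.WriteLine(string.Join(" ", UniqueInOrder(items, StringComparer.OrdinalIgnoreCase)));
        // prints: a b a
    }
}
```

Because the source is enumerated only once, this version should also be safe for sequences that cannot be replayed, such as lines read from a stream.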