I have a data set (e.g. 1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7) and I want to group items of the same value, but only if at least 3 of them appear next to each other.
Is there a way? I've tried combinations of Count, GroupBy and Select in every way I know, but I can't find one that works.
Or if it can't be done with LINQ then maybe some other way?
CodePudding user response:
I don't think I'd strive for a 100% LINQ solution for this:
var r = new List<List<int>>() { new () { source.First() } };
foreach(var e in source.Skip(1)){
    if(e == r.Last().Last()) r.Last().Add(e);
    else r.Add(new(){ e });
}
return r.Where(l => l.Count > 2);
The .Last() calls can be replaced with [^1] if you like.
Output like:
[
[2,2,2],
[6,6,6]
]
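As a rough sketch of the [^1] variant mentioned above: the wrapper class RunGrouper, the method name GroupRuns and the minCount parameter are made up for illustration and are not part of the answer. It assumes C# 8 or later (for the ^1 index) and, unlike the snippet above, starts from an empty list so that an empty source does not throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunGrouper
{
    // Hypothetical helper: splits source into runs of equal adjacent values
    // and keeps only the runs with at least minCount elements.
    public static List<List<int>> GroupRuns(IReadOnlyList<int> source, int minCount = 3)
    {
        var r = new List<List<int>>();
        foreach (var e in source)
        {
            // r[^1] is the most recent run; r[^1][^1] is its last element.
            if (r.Count > 0 && r[^1][^1] == e) r[^1].Add(e);
            else r.Add(new List<int> { e });
        }
        return r.Where(l => l.Count >= minCount).ToList();
    }
}
```

Calling RunGrouper.GroupRuns(source) on the example data should give the same two groups as shown above.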
Aggregate can be pushed into doing the same thing; this is simply an accumulator (r), an iteration (foreach) and an op on the result (Where):
var result = source.Skip(1).Aggregate(
    new List<List<int>>() { new List<int> { source.First() } },
    (r,e) => {
        if(e == r.Last().Last()) r.Last().Add(e);
        else r.Add(new List<int>(){ e });
        return r;
    },
    r => r.Where(l => l.Count > 2)
);
..but would you want to be the one to explain it to the new dev?
Another LINQy way would be to establish a counter that is incremented by one each time the value in the source array changes compared to the previous element, then group by this integer and return only those groups with 3 or more items, but I don't like this so much because it's a bit "WTF":
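var source = new[]{1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7};
int ctr = 0;
var result = source.Select(
        // ctr is bumped whenever the value differs from the previous element
        (e,i) => new[]{ i==0 || e != source[i-1] ? ++ctr : ctr, e}
    )
    .GroupBy(
        arr => arr[0],  // key: the run number
        arr => arr[1]   // element: the value itself
    )
    .Where(g => g.Count() > 2);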
CodePudding user response:
If you're nostalgic and like stuff like the Obfuscated C code contest, you could solve it like this.
(No best practice claims included)
int[] n = {1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7};
var t = new int [n.Length][];
for (var i = 0; i < n.Length; i++)
    t[i] = new []{n[i], i == 0 ? 0 : n[i] == n[i - 1] ? t[i - 1][1] : t[i - 1][1] + 1};
var r = t.GroupBy(x => x[1], x => x[0])
    .Where(g => g.Count() > 2)
    .SelectMany(g => g);
Console.WriteLine(string.Join(", ", r));
In the end, Linq is likely not the best solution here. A simple for-loop with one to three additional loop variables to track the "group index" and the last value likely makes more sense, even if it means writing two more lines of code.
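A minimal sketch of what such a plain loop might look like. The names RunFilter, KeepRuns, runStart and minRun are invented for this example; it tracks only the start index of the current run rather than a separate group index, which is one of several ways to do it.

```csharp
using System;
using System.Collections.Generic;

public static class RunFilter
{
    // Hypothetical helper: returns the values of n that sit in a run of
    // at least minRun equal adjacent values, in their original order.
    public static List<int> KeepRuns(int[] n, int minRun)
    {
        var result = new List<int>();
        var runStart = 0; // index where the current run began

        // i runs one past the end so the final run is also closed.
        for (var i = 1; i <= n.Length; i++)
        {
            // A run ends at the end of the array or when the value changes.
            if (i == n.Length || n[i] != n[runStart])
            {
                if (i - runStart >= minRun)
                {
                    for (var j = runStart; j < i; j++)
                        result.Add(n[j]);
                }
                runStart = i;
            }
        }
        return result;
    }
}
```

With the example array and minRun set to 3, this should print the same values as the snippet above when joined with ", ".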
CodePudding user response:
I wouldn't use Linq just to use Linq.
I'd rather suggest using a simple for loop to loop over your input array and populate the output list. To keep track of which number is currently being repeated (if any), I'd use a variable (repeatedNumber) that's initially set to null.
In this approach, a number can only be assigned to repeatedNumber if it fulfills the minimum requirement of repeated items. Hence, for your example input, repeatedNumber would start at null, then eventually be set to 2, then be set to 6, and then be reset to null.
One perhaps good use of Linq here is to check if the minimum requirement of repeated items is fulfilled for a given item in input, by checking the necessary consecutive items in input:
input
.Skip(items up to and including current item)
.Take(minimum requirement of repeated items - 1)
.All(equal to current item)
I'll name this minimum requirement of repeated items repetitionRequirement. (In your question post, repetitionRequirement is 3.)
The logic in the for loop goes as follows:
number = input[i]
- If number is equal to repeatedNumber, it means that the previously repeated item continues being repeated
  - Add number to output
- Otherwise, if the minimum requirement of repeated items is fulfilled for number (i.e. if the repetitionRequirement - 1 items directly following number in input are all equal to number), it means that number is the first instance of a new repeated item
  - Set repeatedNumber equal to number
  - Add number to output
- Otherwise, if repeatedNumber has a value, it means that the previously repeated item just ended its repetition
  - Set repeatedNumber to null
Here is a suggested implementation:
(I'd suggest finding a more descriptive method name)
//using System.Collections.Generic;
//using System.Linq;
public static List<int> GetOutput(int[] input, int repetitionRequirement)
{
    // Number of items that must follow the current one and equal it
    var consecutiveCount = repetitionRequirement - 1;
    var output = new List<int>();
    int? repeatedNumber = null;
    for (var i = 0; i < input.Length; i++)
    {
        var number = input[i];
        if (number == repeatedNumber)
        {
            // The current repetition continues
            output.Add(number);
        }
        else if (i + consecutiveCount < input.Length &&
            input.Skip(i + 1).Take(consecutiveCount).All(num => num == number))
        {
            // First item of a new, sufficiently long repetition
            repeatedNumber = number;
            output.Add(number);
        }
        else if (repeatedNumber.HasValue)
        {
            // The previous repetition just ended
            repeatedNumber = null;
        }
    }
    return output;
}
By calling it with your example input:
var dataSet = new[] { 1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7 };
var output = GetOutput(dataSet, 3);
you get the following output:
{ 2, 2, 2, 6, 6, 6 }