I'm a beginner C# coder and I have written a small piece of code which counts how many times each distinct character occurs in a string and returns a List. I want to make this code much more efficient.
string s1 = "noob";
List<int> count = new();
foreach(var item in s1.Distinct()){
count.Add(s1.Count(x => x == item));
}
The end result is exactly what I want, a List of int with values of 1,2,1.
However I'm fully aware that this is really inefficient (since I have to iterate through the string at each character). I've tried using the LINQ GroupBy method but I was unable to assign it to either a list or to an int[] for some reason. (I don't know how to convert from IGrouping to array..)
My logic was in theory simple. Group all characters and for each group return how many of those characters are present and add that value to the array.
It pisses me off to no end that I'm unable to figure this one out on my own. I welcome any tips or guidance on this matter and thank you very much in advance.
CodePudding user response:
You can use GroupBy instead of Distinct (we group equal characters, select Count from each group and, finally, materialize as a list):
string s1 = "noob";
List<int> list = s1
.GroupBy(c => c)
.Select(group => group.Count())
.ToList(); // ToArray(); if you want an array
Let's have a look:
Console.Write(string.Join(", ", list));
Outcome:
1, 2, 1
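For reference, the same idea wrapped up as a complete program might look like the sketch below. The class and method names (CharCounter, CountPerCharacter) are made up for illustration and are not from the answer above; it returns an array rather than a list, since the question asked about both.

```csharp
using System;
using System.Linq;

public static class CharCounter
{
    // Hypothetical helper: one count per distinct character,
    // in order of each character's first appearance in the text.
    public static int[] CountPerCharacter(string text) =>
        text.GroupBy(c => c)            // one group per distinct character
            .Select(g => g.Count())     // size of each group
            .ToArray();                 // materialize; ToList() works the same way

    public static void Main()
    {
        int[] counts = CountPerCharacter("noob");
        Console.WriteLine(string.Join(", ", counts)); // 1, 2, 1
    }
}
```

GroupBy walks the string once and buckets the characters as it goes, so this avoids rescanning the whole string for every distinct character.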
CodePudding user response:
"My logic was in theory simple. Group all characters and for each group return how many of those characters are present and add that value to the array."
Good logic
"It pisses me off to no end that I'm unable to figure this one out on my own."
Dmitry's given a good answer, but let's talk a bit about GroupBy, because it's confusing as ****
If you're used to SQL GROUP BY, this isn't much like that - it "stops half way"
In SQL a GROUP BY insists you have something you group on, and for anything you don't group, you have to supply some aggregation. Here are the employee counts by department, and the max salary in each department:
SELECT department, COUNT(*), MAX(salary) FROM emp GROUP BY department
--emp table
department, name, salary
tech, jon, 100000
tech, jane, 120000
sales, tim, 90000
--results
tech, 2, 120000
sales, 1, 90000
LINQ doesn't do this when it groups; it runs the grouping but it gives you all the input data divided up into the groups and doesn't insist you do any aggregates
The same thing there but done by LINQ is:
employees.GroupBy(e => e.Department)
[
{ Key = "tech", self = [ { tech/jon/100000 }, { tech/jane/120000} ] },
{ Key = "sales", self = [ { sales/tim/90000 } ] }
]
That's some pseudo-json to describe what you get from a GroupBy. You get a list of groupings. A grouping is something "like a list", that has a Key property, and can be enumerated. If you ask for the Key you get what was grouped on (the string of the department name). If you enumerate the grouping, you get all the employees that have that Key
var groups = employees.GroupBy(e => e.Department);
foreach(var group in groups){
    Console.WriteLine("Now looking at the contents of the " + group.Key + " group");
    foreach(var employee in group)
        Console.WriteLine(" It's " + employee.Name);
}
Prints:
Now looking at the contents of the Tech group
 It's Jon
 It's Jane
Now looking at the contents of the Sales group
 It's Tim
Because this list of things is.. well.. an enumerable list, you can call other LINQ methods on it
foreach(var group in groups){
    Console.WriteLine("Now looking at the contents of the " + group.Key + " group");
    Console.WriteLine(" The count is " + group.Count());
    Console.WriteLine(" The max salary is " + group.Max(employee => employee.Salary));
}
Now looking at the contents of the Tech group
 The count is 2
 The max salary is 120000
Now looking at the contents of the Sales group
 The count is 1
 The max salary is 90000
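If you want to try the above yourself, here's a rough self-contained sketch. The Employee record is an assumption (it is never actually defined in this answer), and it needs C# 9 or later for the record syntax. Describe is just a made-up helper that builds one summary line per group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Employee type, assumed for illustration only.
public record Employee(string Department, string Name, int Salary);

public static class GroupingDemo
{
    // Builds one summary line per department group.
    public static List<string> Describe(IEnumerable<Employee> employees)
    {
        var lines = new List<string>();
        foreach (var g in employees.GroupBy(e => e.Department))
        {
            // g.Key is the department; g itself enumerates that department's employees.
            lines.Add($"{g.Key}: count {g.Count()}, max salary {g.Max(e => e.Salary)}");
        }
        return lines;
    }

    public static void Main()
    {
        var employees = new List<Employee>
        {
            new Employee("tech", "jon", 100000),
            new Employee("tech", "jane", 120000),
            new Employee("sales", "tim", 90000),
        };
        foreach (var line in Describe(employees))
            Console.WriteLine(line);
        // tech: count 2, max salary 120000
        // sales: count 1, max salary 90000
    }
}
```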
By stopping half way and not forcing you to do aggregates, LINQ leaves itself open for really useful stuff
In short, GroupBy is simply a device that takes a single list:
Aaron
Albert
Bertie
Ben
Charlie
And turns it into a list of lists:
GroupBy(name => name[0]) //first char
A: [Aaron, Albert]
B: [Bertie, Ben]
C: [Charlie]
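As a runnable sketch of that list-to-list-of-lists picture (NameGroups and ByFirstLetter are made-up names, and it assumes every name is non-empty, because name[0] throws on an empty string):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameGroups
{
    // Hypothetical helper: one formatted line per group, e.g. "A: [Aaron, Albert]".
    // Assumes no name is empty, since name[0] would throw on "".
    public static List<string> ByFirstLetter(IEnumerable<string> names) =>
        names.GroupBy(name => name[0])                               // key = first char
             .Select(g => g.Key + ": [" + string.Join(", ", g) + "]") // g enumerates the names
             .ToList();

    public static void Main()
    {
        var names = new[] { "Aaron", "Albert", "Bertie", "Ben", "Charlie" };
        foreach (var line in ByFirstLetter(names))
            Console.WriteLine(line);
        // A: [Aaron, Albert]
        // B: [Bertie, Ben]
        // C: [Charlie]
    }
}
```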
Just remember, that to avoid confusing yourself, name your variables well in your lambdas
list_of_names
.GroupBy(name => name[0])
.Select(list_of_names_with_same_key => list_of_names_with_same_key.Count())
Remind yourself that what you're selecting is a list within a list. You might not want to use a name as long as list_of_names_with_same_key for that; I use g for "grouping" and I remember that a grouping is some enumerable list with a Key property that defines some value common to all
It's not so useful in this example, but also note that GroupBy has more overloads than the basic one, though the docs are really tough to read. It's quite common to use the overload that lets you choose a different value for the grouped items:
//produces something like a List<string department, List<Employee>>
employees.GroupBy(emp => emp.Department)
//produces something like a List<string department, List<int>>
employees.GroupBy(emp => emp.Department, emp => emp.Salary)
The second one produces just salaries in the list of grouped items, which makes it easier to e.g.:
Console.WriteLine(" The max salary is " + g.Max()); //g is a list of ints of salaries, not full employees
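A small sketch of that element-selector overload in action. To keep it short it uses value tuples in place of an Employee class, which is an assumption made only for this illustration, and SalaryGroups / MaxSalaryByDepartment are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SalaryGroups
{
    // Hypothetical helper: groups by department but keeps only the salary
    // in each group, so g is a sequence of ints and g.Max() needs no lambda.
    public static Dictionary<string, int> MaxSalaryByDepartment(
        IEnumerable<(string Department, int Salary)> employees) =>
        employees.GroupBy(e => e.Department, e => e.Salary)
                 .ToDictionary(g => g.Key, g => g.Max());

    public static void Main()
    {
        var employees = new (string Department, int Salary)[]
        {
            ("tech", 100000),
            ("tech", 120000),
            ("sales", 90000),
        };
        foreach (var pair in MaxSalaryByDepartment(employees))
            Console.WriteLine(pair.Key + " max salary is " + pair.Value);
    }
}
```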
And finally, it might help demystify SelectMany too - SelectMany is basically the opposite of GroupBy. If GroupBy turns a 1 dimensional list into a 2 dimensional thing, SelectMany turns a 2D thing into a 1D thing. The thing you give to SelectMany should be a list within a list, and you get a single list back out (outerlist.SelectMany(someInnerList) --> all the inner lists concatted together).
You could literally:
employees.GroupBy(emp => emp.Department).SelectMany(g => g)
and end up back where you started
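Here's a quick sketch of that round trip, with made-up names (RoundTrip, Regroup). One caveat worth knowing: you get all the same items back, but if equal keys weren't adjacent in the input, the items come back clustered by group rather than in their original order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoundTrip
{
    // Hypothetical helper: group by first letter, then flatten straight back.
    // Assumes no word is empty, since w[0] would throw on "".
    public static List<string> Regroup(IEnumerable<string> words) =>
        words.GroupBy(w => w[0])   // 1D -> 2D
             .SelectMany(g => g)   // 2D -> 1D
             .ToList();

    public static void Main()
    {
        // Note the input is deliberately not sorted by first letter.
        var words = new[] { "Aaron", "Bertie", "Albert", "Ben", "Charlie" };
        Console.WriteLine(string.Join(", ", Regroup(words)));
        // Aaron, Albert, Bertie, Ben, Charlie
    }
}
```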
CodePudding user response:
var counts = "noob".GroupBy(c => c, (_, cs) => cs.Count());
Console.WriteLine("{0}", string.Join(",", counts)); // prints '1,2,1'