Count the number of different characters in a string and return an int array

I'm a beginner C# coder and I have written a small piece of code which counts how many times each different character appears in a string and returns a List. I want to make this code much more efficient.

string s1 = "noob";
List<int> count = new();
foreach(var item in s1.Distinct()){
    count.Add(s1.Count(x => x == item));
}

The end result is exactly what I want, a List of int with values of 1,2,1.

However I'm fully aware that this is really inefficient (since I have to iterate through the string at each character). I've tried using the LINQ GroupBy method but I was unable to assign it to either a list or to an int[] for some reason. (I don't know how to convert from IGrouping to array..)

My logic was in theory simple. Group all characters and for each group return how many of those characters are present and add that value to the array.

It pisses me off to no end that I'm unable to figure this one out on my own. I welcome any tips or guidance on this matter and thank you very much in advance.

CodePudding user response：

You can use GroupBy instead of Distinct (we group equal characters, select Count from each group and, finally, materialize as a list):

string s1 = "noob";

List<int> list = s1
  .GroupBy(c => c)
  .Select(group => group.Count())
  .ToList(); // ToArray(); if you want an array

Let's have a look:

Console.Write(string.Join(", ", list));

Outcome:

1, 2, 1
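If you would rather see the single pass spelled out without LINQ, here is a rough sketch of a different technique (a plain loop with a Dictionary, not what GroupBy does internally). The names indexOf, counts and result are made up for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string s1 = "noob";

// Maps each character to the position of its counter in 'counts'
// (hypothetical helper names, chosen for this sketch only)
var indexOf = new Dictionary<char, int>();
var counts = new List<int>();

foreach (char c in s1)
{
    if (indexOf.TryGetValue(c, out int i))
    {
        counts[i]++;               // seen before: bump its counter
    }
    else
    {
        indexOf[c] = counts.Count; // first time: remember where its counter lives
        counts.Add(1);
    }
}

int[] result = counts.ToArray();
Console.WriteLine(string.Join(", ", result)); // prints 1, 2, 1
```

The string is walked once, and the counters stay in order of first appearance, which matches the order the GroupBy version produces.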

CodePudding user response：

"My logic was in theory simple. Group all characters and for each group return how many of those characters are present and add that value to the array."

Good logic

"It pisses me off to no end that I'm unable to figure this one out on my own."

Dmitry's given a good answer, but let's talk a bit about GroupBy, because it's confusing as ****

If you're used to SQL GROUP BY, this isn't much like that - it "stops half way"

In SQL, a GROUP BY insists you have something you group on, and for anything you don't group on, you have to supply some aggregation. Here are the employee counts by department, and the max salary in each department:

SELECT department, COUNT(*), MAX(salary) FROM emp GROUP BY department

dept, name, salary
tech, jon, 100000
tech, jane, 120000
sales, tim, 90000

--results
tech, 2, 120000
sales, 1, 90000

LINQ doesn't do this when it groups; it runs the grouping but it gives you all the input data divided up into the groups and doesn't insist you do any aggregates

The same thing there but done by LINQ is:

employees.GroupBy(e => e.Department)

[
  { Key = "tech", self = [ { tech/jon/100000 }, { tech/jane/120000} ] },
  { Key = "sales", self = [ { sales/tim/90000 } ] }
]

That's some pseudo-json to describe what you get from a GroupBy. You get a list of groupings. A grouping is something "like a list", that has a Key property, and can be enumerated. If you ask for the Key you get what was grouped on (the string of the department name). If you enumerate the grouping, you get all the employees that have that Key

var groups = employees.GroupBy(e => e.Department);

foreach(var group in groups){
  Console.WriteLine("Now looking at the contents of the " + group.Key + " group");

  foreach(var employee in group)
    Console.WriteLine("  It's " + employee.Name);
}

Prints:

Now looking at the contents of the Tech group
  It's Jon
  It's Jane
Now looking at the contents of the Sales group
  It's Tim

Because this list of things is.. well.. an enumerable list, you can call other LINQ methods on it

foreach(var group in groups){
  Console.WriteLine("Now looking at the contents of the " + group.Key + " group");

  Console.WriteLine("  The count is " + group.Count());
  Console.WriteLine("  The max salary is " + group.Max(employee => employee.Salary));
}

Now looking at the contents of the Tech group
  The count is 2
  The max salary is 120000
Now looking at the contents of the Sales group
  The count is 1
  The max salary is 90000
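If you want something you can paste and run, here is a minimal sketch that reproduces the SQL result from earlier with LINQ. The array of named tuples is only an assumption standing in for whatever Employee type you really have, and summary is a name made up for this example:

```csharp
using System;
using System.Linq;

// Hypothetical sample data: named tuples stand in for a real Employee class
var employees = new[]
{
    (Department: "tech",  Name: "jon",  Salary: 100000),
    (Department: "tech",  Name: "jane", Salary: 120000),
    (Department: "sales", Name: "tim",  Salary: 90000),
};

// GroupBy only splits the data up; the Select is where we choose our aggregates
var summary = employees
    .GroupBy(e => e.Department)
    .Select(g => (Department: g.Key, Count: g.Count(), MaxSalary: g.Max(e => e.Salary)))
    .ToList();

foreach (var row in summary)
    Console.WriteLine(row.Department + ", " + row.Count + ", " + row.MaxSalary);
// prints:
// tech, 2, 120000
// sales, 1, 90000
```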

By stopping half way and not forcing you to do aggregates, LINQ leaves itself open for really useful stuff

In short, GroupBy is simply a device that takes a single list:

Aaron
Albert
Bertie
Ben
Charlie

And turns it into a list of lists:

GroupBy(name => name[0]) //first char

A: [Aaron, Albert]
B: [Bertie, Ben]
C: [Charlie]
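A runnable sketch of that names example, in case you want to poke at it (names and byInitial are just illustrative variable names):

```csharp
using System;
using System.Linq;

var names = new[] { "Aaron", "Albert", "Bertie", "Ben", "Charlie" };

// Each grouping has a Key (the first char) and can be enumerated for its names
var byInitial = names.GroupBy(name => name[0]).ToList();

foreach (var g in byInitial)
    Console.WriteLine(g.Key + ": [" + string.Join(", ", g) + "]");
// prints:
// A: [Aaron, Albert]
// B: [Bertie, Ben]
// C: [Charlie]
```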

Just remember: to avoid confusing yourself, name your variables well in your lambdas

list_of_names
  .GroupBy(name => name[0])
  .Select(list_of_names_with_same_key => list_of_names_with_same_key.Count())

Remind yourself that what you're selecting is a list within a list. You might not want to use a name as long as list_of_names_with_same_key for that; I use g for "grouping", and I remember that a grouping is an enumerable list with a Key property that holds the value common to all its items

It's not so useful in this example, but also note that GroupBy has more overloads than the basic one, though the docs are really tough to read. It's quite common to use the overload that lets you choose a different value for the grouped items:

//produces something like a List<string department, List<Employee>> 
employees.GroupBy(emp => emp.Department) 

//produces something like a List<string department, List<int>> 
employees.GroupBy(emp => emp.Department, emp => emp.Salary)

The second one produces just salaries in the list of grouped items, which makes it easier to e.g.:

Console.WriteLine("  The max salary is " + g.Max()); //g is a list of ints of salaries, not full employees
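Here is a small self-contained sketch of that two-argument overload. As before, the tuple array is assumed sample data and not a real Employee class, and salariesByDepartment is a name invented for the example:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for real employees
var employees = new[]
{
    (Department: "tech",  Name: "jon",  Salary: 100000),
    (Department: "tech",  Name: "jane", Salary: 120000),
    (Department: "sales", Name: "tim",  Salary: 90000),
};

// First lambda picks the key, second lambda picks what ends up inside each group
var salariesByDepartment = employees
    .GroupBy(emp => emp.Department, emp => emp.Salary)
    .ToList();

foreach (var g in salariesByDepartment)
    Console.WriteLine(g.Key + ": max " + g.Max() + ", all [" + string.Join(", ", g) + "]");
// prints:
// tech: max 120000, all [100000, 120000]
// sales: max 90000, all [90000]
```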

And finally, it might help demystify SelectMany too: SelectMany is basically the opposite of GroupBy. If GroupBy turns a 1 dimensional list into a 2 dimensional thing, SelectMany turns a 2D thing into a 1D thing. The thing you give to SelectMany should be a list within a list, and you get a single list back out (outerlist.SelectMany(someInnerList) --> all the inner lists concatenated together).

You could literally:

employees.GroupBy(emp => emp.Department).SelectMany(g => g)

and end up back where you started (the same items in one flat list, though now ordered group by group)
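A quick sketch of that round trip, using a deliberately shuffled list of names (assumed sample data) so you can see that the flattened result comes out group by group instead of in the original order:

```csharp
using System;
using System.Linq;

// Names deliberately interleaved so the regrouping is visible
var names = new[] { "Aaron", "Bertie", "Albert", "Charlie", "Ben" };

// GroupBy makes the list 2D, SelectMany flattens it back to 1D
var flattened = names
    .GroupBy(name => name[0])
    .SelectMany(g => g)
    .ToList();

Console.WriteLine(string.Join(", ", flattened));
// prints: Aaron, Albert, Bertie, Ben, Charlie
```

Groups come out in order of the first appearance of their key, and items inside a group keep their original relative order.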

CodePudding user response：

var counts = "noob".GroupBy(c => c, (_, cs) => cs.Count());
Console.WriteLine("{0}", string.Join(",", counts)); // prints '1,2,1'