LINQ query in which I can select multiple columns? - LINQ

Time:10-18

Trying to select multiple columns after a group by:

Two given Lists:

        List<Unit> unitList = new List<Unit>()
        {
        new Unit{ UnitId = 10, UnitName = "ETA"},
        new Unit{ UnitId = 20, UnitName = "FSI"},
         new Unit{ UnitId = 30, UnitName = "ECS"}
        };

        List<Employee> employeeList = new List<Employee>()
        {
         new Employee {EmployeeName = "John",UnitId=10,ProjectCode="ETAMYS",Salary=30000,JobLevel=3,JoiningDate=new DateTime(2014,3,5)},
         new Employee {EmployeeName = "Jack",UnitId=10,ProjectCode="ETACHN",Salary=35000,JobLevel=3,JoiningDate=new DateTime(2011,3,5)},
         new Employee {EmployeeName = "Albus",UnitId=10,ProjectCode="ETACHN",Salary=15000,JobLevel=4,JoiningDate=new DateTime(2011,3,5)},
         new Employee {EmployeeName = "Ron",UnitId=20,ProjectCode="FSIAUS",Salary=10000,JobLevel=4,JoiningDate=new DateTime(2007,2,5)},
         new Employee {EmployeeName = "Iwa",UnitId=20,ProjectCode="FSIAUS",Salary=15000,JobLevel=4,JoiningDate=new DateTime(2007,2,5)},
         new Employee {EmployeeName = "Albert",UnitId=30,ProjectCode=null,Salary=20000,JobLevel=3,JoiningDate=new DateTime(2005,1,5)}
        };

Required result: Display the EmployeeName, UnitId, Salary and ProjectCode of those employees who draw minimum salary in each unit.

What I have tried:

var query15 = (from emp in employeeList
                       group emp by emp.UnitId
                       into g
                       select new 
                       {
                           empUnits = g.Key,
                           empNames = g.Select(s=> s.EmployeeName),
                           empSalaries = g.Min(s => s.Salary),
                           projectCodes = g.Select(s=> s.ProjectCode)
                       }).ToList();           

        foreach (var unit in query15)
        {
            Console.WriteLine(unit.empNames + " " + unit.empUnits + " " + unit.empSalaries + " " + unit.projectCodes);
        }

Mine prints the minimum salary and the corresponding unit correctly, but it does not print the other two columns, empNames and projectCodes, correctly, as you can see in the result below:

[Image: screenshot of the console output]

I am new to LINQ and working through various assignments, and I got stuck on the ones that require several columns to be printed after a group by. Thanks in advance for the help!

CodePudding user response:

Short form: use string.Join() to turn your collections into strings you can display.

OK, now for the in-depth version. Lecture mode engaged.


The result of the .Select() method is a lazily evaluated IEnumerable<T> (an internal iterator object, not a collection) that holds the following data:

  • The selector (s => s.EmployeeName or s => s.ProjectCode).

  • A reference to the source enumeration (g in this case).

    Which itself is a Grouping<TKey, TElement> instance (here Grouping<int, Employee>) which holds:

    • The key value (UnitId value for the group).
    • A collection of references to grouped instances (Employee objects).

All of that exists so that your code can iterate through the names and project codes on demand, which is not just a lot of overhead in this case, it's also needlessly (and confusingly) dynamic. If you change the EmployeeName or ProjectCode of any of your employee records and then enumerate query15's objects again, that change will be reflected. But if you change the UnitId on any of them, the change will not be reflected, because the grouping was already completed when ToList() ran.

Normally this isn't a problem as long as you're expecting it, but it's not immediately obvious from the code that this is the case.
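To make that behaviour concrete, here is a minimal sketch, assuming a cut-down Employee class with only the two properties used and a hypothetical DeferredDemo wrapper (neither is from the question). It shows a name change showing up in an already-built query while a UnitId change does not move the employee to another group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: cut-down stand-in for the question's Employee class.
class Employee
{
    public string EmployeeName { get; set; }
    public int UnitId { get; set; }
}

static class DeferredDemo
{
    // Returns the names of the first group before and after mutating an employee.
    public static (string Before, string After) Run()
    {
        var employees = new List<Employee>
        {
            new Employee { EmployeeName = "John", UnitId = 10 },
            new Employee { EmployeeName = "Ron", UnitId = 20 }
        };

        // ToList() fixes the groups, but each empNames is still a lazy Select.
        var query = (from emp in employees
                     group emp by emp.UnitId into g
                     select new
                     {
                         empUnits = g.Key,
                         empNames = g.Select(s => s.EmployeeName)
                     }).ToList();

        string before = string.Join(", ", query[0].empNames);

        employees[0].EmployeeName = "Johnny"; // picked up on the next enumeration
        employees[0].UnitId = 20;             // ignored: grouping already happened

        string after = string.Join(", ", query[0].empNames);
        return (before, after);
    }

    static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine(before); // John
        Console.WriteLine(after);  // Johnny
    }
}
```

The first group still belongs to unit 10 and still contains the same Employee object, so only the property read through the lazy Select changes.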

You can simplify the whole mess by converting the Select enumerations to concrete collections like this:

var query15 = 
(
    from emp in employeeList
    group emp by emp.UnitId
    into g
    select new 
    {
        empUnits = g.Key,
        empNames = g.Select(s=> s.EmployeeName).ToArray(),
        empSalaries = g.Min(s => s.Salary),
        projectCodes = g.Select(s=> s.ProjectCode).ToArray()
    }
).ToList();

With your sample data that will result in double-ups in the projectCodes array, so perhaps change that line to:

       projectCodes = g.Select(s => s.ProjectCode).Distinct().ToArray()

Now you have a collection of concrete objects with no deferred execution, no captured references and so on. It's an object graph that isn't going to mysteriously change on you, and it takes less time to work with the empNames and projectCodes collections in future since the Select (and Distinct if you added that) don't have to be processed again.


The next issue is the display. When you add strings to objects (string concatenation) the compiler has to call ToString() on the objects to get a string representation. For the vast majority of types this will result in some version of the type's name rather than some best-guess attempt at representing the content of the object.

To get a useful string you'll need to do a little work yourself. Fortunately the string.Join() method was specifically created for this kind of scenario:

Console.WriteLine
(
    string.Join(", ", unit.empNames) + " " +
    unit.empUnits + " " +
    unit.empSalaries + " " +
    string.Join(", ", unit.projectCodes)
);
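The difference between concatenating a collection and joining it can be seen in isolation. This is a small standalone sketch with made-up names in a hypothetical ToStringDemo class, not part of the original query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ToStringDemo
{
    // Concatenating an array calls ToString(), which yields the type name.
    public static string Concatenated(string[] names) => "Names: " + names;

    // string.Join enumerates the sequence and joins the elements.
    public static string Joined(IEnumerable<string> names) =>
        "Names: " + string.Join(", ", names);

    static void Main()
    {
        string[] names = { "John", "Jack" };
        IEnumerable<string> lazy = names.Select(n => n.ToUpper());

        Console.WriteLine(Concatenated(names)); // Names: System.String[]
        Console.WriteLine("Lazy: " + lazy);     // some runtime-specific iterator type name
        Console.WriteLine(Joined(lazy));        // Names: JOHN, JACK
    }
}
```

string.Join accepts any IEnumerable<string>, so it works on both the array and the lazy Select result.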

Of course you could leave your original query15 as is and use string.Join() on the enumerations, as long as you're happy with the outcome. At that point you might as well defer enumeration of the entire query15 rather than making it partially concrete with ToList().

(Partially concrete because the objects it contains are themselves dynamic and complex.)

The moral of the story? C# and LINQ are great tools, but if you don't understand what's happening under the hood it will mess you up.

CodePudding user response:

Now that I've (over-)answered the obvious part of the question, here's the bit I missed:

Required result: Display the EmployeeName, UnitId, Salary and ProjectCode of those employees who draw minimum salary in each unit.

That puts a whole other spin on things, doesn't it?

What we need then is to group the employees by unit, find the lowest salary value for each unit and select any/all employees who match that salary. There could be multiple, so we'll just list out all of the employees whose salary matches the lowest salary for each unit.

Here's how I'd do that (with comments that I generally wouldn't write when not trying to explain code to others):

var query15 = 
(
    // group our employee list by UnitId
    from emp in employeeList
    group emp by emp.UnitId into grp

    // Find the lowest salary value
    let minSalary = grp.Min(e => e.Salary)

    // Now filter the grouped employees by that salary
    from emp in grp
    where emp.Salary == minSalary
    select emp
);

From there you can just print out the details in a simple foreach loop. I like string interpolation for this:

foreach (var emp in query15)
{
    Console.WriteLine($"{emp.EmployeeName}\t{emp.UnitId}\t{emp.Salary}\t{emp.ProjectCode}");
}
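If you prefer method syntax, the same idea can be sketched with GroupBy and SelectMany. This is an illustrative version, assuming Salary is an int and using a cut-down Employee class and a hypothetical LowestPaidPerUnit helper; the question does not show the real class definition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: property types are guessed from the question's initializers.
class Employee
{
    public string EmployeeName { get; set; }
    public int UnitId { get; set; }
    public string ProjectCode { get; set; }
    public int Salary { get; set; }
}

static class MinSalaryDemo
{
    // For each unit, keep every employee whose salary equals the unit's minimum.
    public static List<Employee> LowestPaidPerUnit(IEnumerable<Employee> employees) =>
        employees
            .GroupBy(e => e.UnitId)
            .SelectMany(g =>
            {
                int minSalary = g.Min(e => e.Salary);
                return g.Where(e => e.Salary == minSalary);
            })
            .ToList();

    static void Main()
    {
        var employeeList = new List<Employee>
        {
            new Employee { EmployeeName = "John", UnitId = 10, ProjectCode = "ETAMYS", Salary = 30000 },
            new Employee { EmployeeName = "Jack", UnitId = 10, ProjectCode = "ETACHN", Salary = 35000 },
            new Employee { EmployeeName = "Albus", UnitId = 10, ProjectCode = "ETACHN", Salary = 15000 },
            new Employee { EmployeeName = "Ron", UnitId = 20, ProjectCode = "FSIAUS", Salary = 10000 },
            new Employee { EmployeeName = "Iwa", UnitId = 20, ProjectCode = "FSIAUS", Salary = 15000 },
            new Employee { EmployeeName = "Albert", UnitId = 30, ProjectCode = null, Salary = 20000 }
        };

        // With this data the rows are Albus, Ron and Albert, in that order.
        foreach (var emp in LowestPaidPerUnit(employeeList))
            Console.WriteLine($"{emp.EmployeeName}\t{emp.UnitId}\t{emp.Salary}\t{emp.ProjectCode}");
    }
}
```

Filtering with Where rather than taking the first element keeps every employee who ties for the lowest salary in a unit.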

Alternatively, and because I like pretty things, let's print a header for each unit with the unit details (id, name, min salary):

var query15a = 
(
    from emp in employeeList
    join unit in unitList on emp.UnitId equals unit.UnitId

    group emp by unit into grp

    let minSalary = grp.Min(e => e.Salary)

    select new 
    { 
        grp.Key.UnitId,
        grp.Key.UnitName,
        minSalary, 
        employees = grp.ToArray()
    }
);

foreach (var unit in query15a)
{
    Console.WriteLine($"Unit: {unit.UnitId} - {unit.UnitName}, Salary: ${unit.minSalary}");
    foreach (var emp in unit.employees)
        Console.WriteLine($"\t{emp.EmployeeName,-40}  {emp.ProjectCode}");
    Console.WriteLine();
}

Much prettier... if your employee names are always less than 40 characters. Adjust as required :P
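One way to avoid hard-coding the 40 is to measure the longest name first. The alignment value inside an interpolated string has to be a constant, so this sketch uses PadRight instead; FormatRows and the tuple shape are hypothetical helpers for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ColumnDemo
{
    // Hypothetical helper: pads every name to the width of the longest one.
    public static List<string> FormatRows(IList<(string Name, string ProjectCode)> rows)
    {
        // The widest name decides the column width; 0 when there are no rows.
        int width = rows.Count == 0 ? 0 : rows.Max(r => r.Name.Length);
        return rows
            .Select(r => $"{r.Name.PadRight(width)}  {r.ProjectCode}")
            .ToList();
    }

    static void Main()
    {
        var rows = new List<(string Name, string ProjectCode)>
        {
            ("John", "ETAMYS"),
            ("Albus", "ETACHN")
        };

        foreach (var line in FormatRows(rows))
            Console.WriteLine("\t" + line);
    }
}
```

The project codes then line up in one column regardless of how long the names are.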

CodePudding user response:

I'm not sure whether this leads to any performance issues, but what you are asking for can be achieved like this. You can add the other properties you need to the selected object.

var foo = employeeList.GroupBy(g => g.UnitId).Select(grp =>
    // lowest-paid employee of each unit (only the first one if several tie)
    employeeList.Where(e => e.UnitId == grp.Key).Select(e => new Employee {
        Salary = e.Salary,
        UnitId = e.UnitId
    }).OrderBy(e => e.Salary).FirstOrDefault());

foreach (var fo in foo)
{
    Console.WriteLine(fo.Salary + "|" + fo.UnitId);
}