Home > Enterprise >  how to do inner join,left outerjoin between collections in C#
how to do inner join,left outerjoin between collections in C#

Time:10-18

I have two collection with more number of items like a million. I want to compare them to find out the below in C#

  1. how to find the matching items between Orders1 and Orders2
  2. how to find the Orders1 which are in Orders2
  3. how to find the Orders2 which are in Orders1
  4. how to find the Orders1 which are not in Orders2
  5. how to find the Orders2 which are not in Orders1
public class Order
 {
     public int OrderID { get; set; }       
     public string OrderDate { get; set; }
     public string StoreID { get; set; }
     public float TotalPrice { get; set; }
 }
 
static void Main(string[] args)
{
    List<Order> Orders1 = new List<Order>();
    Orders1.Add(new Order { OrderID = 1, StoreID = "A01", TotalPrice = 12.1F });
    Orders1.Add(new Order { OrderID = 2, StoreID = "A02", TotalPrice = 12.2F });
    Orders1.Add(new Order { OrderID = 3, StoreID = "A03", TotalPrice = 12.3F });
    Orders1.Add(new Order { OrderID = 4, StoreID = "A04", TotalPrice = 12.4F });
    Orders1.Add(new Order { OrderID = 1, StoreID = "A01", TotalPrice = 12.1F });
    Orders1.Add(new Order { OrderID = 5, StoreID = "A05", TotalPrice = 12.5F });


    List<Order> Orders2 = new List<Order>();
    Orders2.Add(new Order { OrderID = 1, StoreID = "A01", TotalPrice = 12.1F });
    Orders2.Add(new Order { OrderID = 2, StoreID = "A02", TotalPrice = 12.2F });
    Orders2.Add(new Order { OrderID = 2, StoreID = "A03", TotalPrice = 12.2F });
    Orders2.Add(new Order { OrderID = 1, StoreID = "A01", TotalPrice = 12.1F });
    Orders2.Add(new Order { OrderID = 5, StoreID = "A05", TotalPrice = 12.5F });
    Orders2.Add(new Order { OrderID = 6, StoreID = "A06", TotalPrice = 12.6F });
    Orders2.Add(new Order { OrderID = 7, StoreID = "A07", TotalPrice = 12.7F });
}

CodePudding user response:

   var result = from x in Orders1
                       join y in Orders2 on x.OrderID equals y.OrderID
                       select x; //1


   Orders1.Where(x=>Orders2.Any(y=>y.OrderID == x.OrderID )); //2

   Orders2.Where(x=>Orders1.Any(y=>y.OrderID == x.OrderID )); //3

   Orders1.Where(p => Orders2.All(p2 => p2.OrderID != p.OrderID)); //4

   Orders2.Where(p => Orders1.All(p2 => p2.OrderID != p.OrderID)); //5

CodePudding user response:

Is performance important? Please keep in mind that you can check all this in O(mn), where m and n are the sizes of Orders1 and Orders2, respectively.

  1. If performance is not that important, then go for a simple 2 cycles approach:
  • Have 5 lists (one per each one of your cases), and classify your orders accordingly.
  1. If performance IS an important factor, then:

In both scenarios, you should (this is not a must if you have enough memory) sort both lists by some criteria (probably OrderID, again IComparable could help), and when you classify the orders, delete them both Order lists, so you don't end up having 2 lists of millions 5 lists of classified orders. Again, learn memory management.

Based on my personal experience, don't go for lambda expressions. If you claim you have more than a million entries in each list, you might run into memory issues.

My personal suggestion: Learn to implement IComparable and create your own insert. I think the lesson of this exercise is to learn how to attack this type of problems when you have millions of rows of data.

CodePudding user response:

Take a look at C# join clause. contains full documentation of how to implement

  • Inner join
  • Group join
  • Left outer join

and then manipulate your queries with Join to extract the data.

var intersecting = from o1 in Orders1
join o2 in Orders2 on o1.OrderID equals o2.OrderID
select o1;
  • Related