Optimizing C# nested for loops using parallel execution

I need to optimize the code below so it executes faster, whether by using more memory or by running in parallel. It currently takes 2 minutes to process a single record on a Windows 10 64-bit PC with 16 GB RAM.

data1 list array length = 1000
data2 list array length = 100000
data3 list array length = 100

for (int d1 = 0; d1 < data1.Count; d1++)
{
   if (data1[d1].status == "UNMATCHED")
   {
      for (int d2 = 0; d2 < data2.Count; d2++)
      {
         if (data2[d2].status == "UNMATCHED")
         {
            vMatched = false;
            for (int d3 = 0; d3 < data3.Count; d3++)
            {
                if (data3[d3].rule == "rule1")
                {
                  if (data1[d1].value == data2[d2].value)
                  {
                     data1[d1].status = "MATCHED";
                     data2[d2].status = "MATCHED";
                     vMatched = true;
                     break;
                  }    
                }
                else if (data3[d3].rule == "rule2")
                {
                   ...
                }
                else if (data3[d3].rule == "rule100")
                {
                   ...
                }
                 
            }
            if (vMatched)
              break;
         }
      }
   }
}

CodePudding user response：

You can avoid starting the second loop from 0 every time by keeping track of the last "UNMATCHED" index in data2.

It should reduce the complexity.

In the worst case:

Now: 1000 * 100000 * 100 = 10000000000 iterations

New: (1000 + 100000) * 100 = 10100000 iterations
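
One possible reading of this suggestion, as a rough sketch: keep a cursor at the first data2 item that is still "UNMATCHED" and start each scan there instead of at 0. The `Item` class, the `int` value type and the `CursorMatcher` name are assumptions, since the question does not show the real types, and only the rule1 comparison is covered. The bound above is reached only when matches consume data2 roughly in order; if unmatched items remain near the front, the scan still has to walk past them.

```csharp
using System.Collections.Generic;

// Hypothetical record type; the question does not show the real one.
public class Item
{
    public int value;                    // value type is an assumption
    public string status = "UNMATCHED";
}

public static class CursorMatcher
{
    // Pairs each unmatched data1 item with the first unmatched data2 item
    // of equal value (rule1 only). Returns the number of pairs made.
    public static int Match(List<Item> data1, List<Item> data2)
    {
        int matches = 0;
        int firstUnmatched = 0; // every data2 item before this index is already matched

        foreach (Item a in data1)
        {
            if (a.status != "UNMATCHED") continue;

            // Skip the already-matched prefix; the cursor never moves backwards.
            while (firstUnmatched < data2.Count && data2[firstUnmatched].status != "UNMATCHED")
                firstUnmatched++;

            for (int d2 = firstUnmatched; d2 < data2.Count; d2++)
            {
                Item b = data2[d2];
                if (b.status == "UNMATCHED" && a.value == b.value)
                {
                    a.status = "MATCHED";
                    b.status = "MATCHED";
                    matches++;
                    break;
                }
            }
        }
        return matches;
    }
}
```

Calling `CursorMatcher.Match(data1, data2)` gives the same pairs as the original rule1 loop, because it still picks the earliest unmatched data2 item with an equal value.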

CodePudding user response：

First of all, for any kind of performance-oriented programming, avoid using strings; use more appropriate types, like enums or bools, instead. Another recommendation is to profile your code, so you know which parts actually take time.
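
As a small illustration of both points, here is a sketch that swaps the status strings for an enum and times a piece of work with `System.Diagnostics.Stopwatch`. The names `MatchStatus`, `TypedItem` and `Timing` are made up for this example, and a stopwatch is only a coarse stand-in for a real profiler.

```csharp
using System;
using System.Diagnostics;

// Hypothetical enum replacing the "MATCHED"/"UNMATCHED" strings from the question.
public enum MatchStatus { Unmatched, Matched }

public class TypedItem
{
    public int value;                                  // value type is an assumption
    public MatchStatus status = MatchStatus.Unmatched; // compared as an integer, not character by character
}

public static class Timing
{
    // Runs the action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

For instance, `Timing.Measure(() => RunMatching())` would time a (hypothetical) `RunMatching` method that wraps the nested loops, which is enough to compare two variants of the same code.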

In the given example only one rule is shown, so the innermost loop could be eliminated by first checking whether this rule exists and only then proceeding with the matching.
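
A minimal sketch of that check, assuming the entries in data3 expose a `rule` string as in the question (the `Rule` class and the `RuleCheck` helper are hypothetical). It scans data3 once, before the data1/data2 loops, and it only applies if the outcome does not depend on the order in which the rules are tried.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the element type of data3.
public class Rule
{
    public string rule = "";
}

public static class RuleCheck
{
    // Scans data3 once and returns the distinct rule names it contains.
    public static HashSet<string> ActiveRules(IEnumerable<Rule> data3)
    {
        return new HashSet<string>(data3.Select(r => r.rule));
    }
}
```

With `bool rule1Active = RuleCheck.ActiveRules(data3).Contains("rule1");` computed up front, the value comparison inside the data2 loop only needs to test `rule1Active` instead of iterating over data3 for every pair.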

This matching essentially pairs unmatched items with the same value. Whenever problems like this occur, the standard solution is some kind of search structure, like a dictionary, to get better than linear search time. For example:

var data2Dictionary = data2.ToDictionary(d => Tuple.Create(d.value, d.status), d => d);

This should let you drastically decrease the time to find an item with a specific value and status. Keep in mind that the code above will throw if multiple items share the same value and status, and that the dictionary key will not be updated if the item changes value or status.
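
One way to sidestep both caveats, sketched under assumptions: index only the unmatched data2 items by value and keep a queue per key, so duplicates are allowed and a matched item is simply dequeued instead of needing a key update. The `Entry` class, the `int` value type and the `DictionaryMatcher` name are hypothetical, and only the rule1 comparison is handled.

```csharp
using System.Collections.Generic;

// Hypothetical record type; the question does not show the real one.
public class Entry
{
    public int value;                    // value type is an assumption
    public string status = "UNMATCHED";
}

public static class DictionaryMatcher
{
    // Pairs each unmatched data1 item with the earliest unmatched data2 item
    // of equal value. Returns the number of pairs made.
    public static int Match(List<Entry> data1, List<Entry> data2)
    {
        // One pass over data2: group the unmatched items by value, in list order.
        var unmatchedByValue = new Dictionary<int, Queue<Entry>>();
        foreach (Entry b in data2)
        {
            if (b.status != "UNMATCHED") continue;
            if (!unmatchedByValue.TryGetValue(b.value, out Queue<Entry> queue))
            {
                queue = new Queue<Entry>();
                unmatchedByValue[b.value] = queue;
            }
            queue.Enqueue(b);
        }

        // One pass over data1: each lookup is a hash lookup instead of a scan of data2.
        int matches = 0;
        foreach (Entry a in data1)
        {
            if (a.status != "UNMATCHED") continue;
            if (unmatchedByValue.TryGetValue(a.value, out Queue<Entry> candidates) && candidates.Count > 0)
            {
                Entry b = candidates.Dequeue(); // removed from the index, so it cannot be reused
                a.status = "MATCHED";
                b.status = "MATCHED";
                matches++;
            }
        }
        return matches;
    }
}
```

This does roughly data1.Count + data2.Count steps for rule1 rather than their product. Rules that are not plain equality checks would need their own index or a fallback to scanning.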