I'm trying to get into multithreading by implementing matrix multiplication, and my problem is how to get all the sub-matrices of a matrix.
My matrix variable is an int[,].
For example, if I have a 100 x 100 matrix, how would I get 10 sub-matrices of 10 x 10? And is it possible to let the user choose how many equal parts to cut the matrix into, even if the matrix is not square, e.g. 400 x 300?
Is it even the right approach to calculate on the sub-matrices and then add the results together when done?
CodePudding user response:
"how would I get 10 sub-matrices of 10 x 10"
You would do a double loop, copying each value from the original matrix to the new sub matrix.
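As a minimal sketch of that double loop, assuming the matrix is an int[,] as in the question: the class name SubMatrixSketch, the method name GetSubMatrix, and its parameters are hypothetical and chosen only for illustration. The caller is assumed to pass a block that lies fully inside the source matrix.

```csharp
using System;

public static class SubMatrixSketch
{
    // Copies a rows x cols block out of source, starting at (rowStart, colStart).
    // Hypothetical helper: no bounds checking beyond what the array itself does.
    public static int[,] GetSubMatrix(int[,] source, int rowStart, int colStart, int rows, int cols)
    {
        var block = new int[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                block[r, c] = source[rowStart + r, colStart + c];
            }
        }
        return block;
    }
}
```

Calling GetSubMatrix(matrix, 10, 20, 10, 10) would return the block covering rows 10 to 19 and columns 20 to 29. Note that a 100 x 100 matrix contains 100 non-overlapping 10 x 10 blocks, so collecting all of them means stepping both start indices in increments of 10.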
"Is it even the right way to do it, by calculating on the sub-matrices and then adding them together when done?"
The normal way to multiply matrices is with a triple loop, as shown in this answer. It should be fairly trivial to convert the outer loop to a Parallel.For loop, since each iteration of the outer loop computes its own part of the result independently of the others. This avoids any need to process individual sub-matrices, and lets the framework deal with partitioning the work.
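A rough sketch of the triple loop with the outer loop parallelized might look like the following. The class and method names are made up for this example, and overflow of int is ignored. Each iteration of the outer loop writes only to its own row of the result, so no locking is needed.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelMultiplySketch
{
    // Multiplies a (rows x inner) by b (inner x cols); works for non-square matrices too.
    public static int[,] Multiply(int[,] a, int[,] b)
    {
        int rows = a.GetLength(0);
        int inner = a.GetLength(1);
        int cols = b.GetLength(1);
        if (b.GetLength(0) != inner)
            throw new ArgumentException("Inner dimensions must match.");

        var result = new int[rows, cols];

        // The framework decides how to split the row range across worker threads.
        Parallel.For(0, rows, i =>
        {
            for (int j = 0; j < cols; j++)
            {
                int sum = 0;
                for (int k = 0; k < inner; k++)
                {
                    sum += a[i, k] * b[k, j];
                }
                result[i, j] = sum;
            }
        });

        return result;
    }
}
```

Because the partitioning is done over rows, this also covers the 400 x 300 case from the question without any manual splitting.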
However, things like this are typically fairly cache sensitive. A matrix will be stored in memory as sequential values, either row or column major. Accessing sequential values will be very cache friendly, but accessing non-sequential values will not be. So you might want to copy a full row/column to a temporary array to ensure all subsequent accesses are sequential. If using a parallel loop you should probably use one of the overloads that give you a thread-local array to use. There are more things one can do with cache optimizations and SIMD intrinsics, but that is probably best left as a later exercise.
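One way this could be sketched, assuming the int[,] from the question (which C# stores row by row): parallelize over the columns of the second matrix and copy each column into a reusable buffer first. The names below are hypothetical, and the buffer is created by the localInit delegate of the Parallel.For overload that carries local state, so each worker reuses its own array instead of allocating one per column.

```csharp
using System;
using System.Threading.Tasks;

public static class CachedColumnMultiplySketch
{
    public static int[,] Multiply(int[,] a, int[,] b)
    {
        int rows = a.GetLength(0);
        int inner = a.GetLength(1);
        int cols = b.GetLength(1);
        if (b.GetLength(0) != inner)
            throw new ArgumentException("Inner dimensions must match.");

        var result = new int[rows, cols];

        Parallel.For<int[]>(0, cols,
            () => new int[inner],            // one scratch buffer per worker
            (j, loopState, column) =>
            {
                // Copy column j of b once, so the inner loop reads it sequentially.
                for (int k = 0; k < inner; k++)
                {
                    column[k] = b[k, j];
                }

                for (int i = 0; i < rows; i++)
                {
                    int sum = 0;
                    for (int k = 0; k < inner; k++)
                    {
                        sum += a[i, k] * column[k];
                    }
                    result[i, j] = sum;      // each j writes only its own column
                }

                return column;               // hand the buffer to the next iteration
            },
            column => { });                  // nothing to clean up

        return result;
    }
}
```

Whether this is actually faster than the plain parallel loop depends on the matrix sizes and the hardware, so it is worth timing both.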
There are algorithms with a lower algorithmic complexity that do work on sub-matrices (Strassen's algorithm, for example), but in my experience it will be fairly tricky to make this actually faster in C# than a cache-optimized triple loop.
Remember to measure the performance of your method. I would also suggest comparing your performance with some well-optimized existing library to get some sense of how performant your implementation is.
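For a quick first measurement, a small timing helper built on Stopwatch could look like this. The helper and its name are hypothetical; it runs the action once as a warm-up so JIT compilation is not counted, then reports the best of several runs. A dedicated benchmarking library would give more reliable numbers, so treat this only as a rough sketch.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Returns the fastest elapsed time in milliseconds over the given number of runs.
    public static double MeasureMilliseconds(Action action, int runs = 5)
    {
        action(); // warm-up run, not timed

        double best = double.MaxValue;
        for (int n = 0; n < runs; n++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            best = Math.Min(best, stopwatch.Elapsed.TotalMilliseconds);
        }
        return best;
    }
}
```

It can be called with any multiplication method wrapped in a lambda, for example TimingSketch.MeasureMilliseconds(() => Multiply(a, b)), where Multiply stands for whichever implementation is being compared.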