I have seen some simple conversions from C code to MIPS assembly code and worked with some basic instructions. But how should we convert the following C code, which has a jagged array
and two-dimensional array
to assembly?
int[][] A;
int[,] B;
A[B[0, 1]][A[1][2]] B[B[j, 1], i]
I have searched on the internet, but unfortunately, I could not find a good source or tutorial. I know how should we work with a one-dimensional array
and find the memory address base on the base address and use offset. But I do not know how to solve this question and write its MIPS code just on paper. I will be grateful for any help.
CodePudding user response:
These aren't C code, this looks like C#.
int[][] A; // C# declaration for jagged multidimensional array
int[,] B; // C# declaration for rectangular multidimensional array
First, let's look at the simpler and more regular rectangular array.
C code for rectangular multidimensional array would be:
int B[5][10]; // C declaration for rectangular array
This fragment request a total of 50 integer elements, structed as 5 rows, where the row size is 10 elements.
Because the logical data structure is two dimensional yet physical memory is only one dimensional, we must map the 2D structure onto 1D. There are two ways of doing this, one is row-major and the other column-major. If we consider the index having the 5 to be row and 10 the column, then we can say that the C language uses row-major ordering.
(That there are two choices in mapping the array is similar to the idea that there are two reasonable orderings for individual bytes in multi-byte (i.e. word) items: big endian and little endian.)
The array is immediate in some sense: there are no pointers or references stored in this structure, only the 50 integer elements.
Access to an element is by address computation:
In B[r][c]
we compute addr = B (r * 10 c) * 4
, where the 10 represents the size of the row, and the 4 is the size of one integer element. The multiplications are what are called scaling. Because each row refers to 10 elements, then row 1 starts 10 positions further down than row 0. Because each element takes 4 bytes, and each of the 4 bytes takes a memory address, then column 1 starts 4 bytes further down than column 0. In summary, element 0,0
is at B (0*10 0)*4
, which is the same address value as B
, and element 1,2
is at B (1*10 2)*4
, which is B 48
— the 3rd element ( 8) of the 2nd row ( 10*4).
That addr
can be dereferenced to load the r,c
element from the array, or to store an updated value at r,c
.
(If we were to chose column-major ordering, then the formula would be addr = B (c * 5 r) * 4
. Row major ordering is better if the column index varies faster, i.e. if we use for ( r = .. ) for ( c = .. ) { }
, while for column-major we would prefer the inverse, varying the row index faster — the index for an inner loop varies faster than the index for an outer loop. The difference between them has to do with cache utilization (without a cache it wouldn't matter much).)
Jagged arrays are arrays of arrays. Each array involved is one-dimensional, so while there is still a rectangular shape to each individual array, the rectangular shape does not apply to the two dimensional structure of the overall "2D" array.
In C a jagged array is declared as an array of pointers, where the pointers themselves are references to individual arrays of zero or more integer elements (or are null).
int *A[10];
This declares an array (the [] take precedence over the *, so this is taken as int *(A[10])
, the type of each array element is pointer to int.
The total storage size of this array is 40 bytes, 10 * size of pointer, 4, on MIPS 32-bit system. Each element of the array is a pointer type. There are no integers reserved by this array declaration! There are positions for 10 pointers to integer (arrays), and each of 10 for each position must be further specified, by an initializer or by allocation. Because the elements of this array are pointers to integer, each such pointer can be null or refer to storage of a different number of integers. If you wanted to know how many integers each one (row) had/has, you will have to store that information separately, because there is no built in mechanism to store that with this declaration.
In summary, jagged arrays in C have the following issues:
- Only storage for the pointer array (array of pointer) is allocated by this declaration.
- The storage (the pointer elements) will be initialized to zero, for a global declaration, and also when using
calloc
, but usingmalloc
or as a local variable declaration, those pointers will be uninitialized (i.e. garbage). - Even when we populate the individual pointers with storage, there is no inherent recording of the length of the storage (the count of how many integer) are referred to by each element position. The array can be jagged, but you would have to remember that shape in some other way (i.e. in other data structure, like a simple array of integer of the same element count as the pointer array.)
Accessing an element of a jagged array means obtaining the pointer element from the first array, and using that to compute the address of an integer element. Both of these operations involve one-dimensional arrays, so in at least that sense, is conceptually simpler: there is no issue of row vs. column major ordering.
An element's address, A[r][c]
is computed as follows:
addr = (*(A r * sizeof(pointer))) c * sizeof(int)
where the first * is indirection that fetches the pointer at index r
. This addr
can be used the same as in the description after addr
for rectangular array above.
The MIPS implementation of either of these is then fairly straightforward. For global storage, reserve the appropriate amount. In code, compute element addresses and use lw
or sw
once you've computed an addr
.