Given a string that describes a hyper-matrix such as
"[[[1,2],[4,5],[6,7]]" which is a 3,2 matrix
or
"[[1,2,3],[4,5,6]]" which is a 2,3 matrix
or
"[ [[1,2],[3,4],[4,6]], [[7,8],[9,10],[11,12]] ]" which is a 2,3,2 matrix
etc
is there a nice way to find the dimensions of the matrix? eg I'm after an algorithm that would report 2,3,2 for the third matrix. I can guarantee that each dimension has non-varying elements, ie there are no missing numbers. One can assume that the string can be tokenized into integers, commas, and left and right square brackets. I'm more interested in what approach one might use, I assume it is likely to be recursive. I toyed with the idea of building a tree but I'm not sure that helps in the end. Obviously, I can do this with python very easily using numpy and looking at shape, but I'm after suggestions for possible algorithms.
CodePudding user response:
Here is a C version:
void printMatrixDimensions(const char *inString) {
int depth = 0;
int dimensions = 0;
int *counts = NULL;
for (const char *inString_ptr = inString; inString_ptr[0] != '\0'; inString_ptr ) {
char c = inString_ptr[0];
char upOrDown = 0;
if (c==' ' || c=='\t' || c=='\r' || c=='\n') continue; // skip whitespace
if (c == '[') { upOrDown = 1; } // determine if we're going down in depth
else if (c == ']') { upOrDown = -1; } // or coming up
depth = upOrDown;
if (upOrDown == 1) { // if we're going deeper,
if (depth > dimensions) { // expand our depths stack if needed
dimensions = depth;
counts = realloc(counts, sizeof(int) * dimensions);
}
counts[depth - 1] = 0; // initialize the new depth with a count of 0
}
if (!(c=='[' || c==']')) { // if a character other than []
if (c==',') { // if a comma, then we have at least 2 members
if (counts[depth - 1] == 0) counts[depth - 1] = 2;
else counts[depth - 1] = 1;
} else { // if a non-comma char is there, that counts for a member
if (counts[depth - 1] == 0) counts[depth - 1] = 1;
}
}
}
printf("Matrix dimensions:\n");
for (int i = 0; i < dimensions; i ) {
printf("%i ", counts[i]);
}
printf("\n");
if (counts != NULL) free(counts);
}
CodePudding user response:
- Iterate through the string and maintain a variable called depth, initialised to 0, incremented by 1 at every '[' and decremented by 1 at every ']'.
- While iterating through the string, count the number of commas at every depth. Keep a list or dict of counters so that you have one counter per depth.
- Stop counting the commas at a given depth whenever variable depth becomes smaller than that depth again for the first time.
def get_dims(mat):
depth = 0
capped_depth = len(mat)
dims = {}
for c in mat:
if c == '[':
depth = 1
elif c == ']':
if depth <= capped_depth:
dims[depth] = dims.get(depth, 0) 1
depth -= 1
capped_depth = min(depth, capped_depth)
elif c == ',' and depth <= capped_depth:
dims[depth] = dims.get(depth, 0) 1
return [v for k,v in sorted(dims.items())]
get_dims("[ [[1,2],[3,4],[4,6]], [[7,8],[9,10],[11,12]] ]")
# [2, 3, 2]
get_dims("[[1,2,3],[4,5,6]]")
# [2, 3]
get_dims("[[1,2],[4,5],[6,7]]")
# [3, 2]
get_dims('[[[]]]')
# [1, 1, 1]
get_dims('[[[,]]]')
# [1, 1, 2]
Note that this code assumes that the string depicts a well-formed matrix. But you can easily modify the code to verify that the string is well-formed.
- The brackets are correctly nested if and only if
depth
is always positive during the for-loop, and becomes exactly 0 after the for-loop. - The code that I wrote uses the length of the first list at every depth to compute the corresponding dimension. The matrix will be well-formed if all the lists at every given depth have the same length.