Home > Net >  Algorithm to determine dimensions of matrix given a string rep
Algorithm to determine dimensions of matrix given a string rep

Time:10-13

Given a string that describes a hyper-matrix such as

"[[[1,2],[4,5],[6,7]]"  which is a 3,2 matrix

or

 "[[1,2,3],[4,5,6]]"    which is a 2,3 matrix

or

 "[ [[1,2],[3,4],[4,6]], [[7,8],[9,10],[11,12]] ]" which is a 2,3,2 matrix

etc

is there a nice way to find the dimensions of the matrix? eg I'm after an algorithm that would report 2,3,2 for the third matrix. I can guarantee that each dimension has non-varying elements, ie there are no missing numbers. One can assume that the string can be tokenized into integers, commas, and left and right square brackets. I'm more interested in what approach one might use, I assume it is likely to be recursive. I toyed with the idea of building a tree but I'm not sure that helps in the end. Obviously, I can do this with python very easily using numpy and looking at shape, but I'm after suggestions for possible algorithms.

CodePudding user response:

Here is a C version:

void printMatrixDimensions(const char *inString) {
    int depth = 0;
    int dimensions = 0;
    int *counts = NULL;
    for (const char *inString_ptr = inString; inString_ptr[0] != '\0'; inString_ptr  ) {
        char c = inString_ptr[0];
        char upOrDown = 0;
        
        if (c==' ' || c=='\t' || c=='\r' || c=='\n') continue; // skip whitespace
        if (c == '[') { upOrDown = 1; } // determine if we're going down in depth
        else if (c == ']') { upOrDown = -1; } // or coming up
        depth  = upOrDown;
        
        if (upOrDown == 1) { // if we're going deeper,
            if (depth > dimensions) { // expand our depths stack if needed
                dimensions = depth;
                counts = realloc(counts, sizeof(int) * dimensions);
            }
            counts[depth - 1] = 0; // initialize the new depth with a count of 0
        }
        
        if (!(c=='[' || c==']')) { // if a character other than []
            if (c==',') { // if a comma, then we have at least 2 members
                if (counts[depth - 1] == 0) counts[depth - 1] = 2;
                else counts[depth - 1]  = 1;
            } else { // if a non-comma char is there, that counts for a member
                if (counts[depth - 1] == 0) counts[depth - 1] = 1;
            }
        }
    }
    
    printf("Matrix dimensions:\n");
    for (int i = 0; i < dimensions; i  ) {
        printf("%i ", counts[i]);
    }
    printf("\n");
    
    
    if (counts != NULL) free(counts);

}

CodePudding user response:

  • Iterate through the string and maintain a variable called depth, initialised to 0, incremented by 1 at every '[' and decremented by 1 at every ']'.
  • While iterating through the string, count the number of commas at every depth. Keep a list or dict of counters so that you have one counter per depth.
  • Stop counting the commas at a given depth whenever variable depth becomes smaller than that depth again for the first time.
def get_dims(mat):
  depth = 0
  capped_depth = len(mat)
  dims = {}
  for c in mat:
    if c == '[':
      depth  = 1
    elif c == ']':
      if depth <= capped_depth:
        dims[depth] = dims.get(depth, 0)   1
      depth -= 1
      capped_depth = min(depth, capped_depth)
    elif c == ',' and depth <= capped_depth:
      dims[depth] = dims.get(depth, 0)   1
  return [v for k,v in sorted(dims.items())]

get_dims("[ [[1,2],[3,4],[4,6]], [[7,8],[9,10],[11,12]] ]")
# [2, 3, 2]

get_dims("[[1,2,3],[4,5,6]]")
# [2, 3]

get_dims("[[1,2],[4,5],[6,7]]")
# [3, 2]

get_dims('[[[]]]')
# [1, 1, 1]

get_dims('[[[,]]]')
# [1, 1, 2]

Note that this code assumes that the string depicts a well-formed matrix. But you can easily modify the code to verify that the string is well-formed.

  • The brackets are correctly nested if and only if depth is always positive during the for-loop, and becomes exactly 0 after the for-loop.
  • The code that I wrote uses the length of the first list at every depth to compute the corresponding dimension. The matrix will be well-formed if all the lists at every given depth have the same length.
  • Related