Replacing characters in an array with numbers

I have read in data from an Excel file and made it into a vector. I then made that vector into a 3D array.

The vector that forms the array, and therefore the array itself, now contains characters, like this:

D <- c('g', 't', NA, 'd')
nPeriods = 3
column.names = c('aaa', 'bbb')
row.names = c('jjj', 'hhh')
threeD.names = c(1:nPeriods)
E = array(c(D), dim=c(2, 2, nPeriods),
          dimnames = list(row.names, column.names, threeD.names))

However, now I want to assign 'g', 't', etc as variables with starting values, e.g.

g = 5
t = 2
d = 7

But the only way I know how to do this is by manually figuring out which element of the array it is and assigning it like that, e.g.

E[1,1,]=5

With a large matrix, it is going to be difficult and annoying to find the corresponding position every time I refer to an element. I know that each non-NA element of the vector is unique, so I wondered: is there a shortcut for referring to each element of the array? (Maybe something in the apply family? But there are so many of them.)

I will also need to refer to these elements later when looping over nPeriods. Yesterday someone showed me I can do this with:

for (i in 2:nPeriods){
  C[1,1,i]=C[1,1,i-1]*2
}

But in real life my matrix might be quite large so I'd rather just be able to refer with d, t, etc.

CodePudding user response：

If you store your replacement values in a named vector, you can then use apply to replace elements by name. apply handles the looping automatically, and even converts the result to a numeric array if all non-NA elements are numeric. Note that the 1:3 in the apply call refers to the three dimensions of your array, not the length of the third dimension, so this should work no matter how large the array is.

values <- c(g = 5,
            t = 2,
            d = 7)

num_array <- apply(E, 1:3, function(x) values[x])

num_array

, , 1

    aaa bbb
jjj   5  NA
hhh   2   7

, , 2

    aaa bbb
jjj   5  NA
hhh   2   7

, , 3

    aaa bbb
jjj   5  NA
hhh   2   7
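
As a possible alternative, the same lookup can be done without apply by indexing the named vector with every cell at once and then restoring the array's shape. This is only a sketch under the question's example data; E and values are rebuilt here so the snippet runs on its own.

```r
# Example data from the question
E <- array(c("g", "t", NA, "d"), dim = c(2, 2, 3),
           dimnames = list(c("jjj", "hhh"), c("aaa", "bbb"), 1:3))
values <- c(g = 5, t = 2, d = 7)

# Index the named vector with all cells in one go; NA cells stay NA.
# unname() drops the lookup names before the shape is restored.
num_array <- array(unname(values[as.vector(E)]),
                   dim = dim(E), dimnames = dimnames(E))

num_array["jjj", "aaa", "1"]  # the cell that held 'g'
```

Because the indexing is vectorized, this avoids calling a function once per cell, which may matter for a large array.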

Your second question is unclear, but you can use which to get the elements efficiently. You only need to loop over the slices:

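result <- num_array
# The position of 'g' is the same in every slice, so look it up once
idx <- which(E[, , 1] == 'g', arr.ind = TRUE)
row <- idx[1, 'row']
col <- idx[1, 'col']
for (i in 2:dim(num_array)[3]) {
  # Each period doubles the value from the previous period
  result[row, col, i] <- result[row, col, i - 1] * 2
}

result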

, , 1

    aaa bbb
jjj   5  NA
hhh   2   7

, , 2

    aaa bbb
jjj  10  NA
hhh   2   7

, , 3

    aaa bbb
jjj  20  NA
hhh   2   7
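
To refer to a cell by its label rather than by hand-counted indices, the which call can be wrapped in a small helper. The function name find_pos is hypothetical and only for illustration; the sketch assumes, as the question states, that each non-NA label is unique and sits in the same position in every slice.

```r
# Example data from the question
E <- array(c("g", "t", NA, "d"), dim = c(2, 2, 3),
           dimnames = list(c("jjj", "hhh"), c("aaa", "bbb"), 1:3))

# Hypothetical helper: row/column position of a label in the first slice.
# which() skips NA comparisons, so NA cells are never matched.
find_pos <- function(arr, label) {
  which(arr[, , 1] == label, arr.ind = TRUE)
}

pos <- find_pos(E, "d")
pos[1, "row"]  # row index of 'd'
pos[1, "col"]  # column index of 'd'
```

The returned indices can then be used as in the loop above, for example result[pos[1, "row"], pos[1, "col"], i].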

If you wanted to store an operation for each character (relatively simple operations), you could take advantage of some of R's meta-programming features:

funcs <- c(g = '*', t = '+', d = '-')
modifiers <- c(g = 2, t = 3, d = 4)

num_array <- apply(E, 1:3, function(x) values[x])

result <- num_array
for (i in 2:dim(num_array)[3]) {
  for (j in names(values)) {
    idx <- which(E[, , 1] == j, arr.ind = TRUE)
    row <- idx[1, 'row']
    col <- idx[1, 'col']
    result[row, col, i] <- do.call(funcs[j], args = list(result[row, col, i-1], modifiers[j]))
  }
}

, , 1

    aaa bbb
jjj   5  NA
hhh   2   7

, , 2

    aaa bbb
jjj  10  NA
hhh   5   3

, , 3

    aaa bbb
jjj  20  NA
hhh   8  -1
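
The reason this works is that arithmetic operators in R are ordinary functions that can be called by name. The sketch below illustrates that in isolation; the operator for t is taken to be "+", which matches the printed output (2, 5, 8), and the helper name step_value is hypothetical.

```r
funcs <- c(g = "*", t = "+", d = "-")
modifiers <- c(g = 2, t = 3, d = 4)

# Operators can be called like any other function
do.call("*", list(5, 2))  # same as 5 * 2
match.fun("+")(2, 3)      # same as 2 + 3

# Hypothetical helper: apply one period's operation for a given label
step_value <- function(label, x) {
  op <- match.fun(funcs[[label]])
  op(x, modifiers[[label]])
}

step_value("d", 7)  # same as 7 - 4
```

Using [[ ]] rather than [ ] to pull out the operator and modifier drops the names, so the result is a plain unnamed number.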

CodePudding user response：

This task is more manageable in a list. You can use the following:

# Example data
D <- c('g', 't', NA, 'd')
nPeriods = 3
column.names = c('aaa', 'bbb')
row.names = c('jjj', 'hhh')
threeD.names = c(1:nPeriods)
E = array(c(D), dim=c(2, 2, nPeriods),
          dimnames = list(row.names, column.names, threeD.names))

# Convert array to list
array_list <- lapply(seq(dim(E)[3]), function(x) E[,,x])

# Re-assign values
array_converted <- lapply(array_list, function(x){
  x <- ifelse(x == "g", 5, x) # your conversion values
  x <- ifelse(x == "t", 2, x)
  x <- ifelse(x == "d", 7, x)
  x <- apply(x, 2, as.numeric) # Ensures values are numeric
  return(x)
})

# For final format as array (if you want)
final_array <- simplify2array(array_converted)
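
One thing to watch with the list approach is that as.numeric drops names, so the row labels may not survive the conversion. A minimal sketch of a variant that keeps them, assuming the same example data and a named lookup vector (lookup is a name introduced here for illustration):

```r
# Example data from the question
E <- array(c("g", "t", NA, "d"), dim = c(2, 2, 3),
           dimnames = list(c("jjj", "hhh"), c("aaa", "bbb"), 1:3))
lookup <- c(g = 5, t = 2, d = 7)  # assumed replacement values

# Convert each slice with a single lookup, keeping its row/column names
slices <- lapply(seq_len(dim(E)[3]), function(k) {
  m <- E[, , k]
  matrix(unname(lookup[as.vector(m)]), nrow = nrow(m), dimnames = dimnames(m))
})

# Stack the slices back into a 3D array and restore the original labels
final_array <- simplify2array(slices)
dimnames(final_array) <- dimnames(E)
```

This replaces the chained ifelse calls with one indexing step per slice, so adding a new label only means adding an entry to lookup.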