R output row and column index of dataframe with target values in a vector

Time:02-26

Here I have a dataframe df:

df <- structure(list(col1 = c("f", "h", "k", "p", "d", "n", "o", "i", 
"s", "a"), col2 = c("e", "d", "m", "g", "r", "h", "k", "p", "t", 
"i"), col3 = c("s", "e", "l", "c", "t", "f", "a", "p", "k", "d"
)), class = "data.frame", row.names = c(NA, -10L))
col1 col2 col3
f e s
h d e
k m l
p g c
d r t
n h f
o k a
i p p
s t k
a i d

And a vector vec containing the values that I would like to search in df:

vec <- c("a", "h", "ah")

I would like to output the row and column index of every element of vec that occurs in df. The first column should hold the element, the second its row index and the third its column index, as in the following (the order of the rows is not important). Values that are not present in df should still be listed, with NA for both indices:

name row col
h    2   1
h    6   2
a    7   3
a    10  1
ah   NA  NA

I tried doing it with which or arrayInd, but it misses the final record of "a" (note the last row of the first column in df). It also throws an error for values (e.g. "ah") that are not present in df.

arrayInd(which(df == vec), dim(df), useNames = T)

which(df == vec, arr.ind = T)

     row col
[1,]   2   1
[2,]   6   2
[3,]   7   3

I also need the output to include a column holding the matched value ("a" or "h"). Note that I do not know the number of elements in vec (it can be 2, 10 or more), and I do not know how many columns df has, but the column names follow the pattern col{n}.

Any ideas will be appreciated, thanks!

CodePudding user response:

Here's a base R solution.

You can use arr.ind = TRUE inside which. This has to be done for each element of vec separately, so wrap it in lapply and rbind the results:

do.call(rbind, lapply(vec, function(x) {
  y <- which(df == x, arr.ind = TRUE)
  if(length(y) == 0) y <- cbind(row = NA, col = NA)
  data.frame(name = x, y)}))
#>   name row col
#> 1    a  10   1
#> 2    a   7   3
#> 3    h   2   1
#> 4    h   6   2
#> 5   ah  NA  NA
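
As for why the original attempt misses matches: == does not test membership. It compares df with vec position by position, recycling vec down the columns, so a cell only counts as a match if it happens to line up with the same value in the recycled vector. A minimal sketch of that behaviour, assuming the attempt was run with a two-element vec (that would reproduce the three rows shown in the question, but it is only a guess; vec2, recycled and only_a are names made up for this example):

```r
df <- data.frame(
  col1 = c("f", "h", "k", "p", "d", "n", "o", "i", "s", "a"),
  col2 = c("e", "d", "m", "g", "r", "h", "k", "p", "t", "i"),
  col3 = c("s", "e", "l", "c", "t", "f", "a", "p", "k", "d"),
  stringsAsFactors = FALSE
)

# Assumed two-element vector; not necessarily what the asker actually ran.
vec2 <- c("a", "h")

# vec2 is recycled to a, h, a, h, ... in column-major order, so the "a" in
# row 10 of col1 ends up being compared with "h" and is not reported.
recycled <- which(df == vec2, arr.ind = TRUE)

# Comparing against a single value at a time finds every occurrence of it.
only_a <- which(df == "a", arr.ind = TRUE)

print(recycled)
print(only_a)
```

That is the reason the solution above compares df against one element of vec per call.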

CodePudding user response:

library(purrr)

setNames(vec, vec) |>
  map(~{df == .x}) |>
  map(~which(.x, arr.ind = T)) |>
  map(~if(nrow(.x) == 0) rbind(.x, NA) else .x) |>
  map_dfr(data.frame, .id = "name")

#>   name row col
#> 1    a  10   1
#> 2    a   7   3
#> 3    h   2   1
#> 4    h   6   2
#> 5   ah  NA  NA
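
If you would rather not loop over vec at all, one alternative is to test membership on the whole data frame at once with %in% and then append the values that were never found. This is only a sketch in base R, not a drop-in replacement for either answer; the names m, hits, found, not_found and out are made up for the example:

```r
df <- data.frame(
  col1 = c("f", "h", "k", "p", "d", "n", "o", "i", "s", "a"),
  col2 = c("e", "d", "m", "g", "r", "h", "k", "p", "t", "i"),
  col3 = c("s", "e", "l", "c", "t", "f", "a", "p", "k", "d"),
  stringsAsFactors = FALSE
)
vec <- c("a", "h", "ah")

# Work on a character matrix so the number of columns does not matter.
m <- as.matrix(df)

# %in% drops dimensions, so restore them before asking for array indices.
hits <- which(array(m %in% vec, dim(m)), arr.ind = TRUE)

# Indexing m with the two-column index matrix returns the matched values.
found <- data.frame(
  name = m[hits],
  row = hits[, "row"],
  col = hits[, "col"],
  stringsAsFactors = FALSE
)

# Values of vec that never occur in df get one row of NAs each.
not_found <- setdiff(vec, found$name)
out <- rbind(
  found,
  data.frame(
    name = not_found,
    row = rep(NA_integer_, length(not_found)),
    col = rep(NA_integer_, length(not_found)),
    stringsAsFactors = FALSE
  )
)

print(out)
```

The rows come out in the order the matches occur in df (column by column) rather than grouped by name, which the question says is acceptable.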

