Home > Blockchain >  Selecting data frame value basing on the vectors
Selecting data frame value basing on the vectors

Time:06-15

How can I select values of a data frame given two vectors storing the row and column? For mtcars

                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D         24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230          22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
...

I tried

data.frame(row = c(1, 3, 5),
                 col = c('mpg', 'cyl', 'disp')) |>
 mutate(value = mtcars[row, col])

and expected to get

  row  col value
1   1  mpg  21.0
2   3  cyl     4
3   5 disp 360.0

instead of

  row  col value.mpg value.cyl value.disp
1   1  mpg      21.0         6        160
2   3  cyl      22.8         4        108
3   5 disp      18.7         8        360

CodePudding user response:

matrix indexing to the rescue. See ?Extract:

A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector.

...

Indexing via a character matrix with one column per dimensions is also supported if the array has dimension names.

So:

df <- data.frame(row = c(1, 3, 5),
                 col = c('mpg', 'cyl', 'disp'))
## character matrix indexing
cbind(df, value = mtcars[cbind(rownames(mtcars)[df$row], df$col)])
## numeric matrix indexing
cbind(df, value = mtcars[cbind(df$row, match(df$col, colnames(mtcars)))])

## both resulting in:

##  row  col value
##1   1  mpg    21
##2   3  cyl     4
##3   5 disp   360

As @Onyambu notes, you can also use transform/base R piping if you like:

df |> transform(value = mtcars[cbind(rownames(mtcars)[row], col)])

...or the dplyr piping with mutate:

df %>% mutate(value = mtcars[cbind(rownames(mtcars)[row], col)])

CodePudding user response:

apply family in base R

One way to do it is to use sapply to go through all rows in your query data frame, and use the row and col info of each row to index the mtcars dataset.

EDIT: Credit to @Darren Tsai for providing the mapply solution.

library(dplyr)

data.frame(row = c(1, 3, 5),
                 col = c('mpg', 'cyl', 'disp')) %>% 
  mutate(value = sapply(1:nrow(.), \(x) mtcars[.[x, 1], .[x, 2]]))
  # or mapply
  mutate(value = mapply(\(x, y) mtcars[x, y], row, col))

rowwise() from dplyr

Another way of doing it is to use rowwise():

library(dplyr)

data.frame(row = c(1, 3, 5),
           col = c('mpg', 'cyl', 'disp')) %>% 
  rowwise() %>% 
  mutate(value = mtcars[row, col])

Output

    row col   value
1     1 mpg      21
2     3 cyl       4
3     5 disp    360

CodePudding user response:

Here is another tidyverse option:

library(tidyverse)

data.frame(row = c(1, 3, 5),
           col = c('mpg', 'cyl', 'disp')) %>%
  mutate(value = map2_dbl(row, col, ~ mtcars[.x, .y]))

Output

  row  col value
1   1  mpg    21
2   3  cyl     4
3   5 disp   360
  • Related