Home > Enterprise >  How to select data from column named by other column in R
How to select data from column named by other column in R

Time:06-28

I have a dataframe in R where one column (say column A) names other columns. For example,

colA     colB     colC     colD
'colB'   1        2        3
'colD'   4        5        6

I want to create a new column where I extract the value in the column named by column A. For example, in the above table, I would like to create a new column whose value is 1 in the first row and 6 in the second row. What is the best way to automate the construction of such a column? I would especially appreciate if there is some way to do this within the tidyverse, but would be open to other solutions if those are considered better.

CodePudding user response:

We can use match to get the column index from 'colA' with the column names of the dataset, create a matrix with cbinding the row index and extract based on row/column index

df1$new <- df1[-1][cbind(seq_len(nrow(df1)), match(df1$colA, names(df1)[-1]))]

-output

> df1
  colA colB colC colD new
1 colB    1    2    3   1
2 colD    4    5    6   6

This should be more efficient as we are using indexing in a vectorized way


Using dplyr, we can rowwise and extract the column value with [[

library(dplyr)
df1 %>% 
 rowwise %>% 
 mutate(new = cur_data()[[colA]]) %>% 
 ungroup
# A tibble: 2 × 5
  colA   colB  colC  colD   new
  <chr> <int> <int> <int> <int>
1 colB      1     2     3     1
2 colD      4     5     6     6

Or create the row/column indexing as in the base R solution

df1 %>%
   mutate(new = .[-1][cbind(row_number(), match(colA, names(df1)[-1]))])
  colA colB colC colD new
1 colB    1    2    3   1
2 colD    4    5    6   6

data

df1 <- structure(list(colA = c("colB", "colD"), colB = c(1L, 4L), colC = c(2L, 
5L), colD = c(3L, 6L)), class = "data.frame", row.names = c(NA, 
-2L))

CodePudding user response:

You could also do:

rownames(df1) <- seq(nrow(df1))
transform(df1, new = df1[-1][cbind(rownames(df1), colA)])

  colA colB colC colD new
1 colB    1    2    3   1
2 colD    4    5    6   6

Also:

df1 %>%left_join(pivot_longer(., -colA) %>%
     filter(name == colA) %>%
     select(-name))

  colA colB colC colD value
1 colB    1    2    3     1
2 colD    4    5    6     6

CodePudding user response:

A simple one-line solution from Base R:

df1$value <- df1[outer(df1$colA, colnames(df1), FUN = "==")]
df1
#>   colA colB colC colD value
#> 1 colB    1    2    3     1
#> 2 colD    4    5    6     6

The idea is to use outer to create an indexing matrix that filters the values you need.

  • Related