Home > Back-end >  Select column based on indicator in another column
Select column based on indicator in another column

Time:01-31

I have a dataframe in R where I have a column with a numeric indicator, and I want to create a new column picking from other columns in the dataframe based on that indicator. Probably easiest if I explain via an example...

data <- data.frame(
    id = 1:6,
    selction = c(1,4,2,3,2,8),
    column1 = 1:6,
    column2 = 11:16,
    column3 = 21:26,
    column4 = 31:36,
    column5 = 41:46,
    column6 = 51:56,
    column7 = 61:66,
    column8 = 71:76
)
> data
  id selction column1 column2 column3 column4 column5 column6 column7 column8
1  1        1       1      11      21      31      41      51      61      71
2  2        4       2      12      22      32      42      52      62      72
3  3        2       3      13      23      33      43      53      63      73
4  4        3       4      14      24      34      44      54      64      74
5  5        2       5      15      25      35      45      55      65      75
6  6        8       6      16      26      36      46      56      66      76

I would like to create a new column chosing from the columnx columns based on the number in the indicator, i.e. end up with

> data
  id selction column1 column2 column3 column4 column5 column6 column7 column8 newcolumn
1  1        1       1      11      21      31      41      51      61      71         1
2  2        4       2      12      22      32      42      52      62      72        32
3  3        2       3      13      23      33      43      53      63      73        13
4  4        3       4      14      24      34      44      54      64      74        24
5  5        2       5      15      25      35      45      55      65      75        15
6  6        8       6      16      26      36      46      56      66      76        76

I am trying to avoid doing it with a long ifelse as in reality I could have a lot more than 8 columns. Any help would be appreciated

CodePudding user response:

The "[" function will accept a multi-column numeric matrix or array with the same number of columns as there are dimensions in the data object. For dataframes this is always a two column matrix. Subset with row and column index:

data[, -c(1:2) ][ cbind(seq(nrow(data)), data$selction) ]
# [1]  1 32 13 24 15 76

Here cbind gives me rows, 1 to nrows, and data$selection as the column index, but before subsetting, I am excluding id and selection columns - -c(1:2).

Could also have added 2 to make a proper offset without the exclusion (Thanks @IRTFM).

data[ cbind(seq(nrow(data)), data$selction   2) ]
# [1]  1 32 13 24 15 76

CodePudding user response:

With diag:

library(dplyr)
data %>% 
  mutate(newcolumn = diag(as.matrix(cur_data()[-c(1, 2)][selection])))

 id selection column1 column2 column3 column4 column5 column6 column7 column8 newcolumn
1  1         1       1      11      21      31      41      51      61      71         1
2  2         4       2      12      22      32      42      52      62      72        32
3  3         2       3      13      23      33      43      53      63      73        13
4  4         3       4      14      24      34      44      54      64      74        24
5  5         2       5      15      25      35      45      55      65      75        15
6  6         8       6      16      26      36      46      56      66      76        76

Or with map:

library(purrr)
data %>% 
  mutate(newcolumn = map(seq_along(selection), 
                         ~ select(cur_data(), starts_with("column"))[[selection[.x]]][.x]))

CodePudding user response:

Create a function that gets the data and the column which has the selection. Remember to add the 2 to skip the first to columns Id and Selction or make a sub selection first. Then append the result.

data <- data.frame(
    id = 1:6,
    selction = c(1,4,2,3,2,8),
    column1 = 1:6,
    column2 = 11:16,
    column3 = 21:26,
    column4 = 31:36,
    column5 = 41:46,
    column6 = 51:56,
    column7 = 61:66,
    column8 = 71:76
)
data
  id selction column1 column2 column3 column4 column5 column6 column7 column8
1  1        1       1      11      21      31      41      51      61      71
2  2        4       2      12      22      32      42      52      62      72
3  3        2       3      13      23      33      43      53      63      73
4  4        3       4      14      24      34      44      54      64      74
5  5        2       5      15      25      35      45      55      65      75
6  6        8       6      16      26      36      46      56      66      76

selColVal <- function(data,colData){
new_col <- list()
for(i in 1:nrow(data)){
new_col<- append(new_col,data[i,data[i,"selction"] 2])
   }
return(new_col)
}

a=selColVal(data,data["selction"])
data$new_column <- a

data
       id selction column1 column2 column3 column4 column5 column6 column7 column8 new_column
    1  1        1       1      11      21      31      41      51      61      71       1
    2  2        4       2      12      22      32      42      52      62      72       32
    3  3        2       3      13      23      33      43      53      63      73       13
    4  4        3       4      14      24      34      44      54      64      74       24
    5  5        2       5      15      25      35      45      55      65      75       15
    6  6        8       6      16      26      36      46      56      66      76       76
  • Related