I have a dataframe in R where I have a column with a numeric indicator, and I want to create a new column picking from other columns in the dataframe based on that indicator. Probably easiest if I explain via an example...
data <- data.frame(
id = 1:6,
selction = c(1,4,2,3,2,8),
column1 = 1:6,
column2 = 11:16,
column3 = 21:26,
column4 = 31:36,
column5 = 41:46,
column6 = 51:56,
column7 = 61:66,
column8 = 71:76
)
> data
id selction column1 column2 column3 column4 column5 column6 column7 column8
1 1 1 1 11 21 31 41 51 61 71
2 2 4 2 12 22 32 42 52 62 72
3 3 2 3 13 23 33 43 53 63 73
4 4 3 4 14 24 34 44 54 64 74
5 5 2 5 15 25 35 45 55 65 75
6 6 8 6 16 26 36 46 56 66 76
I would like to create a new column chosing from the columnx columns based on the number in the indicator, i.e. end up with
> data
id selction column1 column2 column3 column4 column5 column6 column7 column8 newcolumn
1 1 1 1 11 21 31 41 51 61 71 1
2 2 4 2 12 22 32 42 52 62 72 32
3 3 2 3 13 23 33 43 53 63 73 13
4 4 3 4 14 24 34 44 54 64 74 24
5 5 2 5 15 25 35 45 55 65 75 15
6 6 8 6 16 26 36 46 56 66 76 76
I am trying to avoid doing it with a long ifelse as in reality I could have a lot more than 8 columns. Any help would be appreciated
CodePudding user response:
The "["
function will accept a multi-column numeric matrix or array with the same number of columns as there are dimensions in the data object. For dataframes this is always a two column matrix. Subset with row and column index:
data[, -c(1:2) ][ cbind(seq(nrow(data)), data$selction) ]
# [1] 1 32 13 24 15 76
Here cbind gives me rows, 1 to nrows, and data$selection as the column index, but before subsetting, I am excluding id and selection columns - -c(1:2)
.
Could also have added 2 to make a proper offset without the exclusion (Thanks @IRTFM).
data[ cbind(seq(nrow(data)), data$selction 2) ]
# [1] 1 32 13 24 15 76
CodePudding user response:
With diag
:
library(dplyr)
data %>%
mutate(newcolumn = diag(as.matrix(cur_data()[-c(1, 2)][selection])))
id selection column1 column2 column3 column4 column5 column6 column7 column8 newcolumn
1 1 1 1 11 21 31 41 51 61 71 1
2 2 4 2 12 22 32 42 52 62 72 32
3 3 2 3 13 23 33 43 53 63 73 13
4 4 3 4 14 24 34 44 54 64 74 24
5 5 2 5 15 25 35 45 55 65 75 15
6 6 8 6 16 26 36 46 56 66 76 76
Or with map
:
library(purrr)
data %>%
mutate(newcolumn = map(seq_along(selection),
~ select(cur_data(), starts_with("column"))[[selection[.x]]][.x]))
CodePudding user response:
Create a function that gets the data and the column which has the selection. Remember to add the 2 to skip the first to columns Id and Selction or make a sub selection first. Then append the result.
data <- data.frame(
id = 1:6,
selction = c(1,4,2,3,2,8),
column1 = 1:6,
column2 = 11:16,
column3 = 21:26,
column4 = 31:36,
column5 = 41:46,
column6 = 51:56,
column7 = 61:66,
column8 = 71:76
)
data
id selction column1 column2 column3 column4 column5 column6 column7 column8
1 1 1 1 11 21 31 41 51 61 71
2 2 4 2 12 22 32 42 52 62 72
3 3 2 3 13 23 33 43 53 63 73
4 4 3 4 14 24 34 44 54 64 74
5 5 2 5 15 25 35 45 55 65 75
6 6 8 6 16 26 36 46 56 66 76
selColVal <- function(data,colData){
new_col <- list()
for(i in 1:nrow(data)){
new_col<- append(new_col,data[i,data[i,"selction"] 2])
}
return(new_col)
}
a=selColVal(data,data["selction"])
data$new_column <- a
data
id selction column1 column2 column3 column4 column5 column6 column7 column8 new_column
1 1 1 1 11 21 31 41 51 61 71 1
2 2 4 2 12 22 32 42 52 62 72 32
3 3 2 3 13 23 33 43 53 63 73 13
4 4 3 4 14 24 34 44 54 64 74 24
5 5 2 5 15 25 35 45 55 65 75 15
6 6 8 6 16 26 36 46 56 66 76 76