I am new to R, so I need help with this transformation. I have two tables:
x1 <- c(7, 4, 4, 9, 2, 5, 8)
x2 <- c(5, 2, 8, 9, 1, 3, 2)
x3 <- c(6, 2, 3, 4, 2, 2, 7)
objid <- c(1, 2, 3, 4, 4, 2, 4)
data_1 <- data.frame(objid, x1, x2, x3)
and a second table:
x1_r <- c(1.54,0.23, 1.32, 11.66)
x2_r <- c(1.14,1.23, 9.32, 1.26)
x3_r <- c(1.58,0.23, 7.32, 7.66)
objid <- c(1, 2, 3, 4)
data_2 <- data.frame(objid, x1_r, x2_r, x3_r)
What I am trying to do in R is a CASE by group "objid", something like this:
CASE WHEN [x1] <=[x1_r] THEN 1 ELSE 0
CASE WHEN [x2] <=[x2_r] THEN 1 ELSE 0
CASE WHEN [x3] <=[x3_r] THEN 1 ELSE 0
And generate new columns with results in data_2:
objid | x1 | x2 | x3 | x1_r_fin | x2_r_fin | x3_r_fin |
---|---|---|---|---|---|---|
1 | 7 | 5 | 6 | 0 | 0 | 0 |
2 | 4 | 2 | 2 | 0 | 0 | 0 |
3 | 4 | 8 | 3 | 0 | 0 | 1 |
4 | 9 | 9 | 4 | 1 | 0 | 1 |
4 | 2 | 1 | 2 | 1 | 1 | 1 |
2 | 5 | 3 | 2 | 0 | 0 | 0 |
4 | 8 | 2 | 7 | 1 | 0 | 1 |
With mutate I have applied this method:
df %>% mutate_at(vars(-matches("objid")), list(Dif = ~ . - x1))
This does a simple subtraction and generates new columns with new names in df. I want to do the same with the comparison above but have no clue how. Is there a better, more efficient method? Thanks for your help!
CodePudding user response:
Here is an approach using dplyr case_when:
library(tidyverse)
data_1 %>%
  inner_join(data_2, by = 'objid') %>% # join data_1 and data_2 by objid
  mutate(x1_r_fin = case_when(x1 <= x1_r ~ 1,
                              TRUE ~ 0)) %>%
  mutate(x2_r_fin = case_when(x2 <= x2_r ~ 1,
                              TRUE ~ 0)) %>%
  mutate(x3_r_fin = case_when(x3 <= x3_r ~ 1,
                              TRUE ~ 0)) %>%
  select(-c(x1_r, x2_r, x3_r)) # drop the reference columns
For univariate conditions, ifelse() is also quite readable:
merge(data_1, data_2, by='objid') -> data_1
data_1$x1_r_fin <- ifelse(data_1$x1 <= data_1$x1_r, 1, 0)
data_1$x2_r_fin <- ifelse(data_1$x2 <= data_1$x2_r, 1, 0)
data_1$x3_r_fin <- ifelse(data_1$x3 <= data_1$x3_r, 1, 0)
data_1$x1_r <- NULL
data_1$x2_r <- NULL
data_1$x3_r <- NULL
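The three repeated ifelse() lines above can also be written as a loop over the column names. This is only a sketch: it assumes the naming pattern from the question (each column `x` has a reference column `x_r`), and the names `merged`, `cols` and `ref` are made up for illustration.

```r
# Sample data copied from the question
data_1 <- data.frame(
  objid = c(1, 2, 3, 4, 4, 2, 4),
  x1 = c(7, 4, 4, 9, 2, 5, 8),
  x2 = c(5, 2, 8, 9, 1, 3, 2),
  x3 = c(6, 2, 3, 4, 2, 2, 7)
)
data_2 <- data.frame(
  objid = c(1, 2, 3, 4),
  x1_r = c(1.54, 0.23, 1.32, 11.66),
  x2_r = c(1.14, 1.23, 9.32, 1.26),
  x3_r = c(1.58, 0.23, 7.32, 7.66)
)

# merge() sorts the result by objid, so the row order differs from data_1
merged <- merge(data_1, data_2, by = "objid")

cols <- c("x1", "x2", "x3")  # hypothetical helper: columns to compare

for (col in cols) {
  ref <- paste0(col, "_r")  # assumed naming pattern, e.g. "x1_r"
  # TRUE/FALSE from the comparison becomes 1/0
  merged[[paste0(ref, "_fin")]] <- as.integer(merged[[col]] <= merged[[ref]])
  merged[[ref]] <- NULL     # drop the reference column
}

print(merged)
```

Adding a fourth column then only means extending `cols`, provided the matching `_r` column exists in data_2.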
You can also use ifelse with mutate() a la:
... %>% mutate(x1_r_fin = ifelse(x1 <= x1_r, 1, 0)) %>% ...
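Since the question started from mutate_at, a closer equivalent might be across(), which handles all three columns in one call. The sketch below assumes dplyr 1.0 or later (for across() and cur_column()) and the `_r` naming pattern from the question. `result` is an illustrative name.

```r
library(dplyr)

# Sample data copied from the question
data_1 <- data.frame(
  objid = c(1, 2, 3, 4, 4, 2, 4),
  x1 = c(7, 4, 4, 9, 2, 5, 8),
  x2 = c(5, 2, 8, 9, 1, 3, 2),
  x3 = c(6, 2, 3, 4, 2, 2, 7)
)
data_2 <- data.frame(
  objid = c(1, 2, 3, 4),
  x1_r = c(1.54, 0.23, 1.32, 11.66),
  x2_r = c(1.14, 1.23, 9.32, 1.26),
  x3_r = c(1.58, 0.23, 7.32, 7.66)
)

result <- data_1 %>%
  left_join(data_2, by = "objid") %>%  # left_join keeps the row order of data_1
  mutate(across(
    c(x1, x2, x3),
    # cur_column() is the name of the column being processed, so
    # get() looks up its partner column, e.g. "x1_r" for "x1"
    ~ as.integer(.x <= get(paste0(cur_column(), "_r"))),
    .names = "{.col}_r_fin"            # x1 -> x1_r_fin, and so on
  )) %>%
  select(-ends_with("_r"))             # drop x1_r, x2_r, x3_r

print(result)
```

With the sample data, the third row (objid 3) gets x2_r_fin = 1 because 8 <= 9.32, which differs from the expected table in the question. It may be worth checking which of the two is intended.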
CodePudding user response:
You can use the ifelse function from base R. The two datasets have different numbers of rows, which is why I merged them first.
new_data = merge(data_1, data_2, all = TRUE)
Then, using the ifelse function, you can create new variables:
new_data$new_variable1 = ifelse(new_data$x1 <= new_data$x1_r, 1, 0)
You can also return a string from ifelse (note that the 0 is then coerced to the character "0"):
new_data$new_variable2 = ifelse(new_data$x1 <= new_data$x1_r, "group2", 0)
Then, if you want to drop the reference columns (columns 5 to 7) from new_data:
library(dplyr) # needed for %>% and select()
new_data = new_data %>% select(-5, -6, -7)
Or, as @NovaEthos mentioned in the comments, you can use the case_when function:
new_data$new_variable1 = case_when(new_data$x1 <= new_data$x1_r ~ "group 1")
However, rows that do not meet the condition become NA, so you need to deal with those NAs afterwards.
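One way to avoid the NA cleanup is to give case_when a catch-all branch. Here is a small sketch with made-up vectors `x` and `ref` and a made-up label "other":

```r
library(dplyr)

x   <- c(7, 4, 2)          # illustrative values
ref <- c(1.54, 5, 11.66)   # illustrative reference values

# No catch-all branch: elements that match no condition become NA
no_fallback <- case_when(x <= ref ~ "group 1")

# A final TRUE ~ ... branch catches everything that is left
with_fallback <- case_when(
  x <= ref ~ "group 1",
  TRUE     ~ "other"       # hypothetical label for the remaining rows
)

print(no_fallback)
print(with_fallback)
```

Here only the first element fails the condition, so it is NA in `no_fallback` and "other" in `with_fallback`. Newer dplyr releases (1.1.0 and later, as far as I know) also accept `.default = "other"` in place of the `TRUE ~` branch.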