I want to subtract from two different dataframes. I have a data frame with information in it, and another that has a "reference point".
I have one data frame, DF1, that has this information.
Name | Key | BCD12A | ACA1 | DEX2 |
---|---|---|---|---|
X_1_group1 | test1234 | 10 | 10 | 8 |
X_2_group1 | test4553 | 8 | 7 | 4 |
X_2_group2 | test3341 | 5 | 5 | 5 |
X_2_group1 | test2142 | 5 | 6 | 8 |
X_1_group2 | test4722 | 6 | 7 | 4 |
Using DF1, I want to subtract the values in a specific row to the row of DF2 (below), but based on a numerical match from the first column.
For example, based on DF2 below, I have a string in DF2$Name as X_1_group1. In DF1, the first row has that same string in the first column. Since the name matches, I will subtract the values in that row of DF1 from that row in DF2. Note that there may be more than one string in DF1 with the same name, so I want to be able to match the names that way.
Name | Key | BCD12A | ACA1 | DEX2 |
---|---|---|---|---|
X_1_group1 | X_1_group1_A | 2 | 2 | 1 |
X_2_group1 | X_1_group1_B | 1 | 3 | 1 |
X_1_group2 | X_1_group1_C | 2 | 2 | 3 |
X_2_group2 | X_1_group1_D | 2 | 1 | 2 |
DF3 (Which is the goal) - The second column, through different will stay the same that matches DF1.
Name | Key | BCD12A | ACA1 | DEX2 |
---|---|---|---|---|
X_1_group1 | test1234 | 8 | 8 | 7 |
X_2_group1 | test4553 | 7 | 4 | 3 |
X_2_group2 | test3341 | 3 | 4 | 3 |
X_2_group1 | test2142 | 4 | 3 | 7 |
X_1_group2 | test4722 | 4 | 5 | 1 |
Any help on name matching with two data frames for a subtraction would be much appreciated. I also want to generalize this that there will be a lot more columns than in this example, so anything that would consider that portion would help a lot too! Thanks for any help!
# DF1
Name <- c("X_1_group1", "X_2_group1", "X_2_group2", "X_2_group1", "X1_group2")
Key <- c("test1234", "test4553", "test3341", "test2142", "test4722")
BCD12A <- c(10, 8, 5, 5, 6)
ACA1 <- c(10, 7, 5, 6, 7)
DEX2 <- c(8, 4, 5, 8, 4)
DF1 <- data.frame(Name, Key, BCD12A, ACA1, DEX2)
# DF2
Name <- c("X_1_group1", "X_2_group1", "X_1_group2", "X_2_group2")
Key <- c("X_1_group1_A", "X_1_group1_B", "X_1_group1_C", "X_1_group1_D")
BCD12A <- c(2, 1, 2, 2)
ACA1 <- c(2, 3, 2, 1)
DEX2 <- c(1, 1, 3, 2)
DF2 <- data.frame(Name, Key, BCD12A, ACA1, DEX2)
CodePudding user response:
We may use a join by 'Key' and subtract the corresponding numeric columns. With powerjoin
, we have the option to get the difference (-
) where there are conflict
ing names (common names)
library(powerjoin)
library(dplyr)
DF1 %>%
power_left_join(DF2 %>% select(-Key), by = "Name", conflict = `-`)
-output
Name Key BCD12A ACA1 DEX2
1 X_1_group1 test1234 8 8 7
2 X_2_group1 test4553 7 4 3
3 X_2_group2 test3341 3 4 3
4 X_2_group1 test2142 4 3 7
5 X_1_group2 test4722 4 5 1
Or using data.table
join
library(data.table)
nm1 <- names(DF1)[3:5]
nm2 <- paste0("i.", nm1)
setDT(DF1)[DF2, (nm1) := Map(`-`, mget(nm1), mget(nm2)), on = .(Name)]