I have a dataframe that looks like this:
df1 <- data.frame(
Group = c('A', 'B', 'Total: AB', 'C', 'D', 'Total: CD'),
Value1 = c(12, 88, 100, 76, 23, 99),
Value2 = c(10, 82, 92, 71, 21, 92)
)
And another that looks like this:
df2 <- data.frame(
Group = c('A', 'B', 'Total: AB', 'C', 'D', 'Total: CD'),
Value1 = c(435, 568, 1003, 709, 183, 727),
Value2 = c(180, 370, 550, 199, 283, 482)
)
I want to replace the Total
rows in df1
with their values in df2
for a result like this:
df_desired <- data.frame(
Group = c('A', 'B', 'Total: AB', 'C', 'D', 'Total: CD'),
Value1 = c(12, 88, 1003, 76, 23, 727),
Value2 = c(10, 82, 550, 71, 21, 482)
)
What's the best way to do this while preserving row order using R, ideally dplyr
?
CodePudding user response:
If the datasets are in order and have the same dimensions, find the locations of 'Total' substring from the 'Group' column in 'df1' and use that as row index to replace the values other than the first column in 'df1' with the corresponding values from 'df2'
df1 <- type.convert(df1, as.is = TRUE)
df2 <- type.convert(df2, as.is = TRUE)
i1 <- grep("Total", df1$Group)
df1[i1, -1] <- df2[i1, -1]
-checking
> identical(df1, df_desired)
[1] TRUE
Or using dplyr
library(dplyr)
library(stringr)
df3 <- df1 %>%
mutate(across(starts_with("Value"), ~
case_when(str_detect(Group, 'Total') ~ df2[[cur_column()]],
TRUE ~ .x)))
-output
df3
Group Value1 Value2
1 A 12 10
2 B 88 82
3 Total: AB 1003 550
4 C 76 71
5 D 23 21
6 Total: CD 727 482