R mapply function row-wise for matching columns of two dataframes-CodePudding

I have two dataframes with the same column structure (just for simplification, as my dfs hold 40k observations).

df1 <- data.frame(name = c("mark", "peter", "lily"),
                  column1= c(1,2,3), 
                  column2 = c(4,5,6))
df2 <- data.frame(name = c("mark", "liam", "peter"), 
                  column1 = c(7,8,9), 
                  column2 = c(1,2,3))

What do you think would be the most elegant way to apply a function row-wise to matching columns of two dataframes but only their matching rownames?

In this example: (df1$column1 df1$column2)/df2$column2 should only be applied to the respective rows "mark" and "peter" in both dfs.

CodePudding user response：

Maybe something like this...

library(dplyr)

df_combined<-df1%>%
  left_join(df2, by="name") %>%
  mutate(combined=(column1.x   column2.x)/column2.y) %>%
  filter(!is.na(combined)) %>%
  select(name,combined)

df1 %>% left_join(df_combined)
#>    name column1 column2 combined
#> 1  mark       1       4 5.000000
#> 2 peter       2       5 2.333333
#> 3  lily       3       6       NA

df2 %>% left_join(df_combined)
#>    name column1 column2 combined
#> 1  mark       7       1 5.000000
#> 2  liam       8       2       NA
#> 3 peter       9       3 2.333333

^{Created on 2022-01-02 by the reprex package (v2.0.1)}

CodePudding user response：

A base R solution. First merge the data frames, then select the resulting columns with grep.

df_new <- merge(df1, df2, "name")

df_new
   name column1.x column2.x column1.y column2.y
1  mark         1         4         7         1
2 peter         2         5         9         3

df_new$calc <- apply(df_new[,-1], 1, function(a) 
  sum(a[grep("\\.x",names(a))]) / a[grep("2\\.y",names(a))])

df_new
   name column1.x column2.x column1.y column2.y     calc
1  mark         1         4         7         1 5.000000
2 peter         2         5         9         3 2.333333