Based on the code and data below, how can I print the number of rows (count) with matching values in two dataframes? I would like to print a message like this:
The number of matching records between [insert dataframe 1 name] and [insert dataframe 2 name] are X based on [insert matching column name].
I know I can look at the display in the console to do this, but I guess printing the above message might also be a good idea, especially when big datasets are involved. For this I might have to write a function, and my function-writing skills are not so polished at the moment.
Code and data:
library(tidyverse)
# Dummy data
df1 = data.frame(v1 = c(1,2,3,4,5,6,7,8),
                 v2 = c("A","E","C","B","B","C","A","E"))
df2 = data.frame(v2 = c("D","E","A","C","D","B"),
                 v3 = c("d","e","a","c","d","b"))
# Match values
df_new = df1 %>%
  mutate(v2 = as.character(v2)) %>%
  left_join(df2)
# Write code to print the number of matching records, stuck!!!
# The number of matching records between [insert dataframe 1 name] and [insert dataframe 2 name] are X based on [insert matching column name].
Answer:
I like to use the tidylog package for this.
This handy package wraps most dplyr functions, including the *_join family, and prints a summary for several of them (also filter, distinct, mutate, etc.). However, because it is a wrapper, it also hides the help and (in RStudio) autocomplete, which is why I seldom load the package with library(tidylog). Instead, I call its functions with the tidylog:: prefix:
library(dplyr)
df_new <- df1 |>
  mutate(v2 = as.character(v2)) |>
  tidylog::left_join(df2)
Output:
Joining, by = "v2"
left_join: added one column (v3)
           > rows only in x   0
           > rows only in y  (2)
           > matched rows     8
           >                 ===
           > rows total       8
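If you want the exact sentence from the question rather than tidylog's summary, a small helper can build it. The following is only a minimal sketch: print_match_count is a hypothetical name (it is not part of dplyr or tidylog), and it assumes that "matching records" means rows of the first data frame that have at least one match in the second, which is what dplyr::semi_join() returns.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr or tidylog).
# Assumption: a "matching record" is a row of `x` with at least one
# match in `y` on the column(s) named in `by`.
print_match_count <- function(x, y, by) {
  # Capture the data frame names as the caller wrote them
  x_name <- deparse(substitute(x))
  y_name <- deparse(substitute(y))

  # semi_join() keeps each row of x at most once, so duplicate keys
  # in y do not inflate the count
  n_matched <- nrow(semi_join(x, y, by = by))

  cat(sprintf(
    "The number of matching records between %s and %s is %d based on %s.\n",
    x_name, y_name, n_matched, paste(by, collapse = ", ")
  ))

  # Return the count invisibly so it can be reused if needed
  invisible(n_matched)
}

# Dummy data from the question
df1 <- data.frame(v1 = c(1, 2, 3, 4, 5, 6, 7, 8),
                  v2 = c("A", "E", "C", "B", "B", "C", "A", "E"))
df2 <- data.frame(v2 = c("D", "E", "A", "C", "D", "B"),
                  v3 = c("d", "e", "a", "c", "d", "b"))

print_match_count(df1, df2, by = "v2")
```

With the dummy data every v2 value in df1 also occurs in df2, so the reported count is 8, in line with the "matched rows" line above. If a key occurred several times in df2, tidylog's count for the left join could be larger than the semi_join count, because a left join repeats the df1 row once per match.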