I have example data as follows:
library(fuzzyjoin)
a <- data.frame(x = c("season", "season", "season", "package", "package"), y = c("1","2", "3", "1","6"))
b <- data.frame(x = c("season", "seson", "seson", "package", "pakkage"), w = c("1","2", "3", "2","6"))
c <- data.frame(z = c("season", "seson", "seson", "package", "pakkage"), w = c("1","2", "3", "2","6"))
So the following runs fine:
d <- stringdist_left_join(a,b, by = "x", max_dist = 2)
But joining on columns with different names does not work (note that the join is now between a and c).
e <- stringdist_left_join(a,c, by = c("x", "z"), max_dist = 2)
I would like to tell stringdist_left_join to use two different column names to join by, as in the last line of code (e), but it does not seem to accept it.
Is there any solution to this (other than copying the column and giving it another name)?
Answer:
Pass a named vector to by, using = to map the column name in the first data frame to the column name in the second. You can use the following code:
e <- stringdist_left_join(a,c, by = c("x" = "z"), max_dist = 2)
Output:
         x y       z w
1   season 1  season 1
2   season 1   seson 2
3   season 1   seson 3
4   season 2  season 1
5   season 2   seson 2
6   season 2   seson 3
7   season 3  season 1
8   season 3   seson 2
9   season 3   seson 3
10 package 1 package 2
11 package 1 pakkage 6
12 package 6 package 2
13 package 6 pakkage 6
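To see why each row matched, here is a minimal sketch that repeats the join with the distance_col argument, which adds a column holding the string distance for each matched pair. It assumes fuzzyjoin is installed and reuses a and c from the question. The names by_cols and dist are chosen here for illustration only.

```r
library(fuzzyjoin)

# Data frames from the question.
a <- data.frame(x = c("season", "season", "season", "package", "package"),
                y = c("1", "2", "3", "1", "6"))
c <- data.frame(z = c("season", "seson", "seson", "package", "pakkage"),
                w = c("1", "2", "3", "2", "6"))

# A named character vector: the name ("x") is the column in the left
# data frame, the value ("z") is the column in the right data frame.
# (by_cols is an illustrative name, not part of the fuzzyjoin API.)
by_cols <- c("x" = "z")

# distance_col stores the string distance of each matched pair in a new
# column, here called "dist" (an illustrative name).
e <- stringdist_left_join(a, c, by = by_cols, max_dist = 2,
                          distance_col = "dist")
print(e)
```

By contrast, by = c("x", "z") is an unnamed vector, which asks for columns x and z to exist in both data frames, so it is not equivalent to the named form.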