I want to compare two character values in R and see which characters were added and deleted, so that I can display the result later in a way similar to git diff --color-words=.
(see screenshot below)
For example:
a <- "hello world"
b <- "helo world!"
diff <- FUN(a, b)
where diff would somehow show that an l was dropped and a ! was added.
The ultimate goal is to construct an HTML string like this:
hel<span >l</span>o world<span >!</span>
I am aware of diffobj, but so far I cannot get it to return the character differences, only the differences between elements.
[Screenshot: output of git diff --color-words=.]
CodePudding user response:
Found a solution using diffobj::ses_dat(), after first splitting both strings into their individual characters.
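get_html_diff <- function(a, b) {
  # Split both strings into single characters so the diff is per character
  aa <- strsplit(a, "")[[1]]
  bb <- strsplit(b, "")[[1]]
  # Edit script: one row per character, with op = Match / Insert / Delete
  s <- diffobj::ses_dat(aa, bb)
  op <- as.character(s$op)
  # Run id that increases by one whenever the operation changes
  # (also works when the edit script has only a single row)
  m <- cumsum(c(TRUE, op[-1] != op[-length(op)]))
  res <- paste(
    sapply(split(seq_along(op), m), function(i) {
      # Glue the characters of one run back together
      val <- paste(s$val[i], collapse = "")
      if (op[i[[1]]] == "Insert")
        val <- paste0("<span class=\"add\">", val, "</span>")
      if (op[i[[1]]] == "Delete")
        val <- paste0("<span class=\"del\">", val, "</span>")
      val
    }),
    collapse = "")
  res
}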
get_html_diff("hello world", "helo World!")
#> [1] "hel<span class=\"del\">l</span>o <span class=\"del\">w</span><span class=\"add\">W</span>orld<span class=\"add\">!</span>"
Created on 2022-05-31 by the reprex package (v2.0.1)
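One thing the function above does not handle: if the input strings themselves contain characters such as < or &, the generated HTML will be broken. Below is a rough base-R sketch of how the grouping and wrapping step could be written with escaping added. It is not part of the original solution. The data frame is built by hand as a stand-in for what diffobj::ses_dat() returns (I am assuming only the op and val columns), escape_html() and render_script() are made-up names, and rle() is used in place of the cumsum() trick to find runs of the same operation.

```r
# Hand-built stand-in for the data frame returned by diffobj::ses_dat():
# one row per character, `op` is the edit operation, `val` the character.
# (Assumption: only these two columns are needed.)
script <- data.frame(
  op  = factor(c("Match", "Match", "Delete", "Insert", "Insert", "Match"),
               levels = c("Match", "Insert", "Delete")),
  val = c("a", "<", "b", "&", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: escape characters that are special in HTML text.
# `&` is replaced first so the other entities are not escaped twice.
escape_html <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

# Hypothetical helper: collapse consecutive rows with the same op into one
# chunk, escape it, and wrap inserted/deleted chunks in a <span>.
render_script <- function(script) {
  if (nrow(script) == 0L) return("")
  runs <- rle(as.character(script$op))
  # Run id for every row, e.g. run lengths 2,1,2,1 give 1 1 2 3 3 4
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  chunks <- vapply(split(script$val, run_id), paste, character(1),
                   collapse = "")
  chunks <- escape_html(chunks)
  # CSS class per run; NA means "leave the chunk unwrapped"
  cls <- c(Match = NA, Insert = "add", Delete = "del")[runs$values]
  wrap <- !is.na(cls)
  chunks[wrap] <- sprintf("<span class=\"%s\">%s</span>",
                          cls[wrap], chunks[wrap])
  paste(chunks, collapse = "")
}

html <- render_script(script)
cat(html, "\n")
#> a&lt;<span class="del">b</span><span class="add">&amp;c</span>d
```

To use this with the real diff, render_script(diffobj::ses_dat(aa, bb)) should work as long as the assumption about the op and val columns holds.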