I have been struggling to replace the words in the Comments variable (data1) with specific numbers. For example, every occurrence of the word "truck" should be replaced by "1", every occurrence of "trailer" by "2", and so on. I finally arrived at the following two datasets. The first one holds my comments and the second one holds all the important words with their numbers. All I need now is to replace every word in the comments from data1 with the corresponding number from data2.
data1<- structure(list(Direction = c("W", "W", "E"), Comments = list(
"tractor trailer struck by car that left scene // delayed response due to previous assists // ma unit called off incident",
"incident cleared before ma unit arrived on scene", "crash on union south of i-70 // 3 city tow on scene. // no state damage // all cleared from roadway ")), row.names = 3:5, class = "data.frame")
data2<- structure(list(Content = c("tractor", "trailer", "struck", "car",
"left", "scene", "delayed", "response", "due", "previous", "assists",
"unit", "called", "incident", "cleared", "arrived", "crash",
"union", "south", "i-70", "city", "tow", "state", "damage", "roadway"
), number = 1:25), row.names = c(NA, 25L), class = "data.frame")
My ultimate goal is to look at the comments and see numbers instead of words. Thanks in advance for the help.
CodePudding user response:
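Here is a way with gsub. It loops over the rows of data2 and wraps each word in the word-boundary anchors \< and \>, so only whole words are replaced.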
for(i in seq_len(nrow(data2))) {
  # match the i-th word only as a whole word
  pat <- paste0("\\<", data2$Content[i], "\\>")
  # replace it with its number in every comment
  data1$Comments <- gsub(pat, data2$number[i], data1$Comments)
}
data1
#> Direction Comments
#> 3 W 1 2 3 by 4 that 5 6 // 7 8 9 to 10 11 // ma 12 13 off 14
#> 4 W 14 15 before ma 12 16 on 6
#> 5 E 17 on 18 19 of 20 // 3 21 22 on 6. // no 23 24 // all 15 from 25
Created on 2022-04-28 by the reprex package (v2.0.1)
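A possible variation, in case the word list ever contains entries that look like numbers: the loop above rescans the already-modified text on every iteration, so a later pattern could in principle match a number written by an earlier replacement. The sketch below does the substitution in a single pass with gregexpr and regmatches<-. The function name replace_words and the toy data are hypothetical and only mimic the shape of data1 and data2. It also assumes the words contain no regex metacharacters.

```r
# Hypothetical helper: replace whole words in x with their numbers in one pass.
# `words` is assumed to have the columns Content and number, like data2.
replace_words <- function(x, words) {
  x <- unlist(x)  # Comments is a list column in data1; flatten to character
  # named lookup vector: word -> number (as text)
  lookup <- setNames(as.character(words$number), words$Content)
  # one alternation pattern with the same word-boundary anchors as above
  pat <- paste0("\\<(", paste(words$Content, collapse = "|"), ")\\>")
  m <- gregexpr(pat, x)
  # swap every matched word for its number; non-matching text is untouched
  regmatches(x, m) <- lapply(regmatches(x, m), function(w) unname(lookup[w]))
  x
}

# Toy data (made up for illustration, not the data from the question)
toy_words <- data.frame(Content = c("tractor", "trailer", "scene"),
                        number  = 1:3)
toy_comments <- list("tractor trailer left scene.", "nothing to replace")

result <- replace_words(toy_comments, toy_words)
print(result)
```

With the data from the question, the call would be data1$Comments <- replace_words(data1$Comments, data2). Note that this stores a plain character vector rather than a list column.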