Is there a way to replace the words in a vector by numbers from a specific source

Time:04-29

I have been struggling to replace the words in the Comments variable (data1) with specific numbers. For example, whenever the word "truck" appears, it should be replaced by "1"; whenever the word "trailer" appears, it should be replaced by "2"; and so on. I have finally built the following two datasets. The first one holds my comments, and the second one holds all the important words with their numbers. All I need now is to replace every word in the comments from data1 with the corresponding number from data2.

data1<- structure(list(Direction = c("W", "W", "E"), Comments = list(
    "tractor trailer struck by car that left scene // delayed response due to previous assists // ma unit called off incident", 
    "incident cleared before ma unit arrived on scene", "crash on union south of i-70 // 3 city tow on scene. // no state damage  // all cleared from roadway ")), row.names = 3:5, class = "data.frame")


data2<- structure(list(Content = c("tractor", "trailer", "struck", "car", 
"left", "scene", "delayed", "response", "due", "previous", "assists", 
"unit", "called", "incident", "cleared", "arrived", "crash", 
"union", "south", "i-70", "city", "tow", "state", "damage", "roadway"
), number = 1:25), row.names = c(NA, 25L), class = "data.frame")

My ultimate goal is to look at the comments and see numbers instead of words. Thanks in advance for the help.

CodePudding user response:

Here is a way with gsub.

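# Loop over the lookup table, replacing one word per iteration.
for (i in seq_len(nrow(data2))) {
  # \\< and \\> match the start and end of a word, so only whole words are replaced.
  pat <- paste0("\\<", data2$Content[i], "\\>")
  # gsub() coerces the list column to a character vector on the first pass.
  data1$Comments <- gsub(pat, data2$number[i], data1$Comments)
}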
data1
#>   Direction                                                           Comments
#> 3         W           1 2 3 by 4 that 5 6 // 7 8 9 to 10 11 // ma 12 13 off 14
#> 4         W                                         14 15 before ma 12 16 on 6
#> 5         E 17 on 18 19 of 20 // 3 21 22 on 6. // no 23 24  // all 15 from 25

Created on 2022-04-28 by the reprex package (v2.0.1)
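
To see why the pattern is wrapped in \\< and \\>, here is a small sketch on a toy string (the string `x` and the names `loose` and `strict` are made up for illustration and are not part of the question's data). It compares a plain pattern with the word-anchored one used in the loop above.

```r
# Toy input, not taken from the question's data.
x <- "car carpet scar"

# Plain pattern: matches "car" anywhere, including inside longer words.
loose <- gsub("car", "4", x)

# Anchored pattern: \\< and \\> match the empty string at the
# beginning and end of a word, so only the standalone "car" matches.
strict <- gsub("\\<car\\>", "4", x)

print(loose)
print(strict)
```

With the plain pattern, "carpet" and "scar" are altered as well, which is probably not what you want when a short word in data2 also occurs inside longer words in the comments.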
