Replace string based on partial match - tidyverse

Time:10-29

I have a character column. In some observations, ph was appended to the end of the string. I want every string in the column that contains ph to be replaced with "NA". Here is what I tried, followed by the error it gives.

Column

wind_speed_m_s <- c("1.7", "0.7", "0", "0.6", "0.4", "1.2", "1.9", "1.3", "2.0, gust to 3.7", 
"0.5", "1.8", "1.4", "3.4", "2.8", "1.6", "2", NA, "0.9", "0.8", 
"1", "1.1", "2.6", "2.4", "1.1ph", "1.7 kt", "2.1", "1.5", "0ph", 
"3", ".4 /s", "0.3", "2.3", "0.2", "3.3", "3.9ph", "1.5ph", "1ph", 
"2ph", "1.7ph", "0.8 ph", "1.5 ph", "2.2", "1.9 k/hr", "2.5", 
"NA", "0.4/s", "1/s")
date <- data_raw %>%
  mutate(wind_speed_m_s = str_replace(wind_speed_m_s, pattern = str_detect("ph"), "NA"))

Error in `mutate()`:
! Problem while computing `wind_speed_m_s = str_replace(wind_speed_m_s, pattern = str_detect("ph"), "NA")`.
Caused by error in `type()`:
! argument "pattern" is missing, with no default
Backtrace:
  1. ... %>% ...
  9. stringr::str_detect("ph")
 10. stringr:::type(pattern)

CodePudding user response:

You can use sub:

sub(".*ph$", NA, wind_speed_m_s)

 [1] "1.7"              "0.7"              "0"               
 [4] "0.6"              "0.4"              "1.2"             
 [7] "1.9"              "1.3"              "2.0, gust to 3.7"
[10] "0.5"              "1.8"              "1.4"             
[13] "3.4"              "2.8"              "1.6"             
[16] "2"                NA                 "0.9"             
[19] "0.8"              "1"                "1.1"             
[22] "2.6"              "2.4"              NA                
[25] "1.7 kt"           "2.1"              "1.5"             
[28] NA                 "3"                ".4 /s"           
[31] "0.3"              "2.3"              "0.2"             
[34] "3.3"              NA                 NA                
[37] NA                 NA                 NA                
[40] NA                 NA                 "2.2"             
[43] "1.9 k/hr"         "2.5"              "NA"              
[46] "0.4/s"            "1/s"            

You can also do:

is.na(wind_speed_m_s) <- grepl("ph$", wind_speed_m_s)

Note that $ is needed to anchor the match to the end of the string, in case ph also appears in the middle of a string. If you want to match ph anywhere in the string, whether at the end or in the middle, just remove the $.
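To make the difference concrete, here is a minimal sketch comparing the anchored and unanchored patterns. The vector x and its values (including "phase 2") are hypothetical, chosen only to show a ph that is not at the end.

```r
# Hypothetical values, not taken from the data above
x <- c("1.1ph", "0.8 ph", "phase 2", "1.7 kt", NA)

at_end   <- grepl("ph$", x)  # TRUE only when the string ends in "ph"
anywhere <- grepl("ph", x)   # TRUE when "ph" appears at any position

x_end <- x
is.na(x_end) <- at_end       # "phase 2" is kept

x_any <- x
is.na(x_any) <- anywhere     # "phase 2" becomes NA as well
```

grepl returns FALSE for the existing NA, so that element is simply left as it was in both cases.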

CodePudding user response:

We can use grep to find the positions where the pattern occurs and use them as an index for the replacement.

> wind_speed_m_s[grep("ph", wind_speed_m_s)] <- NA
> wind_speed_m_s
 [1] "1.7"              "0.7"              "0"                "0.6"              "0.4"              "1.2"             
 [7] "1.9"              "1.3"              "2.0, gust to 3.7" "0.5"              "1.8"              "1.4"             
[13] "3.4"              "2.8"              "1.6"              "2"                NA                 "0.9"             
[19] "0.8"              "1"                "1.1"              "2.6"              "2.4"              NA                
[25] "1.7 kt"           "2.1"              "1.5"              NA                 "3"                ".4 /s"           
[31] "0.3"              "2.3"              "0.2"              "3.3"              NA                 NA                
[37] NA                 NA                 NA                 NA                 NA                 "2.2"             
[43] "1.9 k/hr"         "2.5"              "NA"               "0.4/s"            "1/s"             
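For reference, a small sketch of what grep returns before it is used as an index. The vector x is a hypothetical, shortened stand-in for the column.

```r
# Hypothetical, shortened stand-in for the column
x <- c("1.7", "1.1ph", "1.7 kt", "0ph")

idx <- grep("ph", x)  # integer positions of the elements containing "ph"
x[idx] <- NA          # assign NA at those positions only
```

If nothing matches, grep returns integer(0) and the assignment leaves the vector unchanged.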

CodePudding user response:

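We may use str_detect within case_when. In the OP's code, str_detect was called with only a single argument (the intended pattern, "ph") and without the data. That single argument is matched to string, which is why the error reports pattern as missing.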

library(dplyr)
library(stringr)

data_raw %>%
  mutate(wind_speed_m_s = case_when(
    str_detect(wind_speed_m_s, "ph", negate = TRUE) ~ wind_speed_m_s
  ))

-output

     wind_speed_m_s
1               1.7
2               0.7
3                 0
4               0.6
5               0.4
6               1.2
7               1.9
8               1.3
9  2.0, gust to 3.7
10              0.5
11              1.8
12              1.4
13              3.4
14              2.8
15              1.6
16                2
17             <NA>
18              0.9
19              0.8
20                1
21              1.1
22              2.6
23              2.4
24             <NA>
25           1.7 kt
26              2.1
27              1.5
28             <NA>
29                3
30            .4 /s
31              0.3
32              2.3
33              0.2
34              3.3
35             <NA>
36             <NA>
37             <NA>
38             <NA>
39             <NA>
40             <NA>
41             <NA>
42              2.2
43         1.9 k/hr
44              2.5
45               NA
46            0.4/s
47              1/s
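An alternative that stays closer to the OP's attempt is if_else with str_detect, passing the column first and the pattern second. This is only a sketch: the question does not show data_raw, so the small tibble below is a hypothetical stand-in, and data_clean is a name introduced here for illustration.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for data_raw, which the question does not show
data_raw <- tibble(wind_speed_m_s = c("1.7", "1.1ph", "0.8 ph", NA, "1.9 k/hr"))

data_clean <- data_raw %>%
  mutate(wind_speed_m_s = if_else(
    str_detect(wind_speed_m_s, "ph"),  # string first, then pattern
    NA_character_,                     # typed NA so both branches are character
    wind_speed_m_s
  ))
```

Rows that were already NA stay NA, because str_detect returns NA for them and if_else passes that through as a missing value.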