I have a character column. In some observations, ph was appended to the end of the string. I want any string in the column that contains ph to be replaced with "NA". Here is what I have tried, along with the error it gives.
Column
wind_speed_m_s <- c("1.7", "0.7", "0", "0.6", "0.4", "1.2", "1.9", "1.3", "2.0, gust to 3.7",
"0.5", "1.8", "1.4", "3.4", "2.8", "1.6", "2", NA, "0.9", "0.8",
"1", "1.1", "2.6", "2.4", "1.1ph", "1.7 kt", "2.1", "1.5", "0ph",
"3", ".4 /s", "0.3", "2.3", "0.2", "3.3", "3.9ph", "1.5ph", "1ph",
"2ph", "1.7ph", "0.8 ph", "1.5 ph", "2.2", "1.9 k/hr", "2.5",
"NA", "0.4/s", "1/s")
date <- data_raw %>%
mutate(wind_speed_m_s = str_replace(wind_speed_m_s, pattern = str_detect("ph"), "NA"))
Error in `mutate()`:
! Problem while computing `wind_speed_m_s = str_replace(wind_speed_m_s, pattern = str_detect("ph"), "NA")`.
Caused by error in `type()`:
! argument "pattern" is missing, with no default
Backtrace:
1. ... %>% ...
9. stringr::str_detect("ph")
10. stringr:::type(pattern)
CodePudding user response:
You can use sub:
sub(".*ph$", NA, wind_speed_m_s)
[1] "1.7" "0.7" "0"
[4] "0.6" "0.4" "1.2"
[7] "1.9" "1.3" "2.0, gust to 3.7"
[10] "0.5" "1.8" "1.4"
[13] "3.4" "2.8" "1.6"
[16] "2" NA "0.9"
[19] "0.8" "1" "1.1"
[22] "2.6" "2.4" NA
[25] "1.7 kt" "2.1" "1.5"
[28] NA "3" ".4 /s"
[31] "0.3" "2.3" "0.2"
[34] "3.3" NA NA
[37] NA NA NA
[40] NA NA "2.2"
[43] "1.9 k/hr" "2.5" "NA"
[46] "0.4/s" "1/s"
You can also do:
is.na(wind_speed_m_s) <- grepl("ph$", wind_speed_m_s)
Note that $ is needed to anchor the match at the end of the string, in case there is another ph in the middle of a string. If you want to match anything that contains ph, whether it is at the end or in the middle, just remove the $.
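To make the effect of the $ anchor concrete, here is a small sketch on a made-up vector. The values in x are hypothetical and are not taken from the question's data; they were chosen so that some strings contain ph without ending in it.

```r
# Hypothetical values chosen to show the effect of the anchor
x <- c("1.1ph", "0.8 ph", "1.7 kt", "ph 2.0", "alpha 3")

# Anchored: only strings that END in "ph" match
ends_with_ph <- grepl("ph$", x)

# Unanchored: "ph" anywhere in the string matches (including inside "alpha")
contains_ph <- grepl("ph", x)

print(data.frame(x, ends_with_ph, contains_ph))
```

With the anchored pattern only the first two strings are flagged, while the unanchored pattern also flags "ph 2.0" and "alpha 3".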
CodePudding user response:
We can use grep to identify where the pattern occurs and use the result as an index for replacement.
> wind_speed_m_s[grep("ph", wind_speed_m_s)] <- NA
> wind_speed_m_s
[1] "1.7" "0.7" "0" "0.6" "0.4" "1.2"
[7] "1.9" "1.3" "2.0, gust to 3.7" "0.5" "1.8" "1.4"
[13] "3.4" "2.8" "1.6" "2" NA "0.9"
[19] "0.8" "1" "1.1" "2.6" "2.4" NA
[25] "1.7 kt" "2.1" "1.5" NA "3" ".4 /s"
[31] "0.3" "2.3" "0.2" "3.3" NA NA
[37] NA NA NA NA NA "2.2"
[43] "1.9 k/hr" "2.5" "NA" "0.4/s" "1/s"
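If you would rather not overwrite the original vector, one possible variant is to combine grepl with replace, which returns a modified copy. This is only a sketch: the vector wind below is a shortened, hypothetical subset of the column, and the name cleaned is made up for illustration.

```r
# A shortened, hypothetical subset of the column from the question
wind <- c("1.7", "1.1ph", NA, "0.8 ph", "1.9 k/hr")

# grepl() returns a logical vector (FALSE for NA input);
# replace() returns a modified copy and leaves `wind` untouched
cleaned <- replace(wind, grepl("ph", wind), NA)

print(cleaned)
```

Values that were already NA stay NA, and entries such as "1.9 k/hr" are left alone because they do not contain ph.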
CodePudding user response:
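We may use str_detect within case_when. In the OP's code, str_detect was called with only a single argument, "ph", and without the data to search. Because the first argument of str_detect is the string, "ph" was taken as the data and pattern was left missing, which is what the error reports.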
library(dplyr)
library(stringr)
data_raw %>%
  mutate(wind_speed_m_s = case_when(
    str_detect(wind_speed_m_s, "ph", negate = TRUE) ~ wind_speed_m_s
  ))
-output
wind_speed_m_s
1 1.7
2 0.7
3 0
4 0.6
5 0.4
6 1.2
7 1.9
8 1.3
9 2.0, gust to 3.7
10 0.5
11 1.8
12 1.4
13 3.4
14 2.8
15 1.6
16 2
17 <NA>
18 0.9
19 0.8
20 1
21 1.1
22 2.6
23 2.4
24 <NA>
25 1.7 kt
26 2.1
27 1.5
28 <NA>
29 3
30 .4 /s
31 0.3
32 2.3
33 0.2
34 3.3
35 <NA>
36 <NA>
37 <NA>
38 <NA>
39 <NA>
40 <NA>
41 <NA>
42 2.2
43 1.9 k/hr
44 2.5
45 NA
46 0.4/s
47 1/s
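The error in the question comes down to argument matching, and it can be reproduced in base R without any packages. In the sketch below, detect is a hypothetical stand-in that takes its arguments in the same order as str_detect (string first, then pattern); it is not the stringr implementation.

```r
# Hypothetical stand-in: same argument order as stringr::str_detect()
detect <- function(string, pattern) grepl(pattern, string)

# A single positional argument fills `string`, so `pattern` is missing
msg <- tryCatch(detect("ph"), error = function(e) conditionMessage(e))
print(msg)

# Supplying both the data and the pattern works as intended
found <- detect(c("1.1ph", "2.2"), "ph")
print(found)
```

In the pipeline from the question, the same idea applies: the column has to be passed as the first argument, as in str_detect(wind_speed_m_s, "ph"), and the result used as a condition rather than as the pattern of str_replace.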