I am mainly interested in replacing a specific value (81) in many columns across the dataframe.
For example, if this is my dataset
Id Date Col_01 Col_02 Col_03 Col_04
30 2012-03-31 1 A42.2 20.46 43
36 1996-11-15 42 V73 23 55
96 2010-02-07 X48 81 13 3R
40 2010-03-18 AD14 18.12 20.12 36
69 2012-02-21 8 22.45 12 10
11 2013-07-03 81 V017 78.12 81
22 2001-06-01 11 09 55 12
83 2005-03-16 80.45 V22.15 46.52 X29.11
92 2012-02-12 1 4 67 12
34 2014-03-10 82.12 N72.22 V45.44 10
I would like to replace the value 81 in columns Col_01, Col_02, Col_03, and Col_04
with NA. The expected final dataset looks like this:
Id Date Col_01 Col_02 Col_03 Col_04
30 2012-03-31 1 A42.2 20.46 43
36 1996-11-15 42 V73 23 55
96 2010-02-07 X48 NA 13 3R
40 2010-03-18 AD14 18.12 20.12 36
69 2012-02-21 8 22.45 12 10
11 2013-07-03 NA V017 78.12 NA
22 2001-06-01 11 09 55 12
83 2005-03-16 80.45 V22.15 46.52 X29.11
92 2012-02-12 1 4 67 12
34 2014-03-10 82.12 N72.22 V45.44 10
I tried this approach
df %>% select(matches("^Col_\\d+$"))[ df %>% select(matches("^Col_\\d+$")) == 81 ] <- NA
This is similar to the solution data[ , 2:3 ][ data[ , 2:3 ] == 4 ] <- 10 from the question
"Replacing occurrences of a number in multiple columns of data frame with another value in R".
This did not work.
Any suggestions are much appreciated. Thanks in advance.
CodePudding user response:
Instead of select(), we can pass matches() directly to across() inside mutate() and use na_if() to replace the values that are "81" with NA:
library(dplyr)
df <- df %>%
  mutate(across(matches("^Col_\\d+$"), ~ na_if(., "81")))
-output
df
Id Date Col_01 Col_02 Col_03 Col_04
1 30 2012-03-31 1 A42.2 20.46 43
2 36 1996-11-15 42 V73 23 55
3 96 2010-02-07 X48 <NA> 13 3R
4 40 2010-03-18 AD14 18.12 20.12 36
5 69 2012-02-21 8 22.45 12 10
6 11 2013-07-03 <NA> V017 78.12 <NA>
7 22 2001-06-01 11 09 55 12
8 83 2005-03-16 80.45 V22.15 46.52 X29.11
9 92 2012-02-12 1 4 67 12
10 34 2014-03-10 82.12 N72.22 V45.44 10
Or we can use base R
i1 <- grep("^Col_\\d+$", names(df))
df[i1][df[i1] == "81"] <- NA
The issue in the OP's code is that the assignment is not dispatched the way we expect. Plain subsetting works, i.e.
(df %>%
  select(matches("^Col_\\d+$")))[(df %>%
  select(matches("^Col_\\d+$"))) == "81" ]
[1] "81" "81" "81"
which is same as
df[i1][df[i1] == "81"]
[1] "81" "81" "81"
and not the assignment
(df %>%
  select(matches("^Col_\\d+$")))[(df %>%
  select(matches("^Col_\\d+$"))) == "81" ] <- NA
Error in (df %>% select(matches("^Col_\\d+$")))[(df %>% select(matches("^Col_\\d+$"))) == :
could not find function "(<-"
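In base R, df[i1][df[i1] == "81"] <- NA works because the left-hand side starts from the name df, so the assignment is carried out with the replacement function [<-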
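To make the role of the replacement function more concrete, here is a rough sketch of what the base R shorthand amounts to when the calls to [<- are written out by hand. The data frame toy and the names cols, mask, inner and toy2 are made up for this illustration and are not part of the original question.

```r
# Small hypothetical data frame, used only for illustration
toy <- data.frame(Id = c(1L, 2L, 3L),
                  Col_01 = c("81", "X48", "7"),
                  Col_02 = c("5", "81", "V73"),
                  stringsAsFactors = FALSE)

cols <- grep("^Col_\\d+$", names(toy))  # positions of Col_01 and Col_02
mask <- toy[cols] == "81"               # logical matrix, TRUE where the value is "81"

# Spelled-out version of: toy[cols][mask] <- NA
inner <- `[<-`(toy[cols], mask, value = NA)  # replace inside the column subset
toy2  <- `[<-`(toy, cols, value = inner)     # write the modified subset back

# The shorthand form, which R expands into the same two replacement calls
toy[cols][mask] <- NA

# Compare the two results
all.equal(toy, toy2)
```

The piped version fails because its left-hand side starts with a parenthesised expression rather than a variable name, so R has no replacement function to call and no variable to store the result in.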
data
df <- structure(list(Id = c(30L, 36L, 96L, 40L, 69L, 11L, 22L, 83L,
92L, 34L), Date = c("2012-03-31", "1996-11-15", "2010-02-07",
"2010-03-18", "2012-02-21", "2013-07-03", "2001-06-01", "2005-03-16",
"2012-02-12", "2014-03-10"), Col_01 = c("1", "42", "X48", "AD14",
"8", "81", "11", "80.45", "1", "82.12"), Col_02 = c("A42.2",
"V73", "81", "18.12", "22.45", "V017", "09", "V22.15", "4", "N72.22"
), Col_03 = c("20.46", "23", "13", "20.12", "12", "78.12", "55",
"46.52", "67", "V45.44"), Col_04 = c("43", "55", "3R", "36",
"10", "81", "12", "X29.11", "12", "10")),
class = "data.frame", row.names = c(NA,
-10L))
CodePudding user response:
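We can also use replace(). Its second argument must be a logical (or index) vector, not a formula, so the condition is written as .x == "81":
library(dplyr)
df <- df %>%
  mutate(across(matches("^Col_\\d+$"), ~ replace(.x, .x == "81", NA)))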
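
A quick way to check the replace() approach is to run it on a small data frame and count the NA values per column. The sketch below assumes character columns, as in the question's data; toy and res are hypothetical names used only here, and the condition is passed to replace() as a plain logical vector.

```r
library(dplyr)

# Hypothetical toy data, for illustration only
toy <- data.frame(Id = c(30L, 96L, 11L),
                  Col_01 = c("1", "X48", "81"),
                  Col_02 = c("A42.2", "81", "V017"),
                  stringsAsFactors = FALSE)

# Replace "81" with NA in every column whose name matches Col_<digits>
res <- toy %>%
  mutate(across(matches("^Col_\\d+$"), ~ replace(.x, .x == "81", NA)))

# Number of NA values per column after the replacement
colSums(is.na(res))
```

The Id column is not matched by the pattern, so it is left as it was.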