Let's say I had a data frame and I wanted to replace NA's with 0's only in columns that contained a specific string in the column name.
Let's call the dataframe df, the columns are numeric type, and the string "keyword".
i.e., a column name would be, "Column1keyword", etc.
How can I do this?
I've tried this and it didn't work:
df %>%
mutate(across(where(~ colnames.is.numeric(.x) && 'keyword' %in% colnames.x), replace_na, 0))
CodePudding user response:
Update
If it is to select columns having 'keyword' as substring in the column names, use contains
to select across
those columns
library(dplyr)
library(tidyr)
df1 <- df1 %>%
mutate(across(contains('keyword'), replace_na, 0))
-output
df1
# A tibble: 5 × 4
col1 col2_keyword col3 col4
<int> <chr> <chr> <dbl>
1 1 a a 1
2 2 b b 3
3 3 0 c NA
4 4 c d 5
5 5 d <NA> 6
Assuming that the OP mentioned to replace
NA
only in columns that have a specific element 'keyword', use where
with a logical expression to select the columns that have the 'keyword', loop across
those columns and use replace_na
to replace the NA to 0
df <- df %>%
mutate(across(where(~ is.character(.x) && 'keyword' %in% .x), replace_na, 0))
-output
df
# A tibble: 5 × 4
col1 col2 col3 col4
<int> <chr> <chr> <dbl>
1 1 a a 1
2 2 b b 3
3 3 keyword c NA
4 4 0 d 5
5 5 c <NA> 6
data
df <- tibble(col1 = 1:5, col2 = c("a", "b", "keyword", NA, 'c'),
col3 = c('a', 'b', 'c', 'd', NA), col4 = c(1, 3, NA, 5, 6))
df1 <- tibble(col1 = 1:5, col2_keyword = c("a", "b", NA, 'c', 'd'),
col3 =c('a', 'b', 'c', 'd', NA), col4 = c(1, 3, NA, 5, 6))