Removing Dash with any number and at any position in R

Time:03-09

I have a data frame like this:

df_out <- data.frame(
  "name" = c("1", "2", "3", "4", "5", "6", "7", "8"),
  "col1"=rep("BA-",times= 8),
  "col2"=rep("-BA",times= 8),
  "col3"=rep("G-G",times= 8),
  "col4"=rep("-",times= 8),
  "col5"=rep("--",times= 8),
  "col6"=rep("---",times= 8))
df_out

I want to replace every value that contains a dash, at any position and in any quantity, with NA. I'm using the na_if function. The output should look like this:

 df_out
  name col1 col2 col3 col4 col5 col6
1    1   NA   NA   NA   NA   NA   NA
2    2   NA   NA   NA   NA   NA   NA
3    3   NA   NA   NA   NA   NA   NA
4    4   NA   NA   NA   NA   NA   NA
5    5   NA   NA   NA   NA   NA   NA
6    6   NA   NA   NA   NA   NA   NA
7    7   NA   NA   NA   NA   NA   NA
8    8   NA   NA   NA   NA   NA   NA

Can you please help me? Thank you

CodePudding user response:

Another way:

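library(dplyr)

# gsub() returns NA for every element that matches when the replacement is NA
df_out %>% mutate(across(everything(), ~ gsub("-", NA, .x)))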

  #    name col1 col2 col3 col4 col5 col6
  # 1    1 <NA> <NA> <NA> <NA> <NA> <NA>
  # 2    2 <NA> <NA> <NA> <NA> <NA> <NA>
  # 3    3 <NA> <NA> <NA> <NA> <NA> <NA>
  # 4    4 <NA> <NA> <NA> <NA> <NA> <NA>
  # 5    5 <NA> <NA> <NA> <NA> <NA> <NA>
  # 6    6 <NA> <NA> <NA> <NA> <NA> <NA>
  # 7    7 <NA> <NA> <NA> <NA> <NA> <NA>
  # 8    8 <NA> <NA> <NA> <NA> <NA> <NA>
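This works because of how gsub treats an NA replacement: any element that matches the pattern becomes NA as a whole, while elements without a match are returned unchanged. A minimal base R sketch of that behaviour, using a made-up vector x that is not part of the question:

```r
# Hypothetical input: two values with dashes, one without
x <- c("BA-", "BA", "---")

# With replacement = NA, matching elements become NA and the rest pass through
res <- gsub("-", NA, x)
print(res)
```

Elements are never partially rewritten here, so "BA-" becomes NA rather than "BA".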

CodePudding user response:

With base R, we can find the location of any - by using grepl, then subset to the locations that are TRUE, then assign NA.

df_out[sapply(df_out, \(x) grepl("-", x))] <- NA
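To make the intermediate step visible, here is a small sketch of the same idea on a made-up data frame (df_demo and has_dash are names I'm assuming for illustration) that mixes values with and without dashes:

```r
# Hypothetical data frame: some cells contain dashes, some do not
df_demo <- data.frame(
  name = c("1", "2", "3"),
  col1 = c("BA-", "BA", "--"),
  col2 = c("G-G", "-", "GG"),
  stringsAsFactors = FALSE
)

# sapply() returns a logical matrix with the same shape as df_demo;
# fixed = TRUE treats "-" as a literal character rather than a regex
has_dash <- sapply(df_demo, function(x) grepl("-", x, fixed = TRUE))

# Indexing with the logical matrix assigns NA only to the TRUE cells
df_demo[has_dash] <- NA
print(df_demo)
```

Cells without a dash, such as "BA" and "GG", are left untouched.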

Another option using a combination of str_detect and replace:

library(tidyverse)

df_out %>%
  mutate(across(everything(), ~ replace(., str_detect(., "[-]"), NA)))

Output

  name col1 col2 col3 col4 col5 col6
1    1 <NA> <NA> <NA> <NA> <NA> <NA>
2    2 <NA> <NA> <NA> <NA> <NA> <NA>
3    3 <NA> <NA> <NA> <NA> <NA> <NA>
4    4 <NA> <NA> <NA> <NA> <NA> <NA>
5    5 <NA> <NA> <NA> <NA> <NA> <NA>
6    6 <NA> <NA> <NA> <NA> <NA> <NA>
7    7 <NA> <NA> <NA> <NA> <NA> <NA>
8    8 <NA> <NA> <NA> <NA> <NA> <NA>

CodePudding user response:

If you are working with the tidyverse, this should work. It checks every cell in each column for a dash and replaces the cell with NA if one is present.

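library(dplyr)

# The pattern is a bare dash; a trailing space in the pattern would match nothing here
df_out %>%
  mutate(across(everything(), ~ ifelse(grepl("-", .x), NA, .x)))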

Or in base R, you can do something similar with lapply. Just remember to select the columns you want to modify carefully, and rejoin them to the remaining columns. In this case we are applying the function to every column, so we just have to convert the result of lapply back to a data frame.

as.data.frame(lapply(df_out, \(x) ifelse(grepl("-", x), NA, x)))
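If only some columns should be cleaned, one way to handle the "select and rejoin" part is to assign the result of lapply back into the same subset of columns. This is just a sketch on made-up data: df_demo and cols are names I'm assuming, and the name column contains dashes that should be kept. It uses NA_character_ so a column stays character even when every value in it is replaced.

```r
# Hypothetical data: "name" contains dashes that must be preserved
df_demo <- data.frame(
  name = c("a-1", "b-2", "c-3"),
  col1 = c("BA-", "BA", "--"),
  col2 = c("G-G", "-", "GG"),
  stringsAsFactors = FALSE
)

# Assumed selection: only these columns are cleaned
cols <- c("col1", "col2")

# Assigning into df_demo[cols] leaves the other columns where they are,
# so no separate rejoin step is needed
df_demo[cols] <- lapply(df_demo[cols], function(x) {
  ifelse(grepl("-", x, fixed = TRUE), NA_character_, x)
})
print(df_demo)
```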