Replacing all non-numeric characters in certain columns in R

Time:07-15

How can I remove all non-numeric characters from all columns except "a"?

Simulated data

library(tidyverse)

d = tibble(a = c("Tom", "Mary", "Ben", "Jane", "Lucas", "Mark"),
           b = c("8P", "3", "6", "7", "5M", "U1"),
           c = c("2", "12", "6F", "7F", "Y1", "9I"))

d


The expected output should keep column a as it is and leave only the numbers in columns b and c (for example, "8P" becomes 8).

Tidyverse solutions are especially appreciated!

CodePudding user response:

You could use across (within mutate) to apply the change to every column except a, and a regex (within str_extract) to pull out the first run of digits, which is then converted to numeric type.

library(tidyverse)

d |> 
  mutate(across(-a, ~ . |> str_extract("\\d+") |> as.numeric()))

Output:

# A tibble: 6 × 3
  a         b     c
  <chr> <dbl> <dbl>
1 Tom       8     2
2 Mary      3    12
3 Ben       6     6
4 Jane      7     7
5 Lucas     5     1
6 Mark      1     9
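
Note that str_extract only returns the first run of digits, so a value like "1A2" would become 1. If the goal is literally to strip every non-numeric character, a variant with str_remove_all might be closer. This is just a sketch under the assumption that the values are non-negative integers (the "\\D" pattern would also drop minus signs and decimal points); the name res is only for illustration.

```r
library(dplyr)
library(stringr)
library(tibble)

# Same simulated data as in the question
d <- tibble(a = c("Tom", "Mary", "Ben", "Jane", "Lucas", "Mark"),
            b = c("8P", "3", "6", "7", "5M", "U1"),
            c = c("2", "12", "6F", "7F", "Y1", "9I"))

# Drop every non-digit character ("\\D") in all columns except a,
# then convert the remaining digits to numeric
res <- d |>
  mutate(across(-a, ~ as.numeric(str_remove_all(.x, "\\D"))))

res
```

For the data in the question both approaches give the same result; they only differ when digits are separated by other characters.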

CodePudding user response:

Another option would be to use parse_number:

d %>%
  mutate(across(c("b", "c"), readr::parse_number))

# # A tibble: 6 x 3
# a         b     c
# <chr> <dbl> <dbl>
# 1 Tom       8     2
# 2 Mary      3    12
# 3 Ben       6     6
# 4 Jane      7     7
# 5 Lucas     5     1
# 6 Mark      1     9