How to turn multiple columns into numeric for specific columns only using a loop in R?

Time:01-03

Beginner here: I have a dataframe with several columns that are currently strings containing a $ sign and spaces, and I want to turn them into numeric columns. My dataframe looks like this:

Name  Col_x_1    Company  Col_x_2  Start_Year  End_Year  Col_x_3
asd   $841 392   Test     $31 000  1902        1933      0
kfj   0          Test_2   0        1933        1954      $10 000
ale   $200 000   Test_3   0        1988        1999      0
...

I am currently using the following code to apply this to all columns whose names start with Col_x_, since they share that prefix and are numbered in ascending order:

library(tidyverse)

df %>% 
  mutate(across(starts_with("Col_x_"), ~gsub("\\$", "", .) %>% 
                  as.numeric())
         )

However, this only gives me NAs, so the as.numeric() step does not work. Does anyone know how I can fix this code? Thank you in advance!

CodePudding user response:

library(tidyverse)

df %>%
  mutate(across(starts_with("Col_x_"), ~ str_remove_all(.x, "[^0-9]"))) %>%
  type_convert()

# A tibble: 3 × 7
  Name  Col_x_1 Company Col_x_2 Start_Year End_Year Col_x_3
  <chr>   <dbl> <chr>     <dbl>      <dbl>    <dbl>   <dbl>
1 asd    841392 Test      31000       1902     1933       0
2 kfj         0 Test_2        0       1933     1954   10000
3 ale    200000 Test_3        0       1988     1999       0
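
For context, the original attempt returns NA because gsub("\\$", "", .) removes only the dollar sign. The space used as a thousands separator stays in the string, and as.numeric("841 392") cannot parse it. Below is a minimal sketch of the smallest change to the original code. It assumes the data look like the sample in the question; the data frame is rebuilt by hand here, and the name df_fixed is only illustrative.

```r
library(dplyr)

# Sample data rebuilt by hand from the question (an assumption about the real df)
df <- data.frame(
  Name       = c("asd", "kfj", "ale"),
  Col_x_1    = c("$841 392", "0", "$200 000"),
  Company    = c("Test", "Test_2", "Test_3"),
  Col_x_2    = c("$31 000", "0", "0"),
  Start_Year = c(1902, 1933, 1988),
  End_Year   = c(1933, 1954, 1999),
  Col_x_3    = c("0", "0", "$10 000"),
  stringsAsFactors = FALSE
)

# Strip both the dollar sign and the spaces, then convert to numeric
df_fixed <- df %>%
  mutate(across(starts_with("Col_x_"), ~ as.numeric(gsub("[$ ]", "", .x))))
```

Unlike type_convert(), this only touches the Col_x_ columns and leaves every other column's type alone.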

CodePudding user response:

In addition to the solutions in the comments, you could also use the convenience functions of {readr}, e.g.:

library(readr)

my_locale <- locale(grouping_mark = " ")

Result:

> parse_number("$12 235", locale = my_locale)
[1] 12235
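
The same parser can probably be applied to all of the Col_x_ columns at once by combining it with across(). This is a sketch, not tested against the real data: it uses a cut-down copy of the question's sample, and the name df_parsed is only illustrative.

```r
library(dplyr)
library(readr)

# Cut-down copy of the sample data from the question (assumed structure)
df <- data.frame(
  Name    = c("asd", "kfj", "ale"),
  Col_x_1 = c("$841 392", "0", "$200 000"),
  Col_x_2 = c("$31 000", "0", "0"),
  Col_x_3 = c("0", "0", "$10 000"),
  stringsAsFactors = FALSE
)

# Treat a space as the thousands separator when parsing numbers
my_locale <- locale(grouping_mark = " ")

# Parse every column whose name starts with "Col_x_"
df_parsed <- df %>%
  mutate(across(starts_with("Col_x_"), ~ parse_number(.x, locale = my_locale)))
```

Note that parse_number() expects character input, so this only works while the columns are still strings.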