What is the best way to tidy data and extract numbers (double) from data frame in R?

Time:10-16

I have a data frame that contains survey responses. What is the best way to extract the numbers and convert them to a double (numeric) variable?

Here is a little sample:

library(tibble)
a <- c("10.5", "about 30", "25 per month")
tibble(a)

I have tried

parse_double(a)

and it seems like I am close. Any help is appreciated.

CodePudding user response:

We need parse_number from readr:

library(readr)
parse_number(a)
[1] 10.5 30.0 25.0

The difference is that parse_double only works when each string consists entirely of a number (for example digits and a decimal point), whereas parse_number extracts the first number from a string that also contains non-numeric characters.
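
To see the difference side by side, here is a minimal sketch on the sample vector from the question. The warnings from parse_double are suppressed only to keep the output short.

```r
library(readr)

a <- c("10.5", "about 30", "25 per month")

# parse_double() expects each string to be a number and nothing else,
# so the entries that contain words cannot be parsed and come back as NA.
suppressWarnings(parse_double(a))

# parse_number() drops the non-numeric characters around the first
# number it finds in each string.
parse_number(a)
```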

data

a <- c("10.5", "about 30", "25 per month")
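
Since the question has the values in a data frame, the same call can be applied to the column directly. This is only a sketch: the data frame name df and the new column name a_num are made up for illustration.

```r
library(tibble)
library(readr)

# df and a_num are hypothetical names, not from the question
df <- tibble(a = c("10.5", "about 30", "25 per month"))

# Store the parsed values in a new double column so the raw answers stay available
df$a_num <- parse_number(df$a)
df
```

Keeping the original text column makes it easy to check afterwards which responses were parsed as expected.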

CodePudding user response:

I know a solution too, using only base R:

a <- c("10.5", "about 30", "25 per month")
as.numeric(gsub("[[:alpha:]]", "", a)) 
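
Note that deleting only letters works for this sample, but a response such as "about 30%" would leave "30%" behind, which as.numeric turns into NA. A possible base R variant, sketched here under the assumption that each response contains one plain decimal number, is to extract the number instead of deleting everything around it:

```r
a <- c("10.5", "about 30", "25 per month")

# Match the first run of digits, optionally followed by a decimal part
m <- regexpr("[0-9]+(\\.[0-9]+)?", a)

# regmatches() returns only the matched pieces, so an element without any
# number would be dropped and the result would be shorter than the input
nums <- as.numeric(regmatches(a, m))
nums
```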


 > start_time <- Sys.time()
 > as.numeric(gsub("[[:alpha:]]", "", a))
 [1] 10.5 30.0 25.0
 > end_time <- Sys.time()
 > end_time - start_time
 Time difference of 0.01400113 secs
 > start_time <- Sys.time()
 > parse_number(a)
 [1] 10.5 30.0 25.0
 > end_time <- Sys.time()
 > end_time - start_time
 Time difference of 0.1500092 secs

Akrun, my solution is faster ;)))
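
A single Sys.time() difference on a three-element vector is noisy and can be dominated by one-off overhead, so the numbers above are only a rough hint. A slightly more robust comparison, sketched with base system.time and an arbitrarily chosen repetition count n, repeats each call many times:

```r
library(readr)

a <- c("10.5", "about 30", "25 per month")
n <- 1000  # number of repetitions, chosen arbitrarily for this sketch

# Time n repeated calls of each approach; "elapsed" is wall-clock seconds
t_base  <- system.time(for (i in seq_len(n)) as.numeric(gsub("[[:alpha:]]", "", a)))
t_readr <- system.time(for (i in seq_len(n)) parse_number(a))

t_base["elapsed"]
t_readr["elapsed"]
```

The results will vary by machine and package version, and for a survey-sized data frame either approach is likely fast enough that correctness matters more than speed.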
