Conversion char variable in R to numeric

Time:02-03

I have a small data set with one column in character format. The data is shown below.

test <- structure(
  list(txtVALUE = c("<5", "<5", "8", "<5", "9", "12", "45", "5", "<5", "<5",
                    "11,478", "117", "1,526", "1,642", "3,920", "98", "8",
                    "<5", "<5", "<5", "<5")),
  row.names = c(NA, -21L),
  class = c("tbl_df", "tbl", "data.frame")
)
  

Now I want to convert this column from character to numeric. I tried the command below:

      test$txtVALUE<-as.numeric(test$txtVALUE)
Warning message:
NAs introduced by coercion 

But this command does not convert the data as I expected. Namely, values such as "1,526", "1,642", and "3,920" are converted to NA, although they are numbers.

Can anybody help me convert this data from character to numeric properly, without getting NA for the numbers?

CodePudding user response:

Your data appears to be counts, so I have taken the slight liberty of assuming that the values are always whole numbers. If they are not, do not use this approach, as it will delete decimal points as well.

However, if they are, and you want "<5" to become NA, you can use gsub() to replace every value that contains "<" with an empty string and also delete anything that is not a digit (e.g. the commas in "11,478").

Of course, gsub() returns a character vector, so wrap the call in as.integer(); the empty strings then become NA.
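To make the intermediate step visible, here is a minimal sketch on a few values taken from the question. The vector name `x` and the variable `cleaned` are only illustrative, and the pattern is written with the "<" alternative first so that it does not depend on how the regex engine resolves the alternation.

```r
# Hypothetical sample: a few values copied from the question's data
x <- c("<5", "8", "11,478", "1,526")

# Drop everything from "<" onward, then any remaining non-digit character
cleaned <- gsub("<.+|\\D", "", x)
print(cleaned)

# The result is still character; empty strings turn into NA on conversion
print(as.integer(cleaned))
```

The censored "<5" entry ends up as an empty string, which is what makes it NA after the conversion rather than 5.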

as.integer(gsub("<.+|\\D", "", test$txtVALUE))
#  [1]    NA    NA     8    NA     9    12    45     5    NA    NA 11478   117  1526  1642  3920    
# [16]    98     8    NA    NA    NA    NA
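If the column might contain decimals after all, a possible variant is to strip only the commas and set the "<" entries to NA separately. This is a rough sketch, not part of the approach above: the helper name `to_numeric_censored` is made up for illustration, and it assumes that "," is always a thousands separator and "." the decimal mark.

```r
# Hypothetical helper (name is illustrative): values containing "<" become NA,
# thousands separators are removed, decimal points are kept.
# Assumes "," is a thousands separator and "." is the decimal mark.
to_numeric_censored <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !grepl("<", x, fixed = TRUE)   # TRUE for values without "<"
  out[ok] <- as.numeric(gsub(",", "", x[ok], fixed = TRUE))
  out
}

res <- to_numeric_censored(c("<5", "8", "11,478", "1,526.5"))
print(res)
```

On the question's data this could be used as test$txtVALUE <- to_numeric_censored(test$txtVALUE), which returns a double vector instead of an integer one.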