Change a char value in a data column into zero?

Time:12-06

I have a simple problem: I have a very long data frame that reports 0 as the character string "nothing" in one of its columns. How can I replace all of these with a numeric 0? A sample data frame is below.

Group Candy
A 5
B nothing

And this is what I want to change it into:

Group Candy
A 5
B 0

Keep in mind that my actual dataset is hundreds of rows long.

My own attempt was to use is.na, but apparently it only works for NA. It converts those into zeros with ease, but I wasn't sure whether there is a solution for actual character values.

Thanks

CodePudding user response:

The best way is to read the data in correctly, so that "nothing" is treated as a missing value rather than as a string. This can be done with the na.strings argument of read.table or read.csv. Then change the NAs to zero.
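
As a minimal sketch of that approach: the sample data from the question is read from an inline string here, and the name candy_df is only illustrative. With a real file you would pass its path to read.table or read.csv instead of using text =.

```r
# Hypothetical sample data mirroring the question; "nothing" is declared
# as a missing-value marker so the Candy column is parsed as a number.
candy_df <- read.table(text = "
Group Candy
A 5
B nothing", header = TRUE, na.strings = "nothing")

# "nothing" has been read as NA; replace every NA with zero.
candy_df[is.na(candy_df)] <- 0

str(candy_df)
```

Because the conversion happens while reading, Candy never exists as a character column and no further type conversion is needed.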

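The following function is probably slow for large data.frames, but it replaces the "nothing" values with zeros. It works by printing the data frame to text and reading that text back with na.strings, so it is only suitable when print() shows every row.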

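nothing_zero <- function(x){
  # capture printed output in a local character vector named "nothing"
  tc <- textConnection("nothing", "w", local = TRUE)
  sink(tc)   # divert console output to the tc connection
  print(x)   # print x into the character vector instead of the console
  sink()     # send output back to the console
  close(tc)  # close the write connection
  tc <- textConnection(nothing, "r")  # reopen the captured text for reading
  y <- read.table(tc, na.strings = "nothing", header = TRUE)
  close(tc)  # close the read connection
  y[is.na(y)] <- 0  # replace the NAs produced by na.strings with zeros
  y
}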

nothing_zero(df1)
#  Group Candy
#1     A     5
#2     B     0

The main advantage is that numeric data are read as numeric.

str(nothing_zero(df1))
#'data.frame':  2 obs. of  2 variables:
# $ Group: chr  "A" "B"
# $ Candy: num  5 0
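
If the data frame is already in memory, a possible alternative to the print-and-reread round trip is base R's type.convert, which re-parses character columns and accepts its own na.strings argument. This is a swapped-in technique rather than the function above, sketched on the assumption that your version of R provides a data frame method for type.convert. The names raw_df and converted are illustrative.

```r
# Hypothetical data frame with the count stored as text
raw_df <- data.frame(Group = c("A", "B"),
                     Candy = c("5", "nothing"),
                     stringsAsFactors = FALSE)

# Re-parse each column, treating "nothing" as NA;
# as.is = TRUE keeps Group as character instead of making it a factor.
converted <- type.convert(raw_df, na.strings = "nothing", as.is = TRUE)

# Replace the resulting NAs with zero.
converted[is.na(converted)] <- 0

str(converted)
```

This avoids printing the data frame to text, so it does not depend on how many rows print() displays.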

Data

df1 <- read.table(text = "
Group   Candy
A   5
B   nothing", header = TRUE)

CodePudding user response:

# note: sapply returns a character matrix, not a data frame
sapply(df, function(x) gsub("nothing", "0", x))

Output

     a  
[1,] "0"
[2,] "5"
[3,] "6"
[4,] "0"

Data

df <- structure(list(a = c("nothing", "5", "6", "nothing")),
                class = "data.frame",
                row.names = c(NA,-4L))

Another option, which keeps the result as a data frame:

df[] <- lapply(df, gsub, pattern = "nothing", replacement = "0", fixed = TRUE)

If you only want to apply it to one column:

library(stringr)  # str_replace() comes from stringr, which tidyverse also loads

df$a <- str_replace(df$a,"nothing","0")

Or, applying it to one column in base R:

df$a <- gsub("nothing","0",df$a)
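
Note that the gsub and str_replace calls above return character vectors, so the column still holds "0", "5", "6" as text. Since the question asks for a numeric 0, a final conversion is needed. A minimal sketch, assuming every value other than "nothing" is a valid number:

```r
df <- structure(list(a = c("nothing", "5", "6", "nothing")),
                class = "data.frame",
                row.names = c(NA, -4L))

# Replace only exact matches of "nothing" (gsub would also alter
# longer strings that merely contain the word).
df$a[df$a == "nothing"] <- "0"

# Convert the column from character to numeric.
df$a <- as.numeric(df$a)

str(df)
```

If the column contains other non-numeric strings, as.numeric turns them into NA with a warning, which is a useful signal that the data needs more cleaning.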