Data.table replace sequence of values with NA

Time:02-01

I want to replace a specific sequence of values with NA in a data.table, using a variable to indicate the column.

Example:

    library(data.table)
    dt <- data.table(col1=1:10, col2=11:20)
    specific_column <- 2

Output: 
      col1 col2
 1:    1   11
 2:    2   12
 3:    3   13
 4:    4   14
 5:    5   15
 6:    6   16
 7:    7   17
 8:    8   18
 9:    9   19
10:   10   20

I want to replace the values in rows 2:5 of col2 with NA to get the following output:

    col1 col2
 1:    1   11
 2:    2   NA
 3:    3   NA
 4:    4   NA
 5:    5   NA
 6:    6   16
 7:    7   17
 8:    8   18
 9:    9   19
10:   10   20

I am able to select the values I am interested in with:

   dt[2:5,..specific_column]

Unfortunately it is not possible to use the replacement method from data frames:

    dt[2:5,..specific_column] <- NA
    #Error in `[<-.data.table`(`*tmp*`, 2:5, ..specific_column, value = NA) : 
    #  object '..specific_column' not found

The only workaround I found was:

    dt[2:5,print(specific_column)] <- NA

This works, but because it prints the value of specific_column on every call, it slows things down considerably (my real data set has 10,000 rows and 28 columns).

Is there a simple solution comparable to the one used for data frames?

Thanks a lot in advance.

CodePudding user response:

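Try := with the variable wrapped in parentheses, so that data.table evaluates it instead of treating it as a column name:

    dt[2:5, (specific_column) := NA]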
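
A minimal sketch of why the parentheses matter, assuming the dt and specific_column from the question. Without them, := would treat the left-hand side as a literal column name; with them, the variable's value is used as a column position or name. The variable col_name is hypothetical and only there to show the character-name variant.

```r
library(data.table)

dt <- data.table(col1 = 1:10, col2 = 11:20)
specific_column <- 2   # column position, as in the question

# Parentheses make data.table evaluate the variable instead of
# creating a new column literally named "specific_column".
dt[2:5, (specific_column) := NA]

# The same form works with a column name stored in a variable
# (col_name is a hypothetical name used only for this sketch).
col_name <- "col1"
dt[6:7, (col_name) := NA]

print(dt)
```

Note that := modifies dt by reference, so no assignment with <- is needed.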

CodePudding user response:

There is also a set() function in data.table:

    library(data.table)
    DT <- data.table(col1=1:10, col2=11:20)
    specific_column <- 2L
    # i = rows, j = column (position or name), value = replacement
    set(DT, i=2:5, j=specific_column, value=NA)
    DT
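
Since set() is a plain function call that updates by reference, it fits naturally inside a loop when several columns need the same treatment. The following is only a sketch under assumptions: the extra column col3 and the vector cols_to_blank are hypothetical and not part of the question.

```r
library(data.table)

# col3 is a hypothetical extra column added for this sketch
DT <- data.table(col1 = 1:10, col2 = 11:20, col3 = 21:30)

# Hypothetical selection of column positions to blank out
cols_to_blank <- c(2L, 3L)

# set() modifies DT in place, one column per iteration
for (j in cols_to_blank) {
  set(DT, i = 2:5, j = j, value = NA)
}

print(DT)
```

For a data set of the size mentioned in the question (28 columns), the same loop could run over whichever column positions are selected.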

CodePudding user response:

From data.table version 1.14.3 (in development at the time of writing), there is a new env argument for programmatic control of data.table calls. It might be overkill for this simple situation, but I show it here because it is a potentially useful feature for more demanding situations and therefore good to be aware of:

    dt[2:5, mycol := NA, env = list(mycol = specific_column)]

Note that you can install the latest development version using data.table::update.dev.pkg().
