data.table: temporarily rename single column while keeping all others

Time:08-30

Consider the table below:

DT <- data.table(x = 1, y = 10, z = 100, w = 1000)

I want to rename the column z to y, while keeping one or more of the other columns in the table. The columns to keep and the renaming would be specified by variables. For example:

keep <- c("x", "w")
old <- "z"
new <- "y"
# Manually:
DT[, .(x, w, y = z)] 

In addition to that, I want the original y column to still exist. That is, I want the renaming to be temporary, and to be able to go back to the original table.

Solutions I tried:

  1. Partially manual, with the .. prefix. This returns something I did not expect, and I do not understand why:
DT[, .(..keep, y = z)]
>    ..keep   y
> 1:      x 100
> 2:      w 100
  2. Selecting everything but the column that would be replaced, then renaming with setnames:
keep <- setdiff(colnames(DT), "y")
newDT <- DT[, ..keep]
data.table::setnames(newDT, old, new)
print(newDT)
>   x   y    w
> 1: 1 100 1000

The second option works as expected, but it copies the data instead of working by reference.

  3. Using dplyr::select and dplyr::rename:
newDT <- dplyr::rename(dplyr::select(DT, -y), y = z)

This works as expected, but it also copies the data, and it requires dplyr rather than data.table alone.

Question: Is there a way to temporarily rename one column while keeping the other columns, without copying data (that is, by reference)? In other words, can a table be recreated by reference for most columns, with only one of them changed?

Thanks for helping! Please let me know if something is not clear.

CodePudding user response:

We could use mget to return the columns named in keep as a list, and then concatenate (c) that list with the renamed column (.(y = z)):

 DT[,  c(mget(keep), .(y = z))]

-output

      x     w     y
   <num> <num> <num>
1:     1  1000   100
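
For comparison with the first attempt in the question, here is a minimal sketch of the mget approach as a complete script. It assumes the DT and keep defined in the question; res is a name introduced only for this illustration. As the output in the question shows, inside .() the expression ..keep stands for the character vector c("x", "w") itself, so it becomes a two-row column of names and y = z is recycled to match. mget(keep) instead looks up each name among the columns of DT. Note that the result is a new table: DT keeps its original y, but this route is not a by-reference rename.

```r
library(data.table)

DT <- data.table(x = 1, y = 10, z = 100, w = 1000)
keep <- c("x", "w")

# mget() fetches the columns named in `keep` as a list;
# c() appends the column z under the new name y.
res <- DT[, c(mget(keep), .(y = z))]

print(res)        # columns x, w, y
print(names(DT))  # DT itself is unchanged and still has its own y
```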

CodePudding user response:

I could not find a one-liner that does this by reference, but I was able to use setnames to obtain a solution that does not copy the data.

Consider the original data:

DT <- data.table(x = 1, y = 10, z = 100, w = 1000)
.Internal(inspect(DT))
> @1b588e10 19 VECSXP g0c7 [OBJ,REF(7),ATT] (len=4, tl=1028)
  @1510aea0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1
  @1510b1b0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 10
  @1510b4f8 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 100
  @151079a0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1000

To rename z to y without losing y and without copying any data, we need to:

  1. Rename the original y into something else with setnames (this is done by reference):
setnames(DT, "y", "y_original")
.Internal(inspect(DT))
> @1b588e10 19 VECSXP g0c7 [OBJ,REF(17),ATT] (len=4, tl=1028)
  @1510aea0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1
  @1510b1b0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 10
  @1510b4f8 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 100
  @151079a0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1000
  2. Rename z to y with setnames:
setnames(DT, "z", "y")
.Internal(inspect(DT))
> @1b588e10 19 VECSXP g0c7 [OBJ,REF(27),ATT] (len=4, tl=1028)
  @1510aea0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1
  @1510b1b0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 10
  @1510b4f8 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 100
  @151079a0 14 REALSXP g0c1 [REF(2)] (len=1, tl=0) 1000

Then, the resulting table DT still has all columns in keep, the column z was renamed to y, the original column y is still available (but now as y_original), and no data was copied in the process: the column addresses in the inspect output are identical at every step.
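
To make the renaming temporary, the two setnames calls can be undone in reverse order. The following is a minimal sketch under stated assumptions: with_renamed is a hypothetical helper written for this illustration (it is not part of data.table), and the backup name defaults to the new name plus "_original", as in the steps above. address() from data.table is used to check that the renamed column is still the same vector.

```r
library(data.table)

DT <- data.table(x = 1, y = 10, z = 100, w = 1000)

# Hypothetical helper (not a data.table function): rename `old` to `new`
# by reference, call f(dt), then restore the original names on exit.
with_renamed <- function(dt, old, new, f, backup = paste0(new, "_original")) {
  setnames(dt, new, backup)  # y -> y_original
  setnames(dt, old, new)     # z -> y
  on.exit({
    setnames(dt, new, old)     # y -> z
    setnames(dt, backup, new)  # y_original -> y
  }, add = TRUE)
  f(dt)
}

addr_z <- address(DT$z)  # address of column z before renaming

res <- with_renamed(DT, old = "z", new = "y", f = function(d) {
  # The column now called y is the same vector that was called z
  stopifnot(identical(address(d$y), addr_z))
  d[, .(x, w, y)]
})

print(res)        # y holds the values of the former z
print(names(DT))  # the original names are restored
```

Because setnames works by reference, the table passed to the helper is the caller's DT, so any code that runs inside f sees the renamed columns, and the on.exit block restores the names even if f stops with an error.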
