How do I add a column to a data.table and return multiple columns without modifying underlying data?

Time:12-21

I have the following data.table in R

dt <- data.table(gender = c("Male", "Female"), Prop = c(0.49, 0.51))
#    gender Prop
# 1:   Male 0.49
# 2: Female 0.51

I want to calculate a Freq = Prop * 1000 column and then return just the gender and Freq columns. How can I do this in a single line of code and without explicitly referring to the gender column and without modifying dt?

The best I can manage is:

dt[, c(.SD, Freq = Prop * 1000)][, .SD, .SDcols = - "Prop"]
#    gender Freq1 Freq2
# 1:   Male   490   490
# 2: Female   510   510

but I've ended up with a duplicated Freq column.

(The reason I don't want to refer to gender is because it changes across data.tables. The reason I don't want to modify dt is because I need to re-use the original version later).

CodePudding user response:

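Use transform with Prop = NULL. This adds Freq and drops Prop in one call, and it returns a new object, so dt itself is left unchanged.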

dt[, transform(.SD, Freq = Prop * 1000, Prop = NULL)]
##    gender Freq
## 1:   Male  490
## 2: Female  510

or this variation

transform(dt, Freq = Prop * 1000, Prop = NULL)
##    gender Freq
## 1:   Male  490
## 2: Female  510
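
To confirm that dt really is left untouched, a quick check along these lines may help. It is only a sketch: the names `before` and `res` are made up for illustration, and it assumes the data.table package is installed.

```r
library(data.table)

dt <- data.table(gender = c("Male", "Female"), Prop = c(0.49, 0.51))

# Keep an independent copy so we can compare afterwards
# ('before' is a hypothetical name used only for this check).
before <- copy(dt)

# Add Freq and drop Prop in the returned object only.
res <- transform(dt, Freq = Prop * 1000, Prop = NULL)

names(res)             # columns of the result: gender and Freq
names(dt)              # original columns: gender and Prop
all.equal(dt, before)  # TRUE if dt was not modified
```

Because the result is assigned to a separate object, dt can be re-used later in its original form.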

CodePudding user response:

We can stay within data.table syntax: exclude Prop from .SD via .SDcols, and wrap the new column in .() so that c() appends it as a single column

dt[, c(.SD, .(Freq = Prop * 1000)), .SDcols = -"Prop"]

-output

   gender Freq
1:   Male  490
2: Female  510
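
The reason the attempt in the question produced Freq1 and Freq2 appears to be how base R's c() combines a list with a named vector: the vector is split into one list element per value. Wrapping the vector in a list keeps it together as one column. A small sketch of that behaviour, where `x`, `y` and `res` are illustrative names and list() is used in place of the .() alias:

```r
library(data.table)

# Plain base R: a named vector combined with a list is split per element.
x <- c(list(a = 1:2), Freq = c(490, 510))
names(x)  # "a" "Freq1" "Freq2"

# Wrapped in a list, the vector stays together as one element.
y <- c(list(a = 1:2), list(Freq = c(490, 510)))
names(y)  # "a" "Freq"

# The same idea inside data.table's j, with Prop excluded from .SD.
dt <- data.table(gender = c("Male", "Female"), Prop = c(0.49, 0.51))
res <- dt[, c(.SD, list(Freq = Prop * 1000)), .SDcols = -"Prop"]
names(res)  # "gender" "Freq"
```

Prop is still available for the calculation even though .SDcols removes it from .SD, and dt is not modified because nothing is assigned with :=.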