How to add a new variable created by an R function to the dataset

Time:12-08

I wrote the following R function to create dummy variables.

For example, I made a dataset (dt, with a single variable "var") and used this function to create a new variable ("dummy"), which is the quartile dummy variable of "var". However, after running the function, dt still contains only "var"; the new variable was not added to the dataset.

How can I add the new variable created by the R function to the dataset? Or is it not a good idea to create new variables with an R function?

dv <- function(dummy,variable,n){
    nn <- n - 1
    dummy <- cut(variable,
                 quantile(variable,
                          probs = seq(0, 1, 1/n),
                          na.rm = TRUE
                 ),
                 labels = c(0:nn),
                 include.lowest = TRUE
    )
    tapply(variable, dummy, summary) 
}

library(data.table)
set.seed(1234)
dt <- data.table(var = runif(20, min = 0, max = 100))
dv(dt$dummy, dt$var, 4)

CodePudding user response:

If I understand correctly, I think this is what you want to do.

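(Note that I edited your function: it no longer takes a dummy argument, and it returns the dummy vector instead of the tapply summary.)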

library(data.table)
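dv <- function(variable, n) {
  nn <- n - 1
  # Cut `variable` at its n-quantiles and label the groups 0..n-1
  dummy <- cut(variable,
               quantile(variable,
                        probs = seq(0, 1, 1/n),
                        na.rm = TRUE
               ),
               labels = c(0:nn),
               include.lowest = TRUE
  )

  # Return the factor so the caller can store it as a column
  return(dummy)
}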

set.seed(1234)
dt <- data.table(var = runif(20, min  = 0, max = 100) )

# := adds the column to dt by reference
dt[, dummy := dv(var, 4)]

> dt

           var dummy
 1: 11.3703411     0
 2: 62.2299405     2
 3: 60.9274733     2
 4: 62.3379442     2
 5: 86.0915384     3
 6: 64.0310605     2
 7:  0.9495756     0
 8: 23.2550506     0
 9: 66.6083758     3
10: 51.4251141     1
11: 69.3591292     3
12: 54.4974836     2
13: 28.2733584     1
14: 92.3433484     3
15: 29.2315840     1
16: 83.7295628     3
17: 28.6223285     1
18: 26.6820780     1
19: 18.6722790     0
20: 23.2225911     0
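
The original function left dt unchanged because an assignment inside an R function only creates a local variable, which is discarded when the function exits. The sketch below illustrates this with a plain data.frame and base R only; add_local and make_flag are hypothetical helpers invented for the illustration, not part of the question.

```r
# Hypothetical helpers for illustration; not part of the original question.
df <- data.frame(var = c(10, 20, 30, 40))

# Assigning to an argument inside a function only changes a local copy.
add_local <- function(dummy, variable) {
  dummy <- variable > 25   # local variable, discarded when the function exits
  invisible(NULL)
}
add_local(df$dummy, df$var)
names(df)                  # still only "var"

# Returning the vector lets the caller decide where to store it.
make_flag <- function(variable) {
  variable > 25
}
df$dummy <- make_flag(df$var)
names(df)                  # now "var" and "dummy"
```

The same idea is what makes the edited dv() work: the function returns the new values, and the caller assigns them to a column.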

CodePudding user response:

dt$newColname <- dv(dt$var, 4)  # dv() must return the dummy vector, as in the version above
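
Assigning the return value with `$<-` also works on a plain data.frame, so data.table is not strictly needed for this step. The following is a minimal sketch under that assumption; it repeats a compact version of the dv() from the first answer so that it runs on its own, and the name df is chosen here only for illustration.

```r
# Compact version of dv() from the first answer, repeated so this runs alone.
dv <- function(variable, n) {
  breaks <- quantile(variable, probs = seq(0, 1, 1/n), na.rm = TRUE)
  cut(variable, breaks, labels = 0:(n - 1), include.lowest = TRUE)
}

set.seed(1234)
df <- data.frame(var = runif(20, min = 0, max = 100))
df$dummy <- dv(df$var, 4)          # store the returned factor as a new column

table(df$dummy)                    # number of rows in each quartile group
tapply(df$var, df$dummy, summary)  # the per-group summary from the question
```

Keeping the tapply() summary outside the function separates creating the variable from inspecting it, so each can be run independently.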
