Loops and creating new variables in R

Time:01-10

I have a dataset that covers multiple years and contains many variables. I'm not giving exact counts because I want a script that runs regardless of how many there are, without copying and pasting a block for every year or variable. Each variable has an inflated counterpart, such as INCOME and INCOME_INFLATED. I want to create a manually inflated version of INCOME (INCOME_MANUAL) and compare it to INCOME_INFLATED.

Essentially, here is an example of my input data:

year  income  income_inflated  CPIU
2000  1500    3000             2
2001  1000    1500             1.5
2002  2000    6000             3

Here is what I would like my output data to look like:

year  income  income_inflated  CPIU  income_manual
2000  1500    3000             2     3000
2001  1000    1500             1.5   1500
2002  2000    6000             3     6000

Here income_manual is income × CPIU, where CPIU is a numeric variable with a unique value for each year. This is very easy for one or two variables, but I can't figure out how to do it for a list of 40 variables without copying and pasting the code for each one.

I can easily create a list of the relevant variables. I just don't know how to write a loop that creates and names the new variables, so that the user can simply supply their data file and run the script.

This code successfully creates new data files filtered by year named "data_[YEAR]". (years is a list of unique values in variable YEAR.)

for (y in years[]) {
  dy <- data %>% filter(YEAR == y)
  assign(paste0("data_", y), dy)
}
remove(dy)

But, when I try to apply the same logic to a variable, it doesn't work. (vars is a list of relevant variables.)

for (v in vars[]) {
  data <- data %>% mutate(x = v * CPIU)
  assign(paste0(v, "_manual"), data$x)
}

It gives me the following error:

Error in `mutate()`:
! Problem while computing `x = v * CPIU`.
Caused by error in `v * CPIU`:
! non-numeric argument to binary operator

I'm fairly used to doing these "creating new objects" operations in bash scripts, but not as much in R, so I'm not sure how to call on that kind of "dictionary". Essentially, how can I get R to understand "v" as the actual variable instead of the variable's name as a character string?

In other words, I want to perform the following operation:

data$income_manual <- data$income * data$CPIU

for many variables without having to copy and paste this line over and over.

Let me know if more detail or background is needed! Thanks so much.

I know there are a lot of similar questions on here, but I can't figure out how to adapt their answers to my own work. I am still relatively new to R, so I apologize for being a bit confused.

CodePudding user response:

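If I understand correctly, you can create all the new columns in one step. Multiplying a block of data frame columns by a vector with one value per row scales every selected column row by row, and the result can be assigned to a set of new column names: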

relevant_vars <- c("income", ...)

data[paste0(relevant_vars, "_manual")] <- data[relevant_vars] * data$CPIU
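
Applied to the example data from the question, the same line might look like the sketch below. The rent and rent_inflated columns are made up here purely to show a second variable, and the comparison step assumes every relevant variable has a counterpart named with the "_inflated" suffix.

```r
# Example data from the question, plus a hypothetical second variable (rent)
# to show that one line handles several columns at once.
data <- data.frame(
  year            = c(2000, 2001, 2002),
  income          = c(1500, 1000, 2000),
  income_inflated = c(3000, 1500, 6000),
  rent            = c(500, 600, 700),     # hypothetical values
  rent_inflated   = c(1000, 900, 2100),   # hypothetical values
  CPIU            = c(2, 1.5, 3)
)

relevant_vars <- c("income", "rent")

# Multiply every selected column by CPIU, row by row, and store the
# results under "<name>_manual".
data[paste0(relevant_vars, "_manual")] <- data[relevant_vars] * data$CPIU

# For each variable, check whether the manual version agrees with the
# supplied inflated version (within all.equal's numeric tolerance).
matches <- sapply(relevant_vars, function(v) {
  isTRUE(all.equal(data[[paste0(v, "_manual")]],
                   data[[paste0(v, "_inflated")]]))
})
matches
```

Column names are handled as strings throughout, so nothing needs to change when relevant_vars grows to 40 entries.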

To demonstrate with mtcars:

relevant_vars <- names(mtcars)
mtcars$CPIU <- runif(nrow(mtcars))
  
mtcars[paste0(relevant_vars, "_manual")] <- mtcars[relevant_vars] * mtcars$CPIU

str(mtcars)
'data.frame':   32 obs. of  24 variables:
 $ mpg        : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl        : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp       : num  160 160 108 258 360 ...
 $ hp         : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat       : num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt         : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec       : num  16.5 17 18.6 19.4 17 ...
 $ vs         : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am         : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear       : num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb       : num  4 4 1 1 2 1 4 2 2 4 ...
 $ CPIU       : num  0.699 0.616 0.111 0.658 0.957 ...
 $ mpg_manual : num  14.68 12.93 2.54 14.09 17.9 ...
 $ cyl_manual : num  4.194 3.695 0.446 3.95 7.658 ...
 $ disp_manual: num  111.8 98.5 12 169.9 344.6 ...
 $ hp_manual  : num  76.9 67.7 10.4 72.4 167.5 ...
 $ drat_manual: num  2.726 2.402 0.429 2.028 3.015 ...
 $ wt_manual  : num  1.831 1.771 0.259 2.117 3.293 ...
 $ qsec_manual: num  11.51 10.48 2.07 12.8 16.29 ...
 $ vs_manual  : num  0 0 0.111 0.658 0 ...
 $ am_manual  : num  0.699 0.616 0.111 0 0 ...
 $ gear_manual: num  2.796 2.464 0.446 1.975 2.872 ...
 $ carb_manual: num  2.796 2.464 0.111 0.658 1.915 ...
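
As for the error in the question's loop: v holds the column name as a character string (for example "income"), so v * CPIU tries to multiply a string by a number. If you would rather stay with dplyr, here is a minimal sketch of two ways to look the column up by name, assuming dplyr 1.0 or later and using the question's example data (the object names data_across and data_loop are just for illustration):

```r
library(dplyr)

# Example data from the question (income and CPIU only).
data <- data.frame(
  year   = c(2000, 2001, 2002),
  income = c(1500, 1000, 2000),
  CPIU   = c(2, 1.5, 3)
)
vars <- c("income")

# Option 1: across() applies the same formula to every column named in vars
# and names each result "<column>_manual".
data_across <- data %>%
  mutate(across(all_of(vars), ~ .x * CPIU, .names = "{.col}_manual"))

# Option 2: keep the explicit loop, but fetch the column by its name with
# .data[[v]] and build the new column name on the left of :=.
data_loop <- data
for (v in vars) {
  data_loop <- data_loop %>%
    mutate("{v}_manual" := .data[[v]] * CPIU)
}

data_across
```

Both options add the new columns to the data frame itself, so there is no need for assign() to create separate objects.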