Using for loop to multiply variables and create product variables

Time:11-28

I need to write a for loop to calculate the product of year variables (e.g. var1874) * price variables (e.g. num1874), creating a new variable for each year and its corresponding price value (e.g. newvar1874).

A tibble: 4 x 7
    cty var1874 var1875 var1876 num1874 num1875 num1876
  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1     1    0.78    0.83    0.99    2.64    2.8     3.1 
2     2    0.69    0.69    0.89    2.3     2.3     2.58
3     3    0.42    0.48    0.59    2.28    2.44    2.64
4     4    0.82    0.94    1.09    2.28    2.36    3   

# Here's the code I have so far...
       
var.num <- dim(mydata)[2]
for(i in 1:length(vars)) {
  mydata[, var.num + i] <- recode(mydata[, vars[i]], "newvar")
  names(mydata)[var.num + i] <- paste0("newvar", vars[i])
} 
 
# This code generates new variables, but they all have NA values.
# I also get this error message: 

Unreplaced values treated as NA as .x is not compatible. 
Please specify replacements exhaustively or supply .default  

# Does anyone have any tips or general suggestions on how to use 
# a for loop to multiply variables and create new product variables?

CodePudding user response:

You could simply subset the two blocks of columns by position and multiply them:

df[2:4]*df[5:7]

  var1874 var1875 var1876
1  2.0592  2.3240  3.0690
2  1.5870  1.5870  2.2962
3  0.9576  1.1712  1.5576
4  1.8696  2.2184  3.2700
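
The result above keeps the var names. To attach the products to the data under the newvar names the question asks for, a minimal sketch could look like the following. It rebuilds the example data from the question, and the positions 2:4 and 5:7 are assumptions that only hold for this exact column layout; `prod_df` and `df_out` are illustrative names.

```r
# Example data copied from the question
df <- data.frame(cty = 1:4,
                 var1874 = c(0.78, 0.69, 0.42, 0.82),
                 var1875 = c(0.83, 0.69, 0.48, 0.94),
                 var1876 = c(0.99, 0.89, 0.59, 1.09),
                 num1874 = c(2.64, 2.30, 2.28, 2.28),
                 num1875 = c(2.80, 2.30, 2.44, 2.36),
                 num1876 = c(3.10, 2.58, 2.64, 3.00))

# Element-wise product of the var block and the num block
prod_df <- df[2:4] * df[5:7]

# Rename var1874 -> newvar1874 etc., then bind onto the original data
names(prod_df) <- sub("^var", "newvar", names(prod_df))
df_out <- cbind(df, prod_df)
```

Here `df_out` holds the seven original columns followed by newvar1874, newvar1875 and newvar1876.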

If you do not know the number of columns, but the data is arranged as given, then you could simply do:

df %>%
  transmute(across(starts_with('var')) * across(starts_with('num')))

  var1874 var1875 var1876
1  2.0592  2.3240  3.0690
2  1.5870  1.5870  2.2962
3  0.9576  1.1712  1.5576
4  1.8696  2.2184  3.2700
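
Both approaches so far pair columns purely by position, so they are only correct when the var and num blocks list the same years in the same order. A quick base R check of that assumption might look like this sketch (the vector `nms` simply repeats the column names from the question, and `aligned` is an illustrative name):

```r
# Column names as they appear in the question's data
nms <- c("cty", "var1874", "var1875", "var1876",
         "num1874", "num1875", "num1876")

var_cols <- grep("^var", nms, value = TRUE)
num_cols <- grep("^num", nms, value = TRUE)

# TRUE only if both blocks list the same years in the same order
aligned <- identical(sub("^var", "", var_cols),
                     sub("^num", "", num_cols))
aligned
#> [1] TRUE
```

With real data, `names(df)` would take the place of `nms`.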

What if the data is disorganized, i.e. you are not sure that the var columns are arranged in the same year order as the num columns? Then reshape so that each var is paired with the num of the same year:

df %>%
  pivot_longer(-cty, names_to = c('.value', 'grp'), 
               names_pattern = '(\\D+)(\\d{4})') %>%
  mutate(newvar = var * num) %>%
  pivot_wider(id_cols = cty, names_from = grp, values_from = newvar,
              names_prefix = 'newvar')

   cty newvar1874 newvar1875 newvar1876
  <int>      <dbl>      <dbl>      <dbl>
1     1      2.06        2.32       3.07
2     2      1.59        1.59       2.30
3     3      0.958       1.17       1.56
4     4      1.87        2.22       3.27
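
The reshape relies on splitting each column name into a non-digit prefix and a four-digit year with a pattern of the form `(\\D+)(\\d{4})`. In `pivot_longer()`, the special name `.value` makes the first captured piece (var or num) the name of an output column, while the second piece (the year) goes into `grp`. As a rough illustration of what the two capture groups pick up, here is a base R sketch using `sub()` on two of the column names (`pattern`, `prefix` and `year` are illustrative names):

```r
nms <- c("var1874", "num1876")
pattern <- "(\\D+)(\\d{4})"

prefix <- sub(pattern, "\\1", nms)  # first group: the non-digit prefix
year   <- sub(pattern, "\\2", nms)  # second group: the four-digit year

prefix
#> [1] "var" "num"
year
#> [1] "1874" "1876"
```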

In base R, the same can be done as:
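
The base R code for this answer is not preserved in this copy. As a stand-in, and not the answerer's original code, here is a minimal sketch that pairs columns by year rather than by position. It uses the question's data with the num columns deliberately shuffled, and it assumes every column is named var or num followed by a four-digit year; `yrs` is an illustrative name.

```r
# Question's data, with the num columns deliberately out of order
df <- data.frame(cty = 1:4,
                 var1874 = c(0.78, 0.69, 0.42, 0.82),
                 var1875 = c(0.83, 0.69, 0.48, 0.94),
                 var1876 = c(0.99, 0.89, 0.59, 1.09),
                 num1876 = c(3.10, 2.58, 2.64, 3.00),
                 num1874 = c(2.64, 2.30, 2.28, 2.28),
                 num1875 = c(2.80, 2.30, 2.44, 2.36))

# Take the years from the var column names
yrs <- sub("^var", "", grep("^var\\d{4}$", names(df), value = TRUE))

# Select both blocks by name, so each var meets the num of the same year
df[paste0("newvar", yrs)] <- df[paste0("var", yrs)] * df[paste0("num", yrs)]
```

Because the columns are selected by name, a year that has a var column but no matching num column stops with an "undefined columns selected" error instead of silently multiplying the wrong pair.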

CodePudding user response:

Using base R we can take advantage of vector multiplication.

# Data --------------------------------------------------------------------

df <- data.frame(cty = 1:4,
                var1874 = c(.78, .69, .42, .82),
                var1875 = c(.83, .69, .48, .94),
                var1876 = c(.99, .89, .59, 1.09),
                num1874 = c(2.64, 2.3, 2.28, 2.28),
                num1875 = c(2.8, 2.3, 2.44, 2.36),
                num1876 = c(3.1, 2.58, 2.64, 3))


# code --------------------------------------------------------------------

nms_var <- paste0(c('var187'), 4:6)
nms_num <- gsub('var', 'num', nms_var)
nms_result <- gsub('var', 'new_var', nms_var)

for (i in seq_along(nms_var)) {
  df[, nms_result[[i]]] <- df[, nms_var[i]] * df[, nms_num[i]]
}

df
#>   cty var1874 var1875 var1876 num1874 num1875 num1876 new_var1874 new_var1875
#> 1   1    0.78    0.83    0.99    2.64    2.80    3.10      2.0592      2.3240
#> 2   2    0.69    0.69    0.89    2.30    2.30    2.58      1.5870      1.5870
#> 3   3    0.42    0.48    0.59    2.28    2.44    2.64      0.9576      1.1712
#> 4   4    0.82    0.94    1.09    2.28    2.36    3.00      1.8696      2.2184
#>   new_var1876
#> 1      3.0690
#> 2      2.2962
#> 3      1.5576
#> 4      3.2700

Created on 2021-11-27 by the reprex package (v2.0.1)

  •  Tags:  
  • r