I need to write a for loop to calculate the product of year variables (e.g. var1874) * price variables (e.g. num1874), creating a new variable for each year and its corresponding price value (e.g. newvar1874).
# A tibble: 4 x 7
    cty var1874 var1875 var1876 num1874 num1875 num1876
  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1     1    0.78    0.83    0.99    2.64    2.8     3.1
2     2    0.69    0.69    0.89    2.3     2.3     2.58
3     3    0.42    0.48    0.59    2.28    2.44    2.64
4     4    0.82    0.94    1.09    2.28    2.36    3
# Here's the code I have so far...
var.num <- dim(mydata)[2]
for(i in 1:length(vars)) {
  mydata[, var.num + i] <- recode(mydata[, vars[i]], "newvar")
  names(mydata)[var.num + i] <- paste0("newvar", vars[i])
}
# This code generates new variables, but they all have NA values.
# I also get this error message:
Unreplaced values treated as NA as .x is not compatible.
Please specify replacements exhaustively or supply .default
# Does anyone have any tips or general suggestions on how to use
# a for loop to multiply variables and create new product variables?
CodePudding user response:
You could simply do the subset then multiply:
df[2:4]*df[5:7]
var1874 var1875 var1876
1 2.0592 2.3240 3.0690
2 1.5870 1.5870 2.2962
3 0.9576 1.1712 1.5576
4 1.8696 2.2184 3.2700
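If you want the products attached to the original data under the newvar names from the question, something along these lines should work. This is only a sketch: it assumes the var columns sit in positions 2:4 and the num columns in 5:7, in the same year order, and the names products and df_out are just placeholders I picked.

```r
# Sample data from the question
df <- data.frame(cty = 1:4,
                 var1874 = c(.78, .69, .42, .82),
                 var1875 = c(.83, .69, .48, .94),
                 var1876 = c(.99, .89, .59, 1.09),
                 num1874 = c(2.64, 2.3, 2.28, 2.28),
                 num1875 = c(2.8, 2.3, 2.44, 2.36),
                 num1876 = c(3.1, 2.58, 2.64, 3))

# Element-wise product of the two column blocks (paired by position)
products <- df[2:4] * df[5:7]

# The result inherits the var* names; rename them to newvar*
names(products) <- sub("^var", "newvar", names(products))

# Append the new columns to the original data
df_out <- cbind(df, products)
```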
If you do not know the number of columns, but the data is arranged as given, then you could simply do:
df %>%
transmute(across(starts_with('var')) * across(starts_with('num')))
var1874 var1875 var1876
1 2.0592 2.3240 3.0690
2 1.5870 1.5870 2.2962
3 0.9576 1.1712 1.5576
4 1.8696 2.2184 3.2700
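Both one-liners above pair the columns by position, so they silently give wrong products if the var and num columns are not in the same year order. A quick way to check that first, as a rough sketch (var_years, num_years and aligned are names I made up for illustration):

```r
# Sample data from the question
df <- data.frame(cty = 1:4,
                 var1874 = c(.78, .69, .42, .82),
                 var1875 = c(.83, .69, .48, .94),
                 var1876 = c(.99, .89, .59, 1.09),
                 num1874 = c(2.64, 2.3, 2.28, 2.28),
                 num1875 = c(2.8, 2.3, 2.44, 2.36),
                 num1876 = c(3.1, 2.58, 2.64, 3))

# Year suffixes of each block, in the order the columns appear
var_years <- sub("^var", "", grep("^var", names(df), value = TRUE))
num_years <- sub("^num", "", grep("^num", names(df), value = TRUE))

# TRUE only if both blocks list the same years in the same order
aligned <- identical(var_years, num_years)
aligned
```

If aligned comes back FALSE, positional multiplication is not safe and the columns should be matched by year instead.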
What if the data is disorganized, i.e. you are not sure that the var columns are arranged in the same year order as the num columns? Then do:
df %>%
  pivot_longer(-cty, names_to = c('.value', 'grp'),
               names_pattern = '(\\D+)(\\d{4})') %>%
  mutate(newvar = var * num) %>%
  pivot_wider(id_cols = cty, names_from = grp, values_from = newvar,
              names_prefix = 'newvar')
cty newvar1874 newvar1875 newvar1876
<int> <dbl> <dbl> <dbl>
1 1 2.06 2.32 3.07
2 2 1.59 1.59 2.30
3 3 0.958 1.17 1.56
4 4 1.87 2.22 3.27
In base R, the same can be done as:
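One possible base R version, as a sketch only and not necessarily what was originally posted here: match each var column to its num column by the year suffix instead of by position. It assumes every varYYYY column has a numYYYY partner; the column shuffle is there just to show that order does not matter, and var_cols, years and num_cols are illustrative names.

```r
# Sample data from the question, with columns deliberately shuffled
df <- data.frame(cty = 1:4,
                 var1874 = c(.78, .69, .42, .82),
                 var1875 = c(.83, .69, .48, .94),
                 var1876 = c(.99, .89, .59, 1.09),
                 num1874 = c(2.64, 2.3, 2.28, 2.28),
                 num1875 = c(2.8, 2.3, 2.44, 2.36),
                 num1876 = c(3.1, 2.58, 2.64, 3))
df <- df[c("cty", "num1876", "var1875", "num1874",
           "var1874", "var1876", "num1875")]

# Find the var columns and derive the matching num column names by year
var_cols <- grep("^var[0-9]{4}$", names(df), value = TRUE)
years    <- sub("^var", "", var_cols)
num_cols <- paste0("num", years)

# Assumption: every year has a price column; stop early if not
stopifnot(all(num_cols %in% names(df)))

# Multiply the matched blocks and add them as newvarYYYY columns
df[paste0("newvar", years)] <- df[var_cols] * df[num_cols]
```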
CodePudding user response:
Using base R we can take advantage of vector multiplication.
# Data --------------------------------------------------------------------
df <- data.frame(cty = 1:4,
var1874 = c(.78, .69, .42, .82),
var1875 = c(.83, .69, .48, .94),
var1876 = c(.99, .89, .59, 1.09),
num1874 = c(2.64, 2.3, 2.28, 2.28),
num1875 = c(2.8, 2.3, 2.44, 2.36),
num1876 = c(3.1, 2.58, 2.64, 3))
# code --------------------------------------------------------------------
nms_var <- paste0(c('var187'), 4:6)
nms_num <- gsub('var', 'num', nms_var)
nms_result <- gsub('var', 'new_var', nms_var)
for (i in seq_along(nms_var)) {
  df[, nms_result[i]] <- df[, nms_var[i]] * df[, nms_num[i]]
}
df
#> cty var1874 var1875 var1876 num1874 num1875 num1876 new_var1874 new_var1875
#> 1 1 0.78 0.83 0.99 2.64 2.80 3.10 2.0592 2.3240
#> 2 2 0.69 0.69 0.89 2.30 2.30 2.58 1.5870 1.5870
#> 3 3 0.42 0.48 0.59 2.28 2.44 2.64 0.9576 1.1712
#> 4 4 0.82 0.94 1.09 2.28 2.36 3.00 1.8696 2.2184
#> new_var1876
#> 1 3.0690
#> 2 2.2962
#> 3 1.5576
#> 4 3.2700
Created on 2021-11-27 by the reprex package (v2.0.1)
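If the years are not known in advance, the column names in the loop can be derived from the data rather than typed in. A minimal sketch, assuming the columns follow the varYYYY / numYYYY pattern from the question and using the newvar prefix the question asked for (var_cols, v and year are illustrative names):

```r
# Sample data from the question
df <- data.frame(cty = 1:4,
                 var1874 = c(.78, .69, .42, .82),
                 var1875 = c(.83, .69, .48, .94),
                 var1876 = c(.99, .89, .59, 1.09),
                 num1874 = c(2.64, 2.3, 2.28, 2.28),
                 num1875 = c(2.8, 2.3, 2.44, 2.36),
                 num1876 = c(3.1, 2.58, 2.64, 3))

# Pick up every var column with a four-digit year suffix
var_cols <- grep("^var[0-9]{4}$", names(df), value = TRUE)

for (v in var_cols) {
  year <- sub("^var", "", v)  # e.g. "1874"
  # Multiply each var column by the num column for the same year
  df[[paste0("newvar", year)]] <- df[[v]] * df[[paste0("num", year)]]
}
```

Looping over the names themselves avoids the index bookkeeping and keeps each product tied to its year.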