Home > Mobile >  Multiply a set of columns in data frame R to another set of columns
Multiply a set of columns in data frame R to another set of columns

Time:09-01

I have a data.frame with 156 variables and I would like to multiply a subset of those variables to another subset of the 156. How can I do that: The variables are of the following form:

ID||quantity_1||quantity_2||...||quantity_156||priceperunit_q1||...priceperunit||q156

essentially I would like to multiply each quantity by its priceperunit

CodePudding user response:

Given you first have all the quantity columns and then all the priceperunit columns in the correct order, you can make two different data sets and multiply them using mapply, like this. The result set will contain the product of the two columns, rename this if needed:

quantity_1 <- c(1, 2, 3)
quantity_2 <- c(1, 2, 3)
quantity_3 <- c(1, 2, 3)
quantity_40 <- c(1, 2, 3)
priceperunit_1 <- c(20, 20, 20)
priceperunit_2 <- c(30, 30, 30)
priceperunit_3 <- c(15, 15, 15)
priceperunit_40 <- c(1.25, 1.25, 1.65)

df <- data.frame(quantity_1, quantity_2, quantity_3, quantity_40, 
                 priceperunit_1, priceperunit_2, priceperunit_3, priceperunit_40)
df
#>   quantity_1 quantity_2 quantity_3 quantity_40 priceperunit_1 priceperunit_2
#> 1          1          1          1           1             20             30
#> 2          2          2          2           2             20             30
#> 3          3          3          3           3             20             30
#>   priceperunit_3 priceperunit_40
#> 1             15            1.25
#> 2             15            1.25
#> 3             15            1.65

as.data.frame(mapply("*", 
                     df[, grep("^quantity", names(df))], 
                     df[, grep("^priceperunit", names(df))]))
#>   quantity_1 quantity_2 quantity_3 quantity_40
#> 1         20         30         15        1.25
#> 2         40         60         30        2.50
#> 3         60         90         45        4.95

CodePudding user response:

A simple solution would be to reshape your data from wide to long format, see ?reshape. This will also help you keeping your original data format.

The trick is to store all your price and all your quantity data into the same 2 columns, but records will be differentiated thanks to the newly created time variable of reshape (1 to 156 according to your post).

You can then compute the volume (price x quantity), and go back to the wide format if desired.

See the reproducible example below.

df <- data.frame(price1 = 1:10,
                 price2 = 11:20,
                 quantity1 = c(1:5, 1:5),
                 quantity2 = c(5:9, 5:9))

df_long <- reshape(df,
                   direction = "long",
                   v.names = c("price", "quantity"),
                   varying = list(paste0("price", 1:2), paste0("quantity", 1:2)),
                   times = c("1", "2"))

df_long$volume <- df_long$price*df_long$quantity

df_wide <- reshape(df_long,
                   direction = "wide",
                   v.names = c("price", "quantity", "volume"),
                   timevar = "time",
                   sep = "")
  •  Tags:  
  • r
  • Related