Most efficient way to mutate multiple new columns at once using the same formula but different exist

Time:02-25

Say I have this dataframe 'mydata':

mydata <- data.frame(x = 1:5, y = -2:2, z=8:12, a = 7)

mydata
  x  y  z a
1 1 -2  8 7
2 2 -1  9 7
3 3  0 10 7
4 4  1 11 7
5 5  2 12 7

I want to create a new column that subtracts column 'a' from column 'x', and then do the same for columns 'y' and 'z'. If possible, I would then like columns x, y and z to be removed from the data frame. This would be the ideal resulting data frame:

  a new_x new_y new_z
1 7    -6    -9     1
2 7    -5    -8     2
3 7    -4    -7     3
4 7    -3    -6     4
5 7    -2    -5     5

I have about 30 columns to process this way and about 10,000 rows. I am currently using mutate for each new column, like this:

mydata <- mydata %>% mutate(new_x = x-a, new_y = y-a, new_z = z-a)

Is there an efficient way to accomplish this (preferably via dplyr) that isn't so repetitive?

Thank you!!

Answer:

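We can use across() to loop over the columns and name the new columns with .names, adding the prefix new_. By default, mutate() returns the original columns as well; .keep = 'unused' drops the columns that were used to build the new ones. Because a is also used in the formula, it is listed explicitly in mutate() so that it is retained.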

library(dplyr)
mydata %>%
  mutate(a, across(x:z, ~ .x - a, .names = 'new_{.col}'), .keep = 'unused')

Output:

  a new_x new_y new_z
1 7    -6    -9     1
2 7    -5    -8     2
3 7    -4    -7     3
4 7    -3    -6     4
5 7    -2    -5     5
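
With around 30 columns, a contiguous range like x:z may not be available. As a rough sketch of one alternative (not the only way to do it), the columns could be listed in a character vector and dropped afterwards with select() instead of relying on .keep. The vector name cols and the object name result are made up for illustration; in real data cols would hold the actual column names.

```r
library(dplyr)

mydata <- data.frame(x = 1:5, y = -2:2, z = 8:12, a = 7)

# Hypothetical vector naming the columns to transform
cols <- c("x", "y", "z")

result <- mydata %>%
  # Subtract a from each listed column; new columns get the prefix new_
  mutate(across(all_of(cols), ~ .x - a, .names = "new_{.col}")) %>%
  # Drop the original columns explicitly
  select(-all_of(cols))

result
```

This should give the same four columns (a, new_x, new_y, new_z) as the .keep = 'unused' version; the trade-off is one extra step in exchange for not depending on how mutate() decides which columns count as used.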