How to create a function to do an operation on multiple columns to create new columns in R

Time:10-05

I'm trying to apply a simple operation to multiple columns in a dataset.

I tried using a function like the one explained in "Applying function on multiple columns to create multiple new columns", but I haven't been able to get it to work.

Here is the equation, where wt stands for each of the columns wt_jan, wt_feb, ..., wt_dec:


ne_maintain = 0.386*wt^0.75

I tried using mutate() but have not been able to figure out how to repeat my calculation across all of the columns.

Here is what the dataset looks like:

lactation wt_jan wt_feb wt_mar wt_apr wt_may wt_jun wt_jul wt_aug wt_sep wt_oct wt_nov wt_dec
1         1  600.0  612.5  625.0  637.5  643.8  650.0  656.3  662.5  668.8  675.0  681.3  687.5
2         2  693.8  700.0  706.3  712.5  715.6  718.8  721.9  725.0  728.1  731.3  734.4  737.5
3         3  740.6  743.8  746.9  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0
4         4  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0
5         5  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0
6         6  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0  750.0

What I would like in the end is a new set of columns named ne_maintenance_jan, ne_maintenance_feb, and so on through ne_maintenance_dec. I have about 10 calculations to do in total, so I'm hoping the same approach will work for all of them.

CodePudding user response:

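One option is to use dplyr, tidyr and stringr, as follows. Note that this reshapes the data to long format and returns the mean of the calculation for each month, not one value per row: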

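library(dplyr)
library(tidyr)    # gather()
library(stringr)  # str_replace()

# data_frame is the dataset shown in the question
data_frame %>% 
  gather(key = months, value = wt, -lactation) %>%   # wide to long: one row per lactation and month
  select(-lactation) %>% 
  filter(!is.na(wt)) %>%
  group_by(months) %>% 
  mutate(
    months = str_replace(months, "wt", "ne_maintenance"),
    wt = mean(0.386*wt^0.75)) %>%                     # mean over all rows within each month
  distinct(months, .keep_all = TRUE)                  # keep one row per month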

OUTPUT

   months                wt
   <chr>              <dbl>
 1 ne_maintenance_jan  53.3
 2 ne_maintenance_feb  53.5
 3 ne_maintenance_mar  53.7
 4 ne_maintenance_apr  53.9
 5 ne_maintenance_may  54.0
 6 ne_maintenance_jun  54.1
 7 ne_maintenance_jul  54.2
 8 ne_maintenance_aug  54.3
 9 ne_maintenance_sep  54.4
10 ne_maintenance_oct  54.4
11 ne_maintenance_nov  54.5
12 ne_maintenance_dec  54.6
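
If one value per row is needed instead of the monthly mean, the same long-format idea can be sketched as below. This is only a sketch under some assumptions: it uses tidyr's pivot_longer() and pivot_wider() (available from tidyr 1.0.0) in place of gather(), and the two-row, two-month data frame is a cut-down copy of the data in the question, used purely for illustration.

```r
library(dplyr)
library(tidyr)

# Cut-down copy of the question's data (assumption: two rows, two months)
df <- data.frame(
  lactation = 1:2,
  wt_jan = c(600.0, 693.8),
  wt_feb = c(612.5, 700.0)
)

df_long <- df %>%
  # one row per lactation and month; "wt_jan" becomes month = "jan"
  pivot_longer(starts_with("wt_"),
               names_to = "month",
               names_prefix = "wt_",
               values_to = "wt") %>%
  # the formula from the question, applied to every row
  mutate(ne_maintenance = 0.386 * wt^0.75)

# Back to wide format: wt_jan, wt_feb, ne_maintenance_jan, ne_maintenance_feb
df_wide <- df_long %>%
  pivot_wider(names_from = month, values_from = c(wt, ne_maintenance))

names(df_wide)
```

Further calculations can be added as extra columns in the mutate() step and listed in values_from, so each of them is spread back out by month in the same way.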

CodePudding user response:

If I understand correctly, you want to append the newly calculated columns to your current table. You could try using mutate() with across(), like this:

library(tidyverse)

# a small data frame shaped like yours
lactation <- 1:6
wt_jan <- runif(6, min = 600, max = 700)
wt_feb <- runif(6, min = 600, max = 800)
wt_mar <- runif(6, min = 800, max = 900)
df <- data.frame(lactation, wt_jan, wt_feb, wt_mar)


df_new <- df %>%
  mutate(across(starts_with("wt_"),  # select the wt_* columns
                ~ 0.386 * .x^0.75,   # anonymous function; only wt is raised to the power 0.75
                .names = "{sub('wt', 'ne_maintenance', .col)}")) # names of the new columns
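
Since the question mentions about 10 calculations, the across() call can also take a named list of functions. The following is a minimal sketch of that idea, not a definitive implementation: ne_maintenance is the formula from the question, while half_wt is a made-up placeholder for the other calculations, and the small data frame is a cut-down copy of the question's data.

```r
library(dplyr)

# Cut-down copy of the question's data (assumption: two rows, three months)
df <- data.frame(
  lactation = 1:2,
  wt_jan = c(600.0, 693.8),
  wt_feb = c(612.5, 700.0),
  wt_mar = c(625.0, 706.3)
)

# Named list of calculations; each name becomes the prefix of the new columns.
# half_wt is a hypothetical placeholder, not a formula from the question.
calcs <- list(
  ne_maintenance = function(wt) 0.386 * wt^0.75,
  half_wt        = function(wt) wt / 2
)

df_new <- df %>%
  mutate(across(
    starts_with("wt_"),
    calcs,
    # {.fn} is the list name and {.col} the source column,
    # so wt_jan gives ne_maintenance_jan and half_wt_jan
    .names = "{.fn}_{sub('wt_', '', .col)}"
  ))

names(df_new)
```

Adding another calculation then only means adding another named function to calcs; the original wt_* columns are left untouched.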