Using apply functions to perform different operations for each column in a data frame

Time:12-13

I have the following two data frames.

a <- c(3,2,6,7,5)
b <- c(2,5,7,8,1)
d <- c(3,6,2,1,6)

df <- data.frame(a, b, d)

a1 <- c("a", "H1")
b1 <- c("b", "H2")
d1 <- c("d", "H1")

df_2 <- data.frame(a1, b1, d1)

Conveniently, the column names in df match row 1 of df_2, column by column. I want to use the df_2 data frame to alter the columns in df. For this example I just want to multiply each column by a different factor. If the column header in df matches H1 in df_2, I want to multiply that column by 2. If the column header in df matches H2 in df_2, I want to multiply that column by 3. I can do this one column at a time with the following code.

#How to change column 1
df[,1] <- if (df_2[2,1] == "H1") {
  df[,1]*2 
} else if (df_2[2,1] == "H2") {
  df[,1]*3
}

#How to change column 2
df[,2] <- if (df_2[2,2] == "H1") {
  df[,2]*2 
} else if (df_2[2,2] == "H2") {
  df[,2]*3
}


#How to change column 3
df[,3] <- if (df_2[2,3] == "H1") {
  df[,3]*2 
} else if (df_2[2,3] == "H2") {
  df[,3]*3
}

How can I use apply functions (preferred) or a for loop to do these calculations on all columns at once? I'm also open to other, more elegant solutions.

CodePudding user response:

One approach: use janitor::row_to_names to promote the first row of 'df_2' to column names, so that each column can be looked up by the matching column name in 'df'. Then loop across the columns of 'df' that are named in 'df_2_new' and multiply each by 2 or 3, depending on the value stored in the corresponding column of 'df_2_new'.

library(dplyr)
library(janitor)
df_2_new <- row_to_names(df_2, 1)

df %>%
  mutate(across(all_of(names(df_2_new)),
    ~ case_when(df_2_new[[cur_column()]] == "H1" ~ .x * 2,
                df_2_new[[cur_column()]] == "H2" ~ .x * 3,
                TRUE ~ .x)))

-output

  a  b  d
1  6  6  6
2  4 15 12
3 12 21  4
4 14 24  2
5 10  3 12

CodePudding user response:

In base R you could use mapply (note that it returns a matrix, not a data frame):

fun <- function(x, y) switch(x, H1 = y * 2, H2 = y * 3)

mapply(fun, setNames(df_2[2, ], df_2[1, ])[names(df)], df)

      a  b  d
[1,]  6  6  6
[2,]  4 15 12
[3,] 12 21  4
[4,] 14 24  2
[5,] 10  3 12
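
If the result should stay a data frame rather than become a matrix, the same idea can be sketched with Map and a lookup vector. This is only a minimal sketch under a few assumptions: the names `factors`, `labels`, and `result` are made up for illustration, and every label in row 2 of df_2 is assumed to be either "H1" or "H2".

```r
# Rebuild the example data from the question.
df <- data.frame(a = c(3, 2, 6, 7, 5),
                 b = c(2, 5, 7, 8, 1),
                 d = c(3, 6, 2, 1, 6))
df_2 <- data.frame(a1 = c("a", "H1"),
                   b1 = c("b", "H2"),
                   d1 = c("d", "H1"),
                   stringsAsFactors = FALSE)

# Hypothetical lookup vector: the multiplier for each label.
factors <- c(H1 = 2, H2 = 3)

# Named character vector mapping column name -> label,
# i.e. c(a = "H1", b = "H2", d = "H1").
labels <- setNames(unlist(df_2[2, ], use.names = FALSE),
                   unlist(df_2[1, ], use.names = FALSE))

# Multiply each column by its factor. Assigning into result[]
# replaces the columns but keeps the data.frame class.
result <- df
result[] <- Map(function(col, lab) col * factors[[lab]],
                df, labels[names(df)])
result
```

Because `factors[[lab]]` is used for the lookup, a label that is not in `factors` stops with an error instead of silently dropping or blanking a column. Supporting a new label then only requires adding an entry to `factors`.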