Divide a group of n columns using the first n columns in R

Time:02-05

I have a data frame like this:

df <- data.frame(a = c(1,2,3,4), 
                 b = c(5,6,7,8), 
                 c = c(22,33,44,55), 
                 d = c(7,8,9,10), 
                 e = c(2,3,4,5), 
                 f = c(99,88,66,44))
df
  a b  c  d e  f
1 1 5 22  7 2 99
2 2 6 33  8 3 88
3 3 7 44  9 4 66
4 4 8 55 10 5 44

I want to divide each successive group of 2 columns (or, more generally, n columns) by the first 2 (or n) columns:

df$c <- df$c / df$a
df$d <- df$d / df$b
df$e <- df$e / df$a
df$f <- df$f / df$b
df$a <- df$a / df$a
df$b <- df$b / df$b

resulting in:


df
  a b        c        d        e         f
1 1 1 22.00000 1.400000 2.000000 19.800000
2 1 1 16.50000 1.333333 1.500000 14.666667
3 1 1 14.66667 1.285714 1.333333  9.428571
4 1 1 13.75000 1.250000 1.250000  5.500000

Is there a way to do this more simply and/or with dplyr?

CodePudding user response:

In base R, we could replicate the first n columns out to the full width of the data frame and divide element-wise

n <- 2
df <- df/df[seq_len(n)][rep(seq_len(n), ncol(df)/n)]

-output

> df
  a b        c        d        e         f
1 1 1 22.00000 1.400000 2.000000 19.800000
2 1 1 16.50000 1.333333 1.500000 14.666667
3 1 1 14.66667 1.285714 1.333333  9.428571
4 1 1 13.75000 1.250000 1.250000  5.500000
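To make the indexing step easier to follow, here is a minimal sketch that builds the recycled column index separately and wraps the division in a helper. The name divide_by_first_n is hypothetical (it is not part of base R), and the divisibility check is an added assumption about how mismatched widths should be handled.

```r
df <- data.frame(a = c(1, 2, 3, 4),
                 b = c(5, 6, 7, 8),
                 c = c(22, 33, 44, 55),
                 d = c(7, 8, 9, 10),
                 e = c(2, 3, 4, 5),
                 f = c(99, 88, 66, 44))
n <- 2

# Positions of the divisor columns, recycled across the full width of df
idx <- rep(seq_len(n), ncol(df) / n)
idx
# 1 2 1 2 1 2

# Hypothetical helper: divide every block of n columns by the first n columns
divide_by_first_n <- function(data, n) {
  # Assumption: refuse to run when the width is not a multiple of n
  stopifnot(n >= 1, ncol(data) %% n == 0)
  idx <- rep(seq_len(n), ncol(data) / n)
  # data[seq_len(n)][idx] has the same shape as data, so "/" works element-wise
  data / data[seq_len(n)][idx]
}

res <- divide_by_first_n(df, n)
```

Because the divisor is built from the original df before any assignment, the first n columns end up as 1 without affecting how the later columns are divided.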

Or with dplyr, use across to loop over alternate columns, dividing the odd-positioned columns by a and the even-positioned columns by b (this hardcodes n = 2)

library(dplyr)
df %>%
   mutate(across(seq(1, ncol(.), 2), ~ .x/a),
      across(seq(2, ncol(.), 2), ~ .x/b))

-output

  a b        c        d        e         f
1 1 1 22.00000 1.400000 2.000000 19.800000
2 1 1 16.50000 1.333333 1.500000 14.666667
3 1 1 14.66667 1.285714 1.333333  9.428571
4 1 1 13.75000 1.250000 1.250000  5.500000

Or use a single across with an if/else condition on the column position

df %>%
   mutate(across(everything(),
    ~ if(match(cur_column(), names(df)) %% 2 == 0) .x/b else .x/a))

-output

  a b        c        d        e         f
1 1 1 22.00000 1.400000 2.000000 19.800000
2 1 1 16.50000 1.333333 1.500000 14.666667
3 1 1 14.66667 1.285714 1.333333  9.428571
4 1 1 13.75000 1.250000 1.250000  5.500000
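The two dplyr snippets above hardcode n = 2 through the column names a and b. The same position-based idea can be generalised with modular arithmetic: the column at position j is divided by the column at position ((j - 1) %% n) + 1. The following is a base R sketch of that mapping rather than a dplyr solution, and divide_by_position is a hypothetical name used only for illustration.

```r
df <- data.frame(a = c(1, 2, 3, 4),
                 b = c(5, 6, 7, 8),
                 c = c(22, 33, 44, 55),
                 d = c(7, 8, 9, 10),
                 e = c(2, 3, 4, 5),
                 f = c(99, 88, 66, 44))

# Hypothetical helper: divide column j by column ((j - 1) %% n) + 1
divide_by_position <- function(data, n) {
  # For n = 2 and six columns this gives 1 2 1 2 1 2
  divisor_pos <- (seq_along(data) - 1) %% n + 1
  out <- data
  # Always read divisors from the untouched input, never from 'out'
  out[] <- Map(function(col, pos) col / data[[pos]], data, divisor_pos)
  out
}

res2 <- divide_by_position(df, 2)  # c/a, d/b, e/a, f/b
res3 <- divide_by_position(df, 3)  # d/a, e/b, f/c
```

With n = 3 the first three columns become 1 and d, e, f are divided by a, b, c respectively.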

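Another approach is to split the data into chunks of n columns and divide each chunk by the first n columns (this reuses n from above and dplyr's select; list_cbind requires purrr 1.0.0 or later)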

library(purrr)
library(magrittr)
df %>% 
   split.default(as.integer(gl(ncol(.), n, ncol(.)))) %>% 
   unname %>% 
   map(~ df %>% 
       select(seq_len(n)) %>% 
       divide_by(.x, .)) %>%
   list_cbind()

-output

  a b        c        d        e         f
1 1 1 22.00000 1.400000 2.000000 19.800000
2 1 1 16.50000 1.333333 1.500000 14.666667
3 1 1 14.66667 1.285714 1.333333  9.428571
4 1 1 13.75000 1.250000 1.250000  5.500000
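For readers who want to see what the chunking step produces, here is a base R sketch of the same split-and-divide idea without purrr or magrittr. It is an illustration of the technique, not a replacement for the pipeline above; the variable names grp and chunks are chosen here for clarity.

```r
df <- data.frame(a = c(1, 2, 3, 4),
                 b = c(5, 6, 7, 8),
                 c = c(22, 33, 44, 55),
                 d = c(7, 8, 9, 10),
                 e = c(2, 3, 4, 5),
                 f = c(99, 88, 66, 44))
n <- 2

# One group label per column: consecutive runs of n columns share a label
grp <- as.integer(gl(ncol(df), n, ncol(df)))
grp
# 1 1 2 2 3 3

# split.default splits the data frame by columns (as a list), not by rows;
# unname drops the "1", "2", "3" labels so cbind does not prefix column names
chunks <- unname(split.default(df, grp))

# Divide each n-column chunk by the first n columns, then bind them back
res <- do.call(cbind, lapply(chunks, function(chunk) chunk / df[seq_len(n)]))
```

Each element of chunks is an n-column data frame, so the division inside lapply is an element-wise operation between two data frames of the same shape.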