How to mutate across a dataframe using dimensions instead of column names?

Time:05-04

The code further below calculates and prints, for each element in the data frame named data, its proportion of the row total:

Output:

# A tibble: 4 x 5
# Rowwise: 
  To        A     B     C   Sum
  <chr> <dbl> <dbl> <dbl> <dbl>
1 A     0.111 0.333 0.556     1
2 B     0.167 0.333 0.5       1
3 C     0.25  0.312 0.438     1
4 Sum   0.189 0.324 0.486     1

Code generating the above:

library(dplyr)  # supplies %>%, bind_rows(), summarise_all(), rowwise(), mutate(), across()

data <- 
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  )

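# Replace any NA with 0, then append a totals row labelled "Sum"
data <- data %>% 
  replace(is.na(.), 0) %>%
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))

# Append a row-total column computed over every column except the first
data <- cbind(data, Sum = rowSums(data[,-1]))

# Divide each value by the total of its row
data %>% 
  rowwise() %>%
  mutate(across(A:Sum, ~ sum(.) / Sum))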

The mutate(across(...)) call above names column A as the start of the column range, which works here. However, in the larger app this is intended for, the column names are dynamic, so I'd like to start from the first numeric column by position instead of by name. Below is my attempt to do this:

data %>% 
  rowwise() %>%
  mutate(across(-1:Sum, ~ sum(.) / Sum))

It calculates correctly but gives the warning shown below. Is there a better way to do this than wrapping the call in suppressWarnings(), tempting as that is?

Warning message:
Problem with mutate() input ..1.
i ..1 = across(-1:Sum, ~sum(.)/Sum).
i numerical expression has 4 elements: only the first used
i The warning occurred in row 1.

CodePudding user response:

You can use a predicate function wrapped in where(), which selects every numeric column regardless of its name:

data %>% 
  rowwise() %>%
  mutate(across(where(is.numeric), ~ sum(.) / Sum))
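
For reference, here is a minimal self-contained sketch of the where() approach. It assumes the same layout as in the question: one label column followed by numeric columns, with the Sum column last. The totals row is left out for brevity, and writing .x / Sum instead of sum(.) / Sum is my own simplification: under rowwise() each value is a single number, so sum(.) returns it unchanged.

```r
library(dplyr)

# Same shape as the question's data: a label column, then numeric columns
data <- data.frame(
  To = c("A", "B", "C"),
  A = c(1, 2, 4),
  B = c(3, 4, 5),
  C = c(5, 6, 7)
)

# Row totals over everything except the first (label) column
data$Sum <- rowSums(data[, -1])

# Select columns by type rather than by name, then divide by the row total
result <- data %>%
  rowwise() %>%
  mutate(across(where(is.numeric), ~ .x / Sum)) %>%
  ungroup()

print(result)
```

Because the selection is by type, this should keep working when the numeric columns are renamed, as long as the label column stays non-numeric and the total column is still called Sum.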

You could also just deselect the first column, using either its index or its name, as in these examples:

data %>% 
  rowwise() %>%
  mutate(across(-1, ~ sum(.) / Sum))

data %>% 
  rowwise() %>%
  mutate(across(-To, ~ sum(.) / Sum))
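
As for the warning in the question: R applies unary minus before the : operator, so -1:Sum is read as (-1):Sum rather than -(1:Sum). That is probably what trips up the column selection, although I have not traced exactly how across() handles it. A small base R sketch of the precedence, using plain integers rather than column selection:

```r
# Unary minus binds more tightly than `:`, so this is (-1):3
a <- -1:3
print(a)  # -1 0 1 2 3

# Parentheses are needed to negate the whole sequence instead
b <- -(1:3)
print(b)  # -1 -2 -3
```

Dropping the range altogether, as in across(-1, ...) or across(-To, ...), sidesteps the ambiguity.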