Home > front end >  How to mutate across a dataframe using dynamic dimension references?
How to mutate across a dataframe using dynamic dimension references?

Time:05-18

Suppose we start with the data dataframe shown below, with the generating code immediately beneath:

> data
   To A  B  C
1   A 1  3  5
2   B 2  4  6
3   C 4  5  7
4 Sum 7 12 18
   
library(dplyr)
data <- 
   data.frame(
     To = c("A","B","C"),
     A = c(1,2,4),
     B = c(3,4,5),
     C = c(5,6,7)
   ) %>% 
   bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))

I've been running the following bit of code to change the values in a dataframe much like it into percentages (each cell calculated as a % of its respective column "Sum"):

data %>% mutate(across(-1, ~ ./.[To == "Sum"]))

Giving the correct output of :

> data 
   To         A         B         C
1   A 0.1428571 0.2500000 0.2777778
2   B 0.2857143 0.3333333 0.3333333
3   C 0.5714286 0.4166667 0.3888889
4 Sum 1.0000000 1.0000000 1.0000000

How would the divisor of [To == "Sum"] in the mutate(across...)) be swapped with a dynamic (perhaps numeric) reference to the last row of the dataframe instead? In the fuller code this is deployed in the dataframe dynamically and reactively expands/contracts based on user input into Shiny, and column header names also change dynamically so the fixed reference to the "To" header name in this example isn't optimal.

A brief explanation would be helpful too, so I don't have to keep coming to the trough for solutions.

CodePudding user response:

We can use last function to return the last value of the column. Also, there is another option with nth where we can get the value of the column based on the index which is more general i.e. nth(., 2) returns the 2nd value

library(dplyr)
data %>% 
   mutate(across(-1, ~ ./last(.)))

-output

 To         A         B         C
1   A 0.1428571 0.2500000 0.2777778
2   B 0.2857143 0.3333333 0.3333333
3   C 0.5714286 0.4166667 0.3888889
4 Sum 1.0000000 1.0000000 1.0000000

CodePudding user response:

Try this: It is more or less what akrun already is presenting:

library(dplyr)

data <- 
  data.frame(
    To = c("A","B","C"),
    A = c(1,2,4),
    B = c(3,4,5),
    C = c(5,6,7)
  ) %>% 
  mutate(across(-1, ~ ./sum(.))) %>% 
  bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
   To         A         B         C
1   A 0.1428571 0.2500000 0.2777778
2   B 0.2857143 0.3333333 0.3333333
3   C 0.5714286 0.4166667 0.3888889
4 Sum 1.0000000 1.0000000 1.0000000

CodePudding user response:

in Base R you could do something like:

addmargins(prop.table(as.matrix(data.frame(data, row.names = 1)),2),1)

            A         B         C
A   0.1428571 0.2500000 0.2777778
B   0.2857143 0.3333333 0.3333333
C   0.5714286 0.4166667 0.3888889
Sum 1.0000000 1.0000000 1.0000000
  • Related