Let's say I have this data frame.
I want to divide the numbers in each column by the total stored in the last row of that column. I thought I could do it like this, but I get the following error: undefined columns selected.
df[4] %>% mutate_if(is.numeric, ~ . / total)
In the data I am working with, I have 454 columns, so specifying them all by name would be impractical.
CodePudding user response:
I advise against including margin totals in raw data. As you found out, it makes things unnecessarily complicated.
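As a rough sketch of what keeping the totals out of the raw data could look like: the block below splits the margin row off into its own object and divides by it. The names `body`, `totals` and `scaled` are only illustrative, and the data is the sample data given further down in this answer.

```r
library(dplyr)

# Sample data from this answer, repeated so the sketch runs on its own
df <- read.table(text = "     a b c
1    1a 3 2
2    2a 6 4
3    3a 3 6
4 total 7 8", header = TRUE)

# Keep the data rows and the margin row in separate objects
body   <- filter(df, a != "total")
totals <- filter(df, a == "total")

# Divide every numeric column by the matching entry of `totals`;
# cur_column() gives the name of the column currently being processed
scaled <- body %>%
  mutate(across(where(is.numeric), ~ .x / totals[[cur_column()]]))
```

With the totals stored separately, the transformation no longer has to special-case one row, and `scaled` contains only the three data rows.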
That aside, here is an option:
library(dplyr)

df %>%
  mutate(across(b:c, ~ replace(.x, a != "total", .x[a != "total"] / last(.x))))
#       a         b    c
# 1    1a 0.4285714 0.25
# 2    2a 0.8571429 0.50
# 3    3a 0.4285714 0.75
# 4 total 7.0000000 8.00
This assumes that totals are always in the last row (i.e. the total is the last entry in a column vector).
You can replace across(b:c, ...) with across(where(is.numeric), ...) if preferable.
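For a wide table like the 454-column one mentioned in the question, the `where(is.numeric)` variant might look like the sketch below. The data frame `wide` and its column names `V1` ... `V454` are made up for illustration. Here the `total` row is computed with `colSums()`, which is an assumption about how such a row would be produced.

```r
library(dplyr)

# Hypothetical wide data: 3 data rows and 454 numeric columns (V1 ... V454)
n_cols <- 454
vals <- as.data.frame(matrix(seq_len(3 * n_cols), nrow = 3))

# Append a "total" row holding the column sums
wide <- rbind(
  data.frame(a = c("1a", "2a", "3a"), vals),
  data.frame(a = "total", t(colSums(vals)))
)

# Same approach as above, but selecting columns by type instead of by name;
# the totals are still assumed to sit in the last row
result <- wide %>%
  mutate(across(where(is.numeric),
                ~ replace(.x, a != "total", .x[a != "total"] / last(.x))))
```

Because the selection is by type, no column names have to be listed, and the character column `a` is left untouched.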
Sample data
df <- read.table(text = "     a b c
1    1a 3 2
2    2a 6 4
3    3a 3 6
4 total 7 8", header = TRUE)
CodePudding user response:
Does this work?
df %>%
  mutate(across(b:c, ~ case_when(a != "total" ~ . / .[a == "total"],
                                 # as.numeric() keeps both branches the same type
                                 TRUE ~ as.numeric(.))))
      a         b    c
1    1a 0.4285714 0.25
2    2a 0.8571429 0.50
3    3a 0.4285714 0.75
4 total 7.0000000 8.00