I am trying to get both the sum across rows and the max value in a row. I obviously do not want the rowsum column to be included in the max values, nor do i want the max values included in the row sum. I need a final dataset that has both of these columns retained however.
Using dplyr
I tried-
iris<- iris %>%
mutate(readsum = rowSums(across(where(is.numeric)), na.rm=TRUE))
iris_max<- iris %>%
rowwise()%>%
select(-"readsum")%>%
mutate(readmax = max(across(where(is.numeric)), na.rm=TRUE))
but this just removed readsum
from the new df
I would like to get as output:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species readsum readmax
<dbl> <dbl> <dbl> <dbl> <fct> <dbl> <dbl>
1 5.1 3.5 1.4 0.2 setosa 10.2 5.1
2 4.9 3 1.4 0.2 setosa 9.5 4.9
3 4.7 3.2 1.3 0.2 setosa 9.4 4.7
4 4.6 3.1 1.5 0.2 setosa 9.4 4.6
5 5 3.6 1.4 0.2 setosa 10.2 5
6 5.4 3.9 1.7 0.4 setosa 11.4 5.4
CodePudding user response:
Use c_across
and wrap where
around is.numeric
.
A way to keep the new column readsum
in the final result is to first create an index to the columns that already are numeric. Then create readsum
.
suppressPackageStartupMessages(
library(dplyr)
)
data(iris, package = "datasets")
i_num <- iris %>%
sapply(is.numeric) %>%
which()
iris <- iris %>%
mutate(readsum = rowSums(across(where(is.numeric)), na.rm=TRUE))
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species readsum
#> 1 5.1 3.5 1.4 0.2 setosa 10.2
#> 2 4.9 3.0 1.4 0.2 setosa 9.5
#> 3 4.7 3.2 1.3 0.2 setosa 9.4
#> 4 4.6 3.1 1.5 0.2 setosa 9.4
#> 5 5.0 3.6 1.4 0.2 setosa 10.2
#> 6 5.4 3.9 1.7 0.4 setosa 11.4
iris %>%
rowwise() %>%
mutate(readmax = max(c_across(all_of(i_num))))
#> # A tibble: 150 × 7
#> # Rowwise:
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species readsum readmax
#> <dbl> <dbl> <dbl> <dbl> <fct> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2 setosa 10.2 5.1
#> 2 4.9 3 1.4 0.2 setosa 9.5 4.9
#> 3 4.7 3.2 1.3 0.2 setosa 9.4 4.7
#> 4 4.6 3.1 1.5 0.2 setosa 9.4 4.6
#> 5 5 3.6 1.4 0.2 setosa 10.2 5
#> 6 5.4 3.9 1.7 0.4 setosa 11.4 5.4
#> 7 4.6 3.4 1.4 0.3 setosa 9.7 4.6
#> 8 5 3.4 1.5 0.2 setosa 10.1 5
#> 9 4.4 2.9 1.4 0.2 setosa 8.9 4.4
#> 10 4.9 3.1 1.5 0.1 setosa 9.6 4.9
#> # … with 140 more rows
Created on 2022-12-19 with reprex v2.0.2
CodePudding user response:
Here is another approach using reduce
from purrr
and pmax
, using select
to only include columns from the original data.frame:
library(tidyverse)
iris %>%
mutate(readsum = rowSums(across(where(is.numeric)), na.rm = T),
readmax = select(., where(is.numeric)) %>% reduce(pmax, na.rm = T))
Output
Sepal.Length Sepal.Width Petal.Length Petal.Width Species readsum readmax
1 5.1 3.5 1.4 0.2 setosa 10.2 5.1
2 4.9 3.0 1.4 0.2 setosa 9.5 4.9
3 4.7 3.2 1.3 0.2 setosa 9.4 4.7
4 4.6 3.1 1.5 0.2 setosa 9.4 4.6
5 5.0 3.6 1.4 0.2 setosa 10.2 5.0
6 5.4 3.9 1.7 0.4 setosa 11.4 5.4
7 4.6 3.4 1.4 0.3 setosa 9.7 4.6
8 5.0 3.4 1.5 0.2 setosa 10.1 5.0
...