I have a data frame with three columns, two of which are list columns whose elements can be either a single numeric value or a vector of several values. I would like to add columns containing the min / max values of each row for these two columns. For example, my data frame may look like this:
df <- structure(list(ID = c(1L, 2L, 3L), A = structure(list(
5, c(0.5, 0.6), 2), names = c("", "", "")), B = structure(list(
c(0.2, 0.3), 6, c(0.1, 0.1)), names = c("", "", ""))), row.names = c(NA,
3L), class = "data.frame")
I would like to mutate this to add the following columns:
ID | A | B | min_A | max_A | min_B | max_B |
---|---|---|---|---|---|---|
1 | 5 | 0.2, 0.3 | 5 | 5 | 0.2 | 0.3 |
2 | 0.5, 0.6 | 6 | 0.5 | 0.6 | 6 | 6 |
3 | 2 | 0.1, 0.1 | 2 | 2 | 0.1 | 0.1 |
I have tried mutate(min_A = min(unlist(A))), but this seems to take the minimum value of the entire column of A rather than just the list on any given row. mutate(min_A = min(A)) errors out because list is an invalid argument type for the min command. So how might I go about adding the data I'm after?
CodePudding user response:
You should be able to get the answer by adding rowwise(). I also used across() in my answer, but that part isn't strictly necessary; it just avoids repeating the same code for each column:
library(tidyverse)
df %>%
  rowwise() %>%
  mutate(across(A:B, function(x) min(unlist(x)), .names = "min_{.col}")) %>%
  mutate(across(A:B, function(x) max(unlist(x)), .names = "max_{.col}"))
# A tibble: 3 × 7
# Rowwise:
ID A B min_A min_B max_A max_B
<dbl> <list> <list> <dbl> <dbl> <dbl> <dbl>
1 1 <dbl [1]> <dbl [2]> 5 0.2 5 0.3
2 2 <dbl [2]> <dbl [1]> 0.5 6 0.6 6
3 3 <dbl [1]> <dbl [2]> 2 0.1 2 0.1
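To see why rowwise() makes the difference, here is a minimal sketch that runs the same mutate() with and without it. It rebuilds the question's data as a tibble (an assumption made only to keep the example self-contained), and the names whole_column and per_row are illustrative.

```r
library(dplyr)

# Same values as in the question, rebuilt as a tibble with list columns.
df <- tibble(
  ID = 1:3,
  A  = list(5, c(0.5, 0.6), 2),
  B  = list(c(0.2, 0.3), 6, c(0.1, 0.1))
)

# Without rowwise(), A is the entire list column, so unlist() flattens
# all rows together and every row receives the same column-wide minimum.
whole_column <- df %>%
  mutate(min_A = min(unlist(A)))

# With rowwise(), each row is evaluated separately, so the minimum is
# taken over that row's values only. ungroup() drops the rowwise grouping.
per_row <- df %>%
  rowwise() %>%
  mutate(min_A = min(unlist(A))) %>%
  ungroup()

whole_column$min_A  # 0.5 0.5 0.5
per_row$min_A       # 5.0 0.5 2.0
```

The trailing ungroup() is optional, but it keeps later operations on the result from being evaluated row by row unintentionally.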
CodePudding user response:
Base R with a loop:
cols <- c("A", "B")
for (col in cols) {
  df[, paste0("min_", col)] <- sapply(df[, col], function(x) min(unlist(x)))
  df[, paste0("max_", col)] <- sapply(df[, col], function(x) max(unlist(x)))
}
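A possible variation on the same loop, sketched under the assumption that every list element is a non-empty numeric vector: range() returns both the minimum and the maximum in one call, and vapply() checks that each row yields exactly two numbers. The data frame is rebuilt here only to keep the example self-contained, and rng is an illustrative name.

```r
# Same values as in the question, built with list columns.
df <- data.frame(ID = 1:3)
df$A <- list(5, c(0.5, 0.6), 2)
df$B <- list(c(0.2, 0.3), 6, c(0.1, 0.1))

for (col in c("A", "B")) {
  # range() gives c(min, max) for one row; vapply() stacks the results
  # into a 2 x nrow matrix and errors if a row does not return two numbers.
  rng <- vapply(df[[col]], range, numeric(2))
  df[[paste0("min_", col)]] <- rng[1, ]
  df[[paste0("max_", col)]] <- rng[2, ]
}

df$min_A  # 5.0 0.5 2.0
df$max_B  # 0.3 6.0 0.1
```

Compared with sapply(), vapply() fails early if a row holds something unexpected, rather than silently returning a list.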
CodePudding user response:
Using map_dbl() with across():
library(purrr)
library(dplyr)
df %>%
  mutate(across(A:B, ~ map_dbl(.x, min), .names = 'min_{.col}'),
         across(A:B, ~ map_dbl(.x, max), .names = 'max_{.col}'))
Output:
ID A B min_A min_B max_A max_B
1 1 5 0.2, 0.3 5.0 0.2 5.0 0.3
2 2 0.5, 0.6 6 0.5 6.0 0.6 6.0
3 3 2 0.1, 0.1 2.0 0.1 2.0 0.1