I have a data structure as shown below. Some columns hold several numbers, displayed as a range with a colon (for example 1:15). For these, I'd like to retain only the maximum value. For example, for column X1 the expected output is 15, and for column X5 it should be 142. I considered using a split function, but that would create additional columns, which I don't want because I need to retain the column structure.
dat <- structure(list(X1 = list(1:15), X2 = list(106L), X3 = list(134L),
X4 = list(139L), X5 = list(141:142)), class = "data.frame", row.names = c(NA,
-1L))
Expected output
X1 X2 X3 X4 X5
15 106 134 139 142
CodePudding user response:
You may use
sapply(dat, function (x) max(unlist(x)))
# X1 X2 X3 X4 X5
# 15 106 134 139 142
sapply() returns a named vector in this case. If you want a data frame, you can do
data.frame(lapply(dat, function (x) max(unlist(x))))
# X1 X2 X3 X4 X5
#1 15 106 134 139 142
The printed forms of a named vector and a 1-row data frame are quite similar, aren't they?
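If the data frame had more than one row, the calls above would return one maximum per column across all rows. A minimal sketch of a per-row variant that keeps the data frame shape is below. The data frame dat2, its id column, and the cols vector are hypothetical and only serve as illustration; they are not from the question.

```r
# Hypothetical two-row data frame with list columns (not from the question)
dat2 <- data.frame(id = 1:2)
dat2$X1 <- list(1:15, 3:7)
dat2$X5 <- list(141:142, 9L)

# Columns to collapse; chosen by name here for illustration
cols <- c("X1", "X5")

# For each list column, take the max of every cell; vapply() enforces
# that each result is a single integer
dat2[cols] <- lapply(dat2[cols], function(col) vapply(col, max, integer(1)))

# X1 is now c(15L, 7L) and X5 is now c(142L, 9L); id is unchanged
dat2
```

Using vapply() rather than sapply() here is a design choice: it fails loudly if a cell does not reduce to exactly one integer, instead of silently returning a list.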
Although this question has been solved, I would like to point out that your dat is not stored efficiently. It is fairly uncommon for a data frame column to be a list. A plain list of vectors is more convenient for subsequent operations.
lst <- list(X1 = 1:15, X2 = 106L, X3 = 134L, X4 = 139L, X5 = 141:142)
sapply(lst, max)
# X1 X2 X3 X4 X5
# 15 106 134 139 142
data.frame(lapply(lst, max))
# X1 X2 X3 X4 X5
#1 15 106 134 139 142
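If you already have the data frame and do not want to retype the values, one possible way to get such a list of vectors is sketched below. It assumes dat has exactly one row, as in the question; with more rows, unlist() would merge the values of all rows in a column.

```r
# Data frame from the question, with one-element list columns
dat <- structure(list(X1 = list(1:15), X2 = list(106L), X3 = list(134L),
                      X4 = list(139L), X5 = list(141:142)),
                 class = "data.frame", row.names = c(NA, -1L))

# Flatten each one-element list column into a plain integer vector
lst <- lapply(dat, unlist)

# Same call as above, now on the flattened list
sapply(lst, max)
```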
CodePudding user response:
With tidyverse
library(tidyverse)
# dat is the data frame from the question
dat %>%
  rowwise() %>%
  summarise(across(everything(), max))
# A tibble: 1 × 5
#      X1    X2    X3    X4    X5
#   <int> <int> <int> <int> <int>
# 1    15   106   134   139   142