I have this df:
library(tidyverse)
library(magrittr)
df <- tibble(
Time = c('June 7', 'June 8', 'June 9', 'June 10', 'June 11', 'June 12', 'June 13', 'June 14', 'June 15', 'June 16', 'June 17', 'June 18', 'June 19', 'June 20', 'June 21', 'June 22', 'June 23', 'June 24', 'June 25', 'June 26', 'June 27', 'June 28'),
Measurements = c('105, 54, 79, 49, 31, 84, 55', '50, 105, 85, 72, 27, 43', '58, 26, 38', '67, 52, 92, 46', '73, 59, 62', '57, 24', '78, 96, 107', '76, 49, 40, 34, 44, 55', '18, 60, 39', '39, 55, 35', '86, 27, 91, 49, 23, 65, 32, 74', '32, 47, 57', '70, 56', '146, 39', '94, 39, 21, 72, 55', '48, 70, 10, 160', '126, 87, 107, 45, 55, 39', '33, 62, 38', '43, 63, 68, 21, 126, 87, 107', '56, 86, 64', '66, 55', '34, 44, 55, 72, 51, 42')
)
I want to split the values in Measurements by commas and calculate the mean for each row (rowwise).
I was able to split and convert to numeric:
df %>% lapply(str_split(.$Measurements, ', '), as.numeric)
But I didn't know how to proceed from here. Any help is appreciated!
Also, can I use purrr::map here instead of lapply?
CodePudding user response:
This is a possible approach:
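df %>%
  # keep the rows in their original order instead of sorting Time alphabetically
  mutate(Time = factor(Time, levels = Time)) %>%
  # 10 columns is more than the longest row needs; fill = "right" pads short rows with NA without a warning
  separate(Measurements, sep = ", ", into = letters[seq(1, 10)], fill = "right") %>%
  pivot_longer(a:j) %>%
  na.omit() %>%
  mutate(value = as.numeric(value)) %>%
  group_by(Time) %>%
  summarise(mean = mean(value))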
# A tibble: 22 × 2
Time mean
<fct> <dbl>
1 June 7 65.3
2 June 8 63.7
3 June 9 40.7
4 June 10 64.2
5 June 11 64.7
6 June 12 40.5
7 June 13 93.7
8 June 14 49.7
9 June 15 39
10 June 16 43
# … with 12 more rows
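If you would rather not guess an upper bound on the number of columns, a variation is to go straight to long format with tidyr::separate_rows. This is only a sketch: `demo` is a made-up stand-in holding the first three rows of the question's data, and the regex separator is my assumption about how the values are delimited.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical stand-in for the question's df (first three rows only)
demo <- tibble(
  Time = c("June 7", "June 8", "June 9"),
  Measurements = c("105, 54, 79, 49, 31, 84, 55",
                   "50, 105, 85, 72, 27, 43",
                   "58, 26, 38")
)

result <- demo %>%
  mutate(Time = factor(Time, levels = Time)) %>%        # keep the original row order
  separate_rows(Measurements, sep = ",\\s*") %>%        # one row per measurement
  mutate(Measurements = as.numeric(Measurements)) %>%
  group_by(Time) %>%
  summarise(mean = mean(Measurements))

print(result)
```

Because each value gets its own row, there are no padding NAs to drop before summarising.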
CodePudding user response:
A hacky solution...
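# split once into a list of character vectors, without overwriting df
parts <- str_split(df$Measurements, ', ')
means <- numeric(length(parts))
for (i in seq_along(parts)) {
  # convert the i-th row's values to numeric and average them
  means[i] <- mean(as.numeric(parts[[i]]))
}
df$mean <- means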
CodePudding user response:
I think you are looking for something like this:
library(tidyverse)
library(stringr)
# wrap the call in {} so that %>% does not pass df as the first argument to lapply
# \(x) is shorthand for function(x) (R >= 4.1)
df$Measurements <- df %>%
  {lapply(str_split(.$Measurements, ', '), \(x) x %>%
    as.numeric() %>%
    mean())} %>%
  do.call(rbind, .)
# A tibble: 22 × 2
Time Measurements[,1]
<chr> <dbl>
1 June 7 65.3
2 June 8 63.7
3 June 9 40.7
4 June 10 64.2
5 June 11 64.7
6 June 12 40.5
7 June 13 93.7
8 June 14 49.7
9 June 15 39
10 June 16 43
# … with 12 more rows
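On the purrr::map part of the question: yes, that should work, and map_dbl is probably the closer fit because it returns a plain numeric vector rather than a list. A minimal sketch, assuming the values are always separated by a comma plus optional whitespace; `demo` is a made-up stand-in holding the first three rows of the question's data.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# Hypothetical stand-in for the question's df (first three rows only)
demo <- tibble(
  Time = c("June 7", "June 8", "June 9"),
  Measurements = c("105, 54, 79, 49, 31, 84, 55",
                   "50, 105, 85, 72, 27, 43",
                   "58, 26, 38")
)

result <- demo %>%
  mutate(
    # str_split gives one character vector per row;
    # map_dbl converts each to numeric and returns a single mean per row
    mean = map_dbl(str_split(Measurements, ",\\s*"), ~ mean(as.numeric(.x)))
  )

print(result)
```

This keeps the original Measurements column and adds the mean as an ordinary numeric column, so the header does not end up as Measurements[,1].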