I am new to R, so apologies if the answer is obvious. I am trying to perform operations on tibbles and their values/columns while these tibbles are part of a list. Previously I would load each of them manually as a data.frame (from CSV data) and perform the operations on that data.frame by hand. Unfortunately this is tiresome, so I am trying to run all the operations in my script on all my data.frames at once. For example, what has worked for me so far is adding 0.7 to every element of the column named 'Temperature' in each tibble in the list. I did it like this:
for(i in seq_along(Data_List)) {Data_List[[i]]$Temperature <- Data_List[[i]]$Temperature + 0.7}
However, I would now like to perform different tasks. Primarily, I need to divide my tibbles into sequences. When I worked with one data.frame at a time, this is what I did:
df_Sitting <- df[1:12, ]
df_Standing <- df[13:26, ]
df_LigEx <- df[27:35, ]
df_VigEx <- df[36:42, ]
df_After <- df[43:54, ]
How do I adapt this for the list of tibbles/data.frames I now have? Secondly, I want to compute descriptive statistics, the Pearson correlation and Lin's concordance correlation. Additionally, I created a ggplot and a Bland-Altman plot. I did it like this:
describe(df$Temperature)
describe(df$Temp_core)
cor.test(df)
library(epiR)
epi.ccc(df$Temp_core, df$Temperature, ci = "z-transform",
conf.level = 0.95, rep.measure = FALSE, subjectid)
mdata <- melt(df, id="Time")
ggplot(data = mdata, aes(x = Time, y = value)) +
  geom_point(aes(group = variable, color = variable)) +
  geom_line(aes(group = variable, color = variable))
library(BlandAltmanLeh)
BlandAltman_df <- bland.altman.plot(df$Temp_core, df$Temperature, graph.sys = "ggplot2")
print(BlandAltman_df + theme(plot.title = element_text(hjust = 0.5)))
I now want to run all the functions above for the entire list of tibbles and the variables within them at once and get all the corresponding statistics and plots, to later create a Markdown document. I tried lapply but it somehow does not work. I hope I formulated the question correctly. I appreciate the help!
CodePudding user response:
Working with a list of data frames (or of any other objects) is totally doable in R. Firstly, I suggest replacing the for loop over seq_along with lapply, or, since you are already using the tidyverse, purrr::map:
for(i in seq_along(Data_List)) {
  Data_List[[i]]$Temperature <- Data_List[[i]]$Temperature + 0.7
}
becomes:
modified_data_list <- purrr::map(Data_List, function(df){
  dplyr::mutate(df, Temperature = Temperature + 0.7)
})
You can apply the same principle to the rest of your code. Note that I use purrr::walk here instead of map, because the function does not return a modified data frame; it is called for its "side effects", such as printing results and drawing plots:
library(ggplot2)
library(epiR)
library(BlandAltmanLeh)
# walk() returns its input invisibly, so its result is not assigned
purrr::walk(Data_List, function(df){
  # nothing is auto-printed inside a function, so print() each result;
  # describe() comes from whichever package the question uses (not shown there)
  print(describe(df$Temperature))
  print(describe(df$Temp_core))
  print(cor.test(df$Temperature, df$Temp_core))
  print(epi.ccc(df$Temp_core, df$Temperature, ci = "z-transform",
                conf.level = 0.95, rep.measure = FALSE))
  mdata <- reshape2::melt(df, id = "Time")
  print(ggplot(data = mdata, aes(x = Time, y = value)) +
    geom_point(aes(group = variable, color = variable)) +
    geom_line(aes(group = variable, color = variable)))
  BlandAltman_df <- bland.altman.plot(df$Temp_core, df$Temperature, graph.sys = "ggplot2")
  print(BlandAltman_df + theme(plot.title = element_text(hjust = 0.5)))
})
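The row-range split from the question (Sitting, Standing, and so on) can follow the same pattern. Below is a minimal base R sketch, assuming every tibble has the same 54-row layout as in the question. The names phases, split_phases and Phase_List, as well as the toy Data_List, are hypothetical and only there for illustration:

```r
# Row ranges taken from the question; assumed to be identical for every tibble
phases <- list(
  Sitting  = 1:12,
  Standing = 13:26,
  LigEx    = 27:35,
  VigEx    = 36:42,
  After    = 43:54
)

# Hypothetical helper: returns a named list with one data frame per phase
split_phases <- function(df, ranges = phases) {
  lapply(ranges, function(rows) df[rows, , drop = FALSE])
}

# Toy stand-in for the real Data_List (two data frames with 54 rows each)
set.seed(1)
Data_List <- list(
  df_a = data.frame(Time = 1:54, Temperature = rnorm(54, 36), Temp_core = rnorm(54, 37)),
  df_b = data.frame(Time = 1:54, Temperature = rnorm(54, 36), Temp_core = rnorm(54, 37))
)

# Nested list: the first level is the tibble, the second level is the phase
Phase_List <- lapply(Data_List, split_phases)

nrow(Phase_List$df_a$Sitting)
sapply(Phase_List$df_b, nrow)
```

A single piece is then reachable as Phase_List[["df_a"]][["VigEx"]], and any per-phase analysis can be run with a nested lapply over Phase_List.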
CodePudding user response:
You can lapply the tests and plot code to the list members and return lists of test results and plots. Something like the following.
library(ggplot2)
library(epiR)
library(BlandAltmanLeh)
# add 0.7 to Temperature in every data frame and return the modified data frame
Data_List <- lapply(Data_List, \(X){
  X[["Temperature"]] <- X[["Temperature"]] + 0.7
  X
})
# Pearson correlation test, one result per data frame
cor_test_list <- lapply(Data_List, \(X) cor.test(formula = ~ Temperature + Temp_core, data = X))
# Lin's concordance correlation, one result per data frame
lin_test_list <- lapply(Data_List, \(X){
  epi.ccc(
    X[["Temp_core"]],
    X[["Temperature"]],
    ci = "z-transform",
    conf.level = 0.95,
    rep.measure = FALSE
  )
})
# time series plot, one ggplot object per data frame
gg_plot_list <- lapply(Data_List, \(X){
  mdata <- reshape2::melt(X, id = "Time")
  ggplot(data = mdata, aes(x = Time, y = value)) +
    geom_point(aes(group = variable, color = variable)) +
    geom_line(aes(group = variable, color = variable))
})
# Bland-Altman plot, one ggplot object per data frame
BlandAltman_List <- lapply(Data_List, \(X){
  BlandAltman_df <- bland.altman.plot(X$Temp_core, X$Temperature, graph.sys = "ggplot2")
  BlandAltman_df +
    theme(plot.title = element_text(hjust = 0.5))
})
The tests
To access the test results, once again use *apply loops together with an extraction function such as "[[".
sapply(cor_test_list, "[[", "estimate")
# df_a.cor df_b.cor df_c.cor
#0.7425467 0.5259107 0.4572278
sapply(cor_test_list, "[[", "statistic")
# df_a.t df_b.t df_c.t
#7.680738 4.283887 3.561892
sapply(cor_test_list, "[[", "p.value")
# df_a df_b df_c
#6.709843e-10 8.771860e-05 8.434625e-04
sapply(lin_test_list, "[[", "rho.c")
sapply(lin_test_list, "[[", "sblalt")
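For a Markdown report it can be handier to have the correlation results in a single table instead of several named vectors. The following is a base R sketch of one way to do that; it rebuilds the iris-based test data so that it runs on its own, and the name cor_summary and its column names are my own choices, not part of any package:

```r
# Test data built from iris, following the "Test data" part of this answer
Data_List <- iris[1:2]
names(Data_List) <- c("Temp_core", "Temperature")
Data_List <- split(Data_List, iris$Species)
names(Data_List) <- paste("df", letters[1:3], sep = "_")

cor_test_list <- lapply(Data_List, function(X) {
  cor.test(formula = ~ Temperature + Temp_core, data = X)
})

# One row per data frame; unname() drops the "cor" and "t" labels
# that the htest elements carry
cor_summary <- do.call(rbind, lapply(names(cor_test_list), function(nm) {
  ct <- cor_test_list[[nm]]
  data.frame(
    data      = nm,
    estimate  = unname(ct$estimate),
    statistic = unname(ct$statistic),
    p.value   = ct$p.value,
    conf.low  = ct$conf.int[1],
    conf.high = ct$conf.int[2]
  )
}))

cor_summary
```

The resulting data frame has one row per list member and can be passed to a table function of your choice when knitting.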
The plots
The plots can be plotted one by one:
gg_plot_list[[1]]
BlandAltman_List[[1]]
or in a loop with print.
for(i in seq_along(gg_plot_list))
print(gg_plot_list[[i]])
Or they can be sent to a graphics device, that is, written to files on disk.
for(i in seq_along(gg_plot_list)) {
filename <- sprintf("Rplot%02d.png", i)
png(filename = filename)
print(gg_plot_list[[i]])
dev.off()
}
Test data
Data_List <- iris[1:2]
names(Data_List) <- c("Temp_core", "Temperature")
Data_List$Time <- rep(1:50, 3)
Data_List <- split(Data_List, iris$Species)
names(Data_List) <- paste("df", letters[1:3], sep = "_")
Data_List <- lapply(Data_List, \(x){row.names(x) <- NULL; x})
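With this test data, the descriptive statistics from the question can be produced per data frame as well. Since the question does not say which package its describe() comes from, the sketch below swaps in plain base R functions (mean, sd, min, max) instead. The helper desc_stats and the name desc_list are hypothetical:

```r
# Rebuild the test data so the block runs on its own
Data_List <- iris[1:2]
names(Data_List) <- c("Temp_core", "Temperature")
Data_List$Time <- rep(1:50, 3)
Data_List <- split(Data_List, iris$Species)
names(Data_List) <- paste("df", letters[1:3], sep = "_")

# Hypothetical helper: basic descriptive statistics for one numeric vector
desc_stats <- function(x) {
  c(n    = sum(!is.na(x)),
    mean = mean(x, na.rm = TRUE),
    sd   = sd(x, na.rm = TRUE),
    min  = min(x, na.rm = TRUE),
    max  = max(x, na.rm = TRUE))
}

# One matrix per data frame: rows are statistics, columns are the two variables
desc_list <- lapply(Data_List, function(X) {
  sapply(X[c("Temperature", "Temp_core")], desc_stats)
})

desc_list$df_a
```

If you prefer the output of describe(), replacing desc_stats with that function inside the same lapply call should work the same way.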