I created a function, clique_function, that selects variables from two data frames, object_list and clique_list, based on conditions and returns a new data frame, clique_object.
Input:
clique_list is a character matrix of cliques (groups of three nodes in a network) in which each column holds the three nodes of one clique.
clique_list <- structure(c("ND1", "IS1", "IS3", "IS1", "IS3", "IS2", "ND2",
"ND1", "IS1"), .Dim = c(3L, 3L))
object_list is a data frame with the nodes as rows and the occurrence of different object types as columns.
object_list <- structure(list(CA1 = c(0.159159159159159, 0.222222222222222,
0.25, 0.115384615384615, 0.311111111111111, 0.1285140562249,
0.214132762312634, 0.413461538461538, 0.183333333333333, 0.4,
0.4375, 0.167778836987607, 0.25, 0.5, 0.166666666666667, 0.181818181818182,
0.21580547112462, 0.0792452830188679, 0.424657534246575, 0, 0
), CA11 = c(0.00600600600600601, 0, 0, 0, 0, 0.00401606425702811,
0.012847965738758, 0, 0.05, 0, 0, 0, 0, 0, 0, 0, 0.00911854103343465,
0.0113207547169811, 0.0410958904109589, 0, 0), CA111 = c(0, 0,
0, 0, 0, 0, 0.00499643112062812, 0, 0.0333333333333333, 0, 0,
0, 0, 0, 0, 0, 0.0060790273556231, 0.00754716981132075, 0.0273972602739726,
0, 0), CA1111 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1113 = c(0, 0, 0, 0,
0, 0, 0.000713775874375446, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0), CA1115 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1116 = c(0, 0, 0,
0, 0, 0, 0, 0, 0.0166666666666667, 0, 0, 0, 0, 0, 0, 0, 0.00303951367781155,
0.00377358490566038, 0, 0, 0), CA1117 = c(0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0136986301369863, 0, 0), CA112 = c(0,
0, 0, 0, 0, 0, 0.00285510349750178, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), CA1122 = c(0, 0, 0, 0, 0, 0, 0.00142755174875089,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = "data.frame", row.names = c("ND5",
"ND6", "ND8/ND10", "ND3", "ND7", "ND2", "ND1", "ND4", "KB3/KB4/KB5",
"KB1", "KB2", "IS1", "KB9", "KB7/KB8", "KB6", "IS4", "IS3", "IS2",
"KB12/KB14", "KB13", "IS5"))
The function clique_function is supposed to loop over object_list and select the variable type (a column such as CA1) for the three nodes of each clique in clique_list.
clique_function <- function(clique_list, type, object_list)
{
for (i in 1:ncol(clique_list)) {
clique_object <- subset(object_list, row.names(object_list) %in% clique_list[, i],
colnames(object_list) == type)
}
return(clique_object)
}
The expected output, clique_object, is a subset of object_list showing the occurrence of a chosen object type, type, for all the cliques in clique_list.
For example:
clique_object <- structure(list(cliques = c("ND1", "IS1", "IS3", "ND2", "IS1",
"IS3", "ND1", "IS4", "IS3", "ND1", "IS1", "IS3", "ND1", "IS1",
"IS8", "ND3", "IS1", "IS3", "ND1", "IS1", "IS3"), CA1 = c(0.0007137759,
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541,
0.0007137759, 0.0047664442, 0.009118541, 0.0007137759, 0.0047664442,
0.009118541, 0.0007137759, 0.0047664442, 0.009118541, 0.0007137759,
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541
)), class = "data.frame", row.names = c(NA, -21L))
The function works properly if I put print(clique_object) instead of return(clique_object). In the first case I get the full list from the function looping around the data frames. But with return(clique_object) I only get the result for the first clique in clique_list.
I would like the function to output the full result as a data frame.
Thank you.
CodePudding user response:
If you change your function like this:
clique_function <- function(clique_list, type, object_list)
{
lapply(seq(1,ncol(clique_list)), function(i) {
subset(object_list, row.names(object_list) %in% clique_list[, i],colnames(object_list) == type)
})
}
then it will return a list of dataframes like this:
[[1]]
CA1
ND1 0.2141328
IS1 0.1677788
IS3 0.2158055
[[2]]
CA1
IS1 0.16777884
IS3 0.21580547
IS2 0.07924528
[[3]]
CA1
ND2 0.1285141
ND1 0.2141328
IS1 0.1677788
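You can then choose how you want to combine these frames. For example, if the result is stored as res <- clique_function(clique_list, "CA1", object_list) and dplyr is loaded (for bind_rows and tibble), you could combine them like this: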
bind_rows(lapply(seq_along(res), function(x) tibble("clique"=x, "nodes"=rownames(res[[x]]), res[[x]])))
# A tibble: 9 x 3
clique nodes CA1
<int> <chr> <dbl>
1 1 ND1 0.214
2 1 IS1 0.168
3 1 IS3 0.216
4 2 IS1 0.168
5 2 IS3 0.216
6 2 IS2 0.0792
7 3 ND2 0.129
8 3 ND1 0.214
9 3 IS1 0.168
but I don't know what your desired output structure is. In practice, I would further adjust clique_function to return the desired final structure in one call.
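As a rough illustration of returning the final structure in one call, here is a minimal base R sketch. It assumes the desired output is one row per node per clique, with a clique index, the node name, and the chosen column. The name clique_table is made up for this example, and the data is a trimmed copy of the question's input (only the five nodes that appear in the cliques, CA1 only, values rounded as printed above).

```r
# Trimmed copy of the question's data: only the nodes used by the cliques,
# kept in their original row order, and only the CA1 column (rounded).
clique_list <- matrix(c("ND1", "IS1", "IS3",
                        "IS1", "IS3", "IS2",
                        "ND2", "ND1", "IS1"), nrow = 3)
object_list <- data.frame(
  CA1 = c(0.1285141, 0.2141328, 0.1677788, 0.2158055, 0.07924528),
  row.names = c("ND2", "ND1", "IS1", "IS3", "IS2")
)

# clique_table is a hypothetical name, not part of the original post.
clique_table <- function(clique_list, type, object_list) {
  pieces <- lapply(seq_len(ncol(clique_list)), function(i) {
    # Rows of object_list whose names belong to clique i
    keep <- row.names(object_list) %in% clique_list[, i]
    out <- data.frame(clique = rep(i, sum(keep)),
                      nodes = row.names(object_list)[keep],
                      stringsAsFactors = FALSE)
    out[[type]] <- object_list[[type]][keep]
    out
  })
  # Stack the per-clique pieces into a single data frame
  do.call(rbind, pieces)
}

result <- clique_table(clique_list, "CA1", object_list)
```

Because the node names are stored in a regular column rather than as row names, a node that appears in several cliques keeps its original name in the combined result.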
CodePudding user response:
You could fill a clique_object list and use rbind to put the results together in a data.frame:
clique_function <- function(clique_list, type, object_list)
{ clique_object <- list()
for (i in 1:ncol(clique_list)) {
clique_object[[i]] <- subset(object_list, row.names(object_list) %in% clique_list[, i],
colnames(object_list) == type)
}
return(do.call(rbind,clique_object))
}
clique_function(clique_list,"CA1",object_list)
CA1
ND1 0.21413276
IS1 0.16777884
IS3 0.21580547
IS11 0.16777884
IS31 0.21580547
IS2 0.07924528
ND2 0.12851406
ND11 0.21413276
IS12 0.16777884
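Note that in the output above the row names of repeated nodes have been made unique by appending a number (IS11, ND11, IS12), so the original node names are no longer directly readable. If that matters, one possible variation is to skip the loop and look the nodes up by name, keeping the node name and clique index as columns. This is only a sketch under assumptions: clique_lookup is a made-up name, the data is a trimmed and rounded copy of the question's input, and a node missing from object_list would simply give NA.

```r
# Trimmed copy of the question's data (five nodes, CA1 only, rounded values).
clique_list <- matrix(c("ND1", "IS1", "IS3",
                        "IS1", "IS3", "IS2",
                        "ND2", "ND1", "IS1"), nrow = 3)
object_list <- data.frame(
  CA1 = c(0.1285141, 0.2141328, 0.1677788, 0.2158055, 0.07924528),
  row.names = c("ND2", "ND1", "IS1", "IS3", "IS2")
)

# clique_lookup is a hypothetical name, not part of the original post.
clique_lookup <- function(clique_list, type, object_list) {
  nodes <- as.vector(clique_list)            # column by column: clique 1, 2, ...
  # Exact match of node names against row names (NA when a node is absent)
  idx <- match(nodes, row.names(object_list))
  out <- data.frame(clique = as.vector(col(clique_list)),
                    nodes = nodes,
                    stringsAsFactors = FALSE)
  out[[type]] <- object_list[[type]][idx]
  out
}

res <- clique_lookup(clique_list, "CA1", object_list)
```

Unlike the subset() approach, this keeps the nodes in the order they are listed within each clique rather than in the row order of object_list.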