How to make a loop function output its result as a data frame in R

I created a function, clique_function, that selects values from two objects, object_list and clique_list, based on conditions and returns the result as a new data frame, clique_object.

Input:

clique_list is a character matrix of cliques (groups of three nodes in a network); each column holds the three nodes of one clique.

clique_list <- structure(c("ND1", "IS1", "IS3", "IS1", "IS3", "IS2", "ND2", 
"ND1", "IS1"), .Dim = c(3L, 3L))

object_list is a data frame with the nodes as rows (row names) and the occurrence of different object types as columns.

object_list <- structure(list(CA1 = c(0.159159159159159, 0.222222222222222, 
0.25, 0.115384615384615, 0.311111111111111, 0.1285140562249, 
0.214132762312634, 0.413461538461538, 0.183333333333333, 0.4, 
0.4375, 0.167778836987607, 0.25, 0.5, 0.166666666666667, 0.181818181818182, 
0.21580547112462, 0.0792452830188679, 0.424657534246575, 0, 0
), CA11 = c(0.00600600600600601, 0, 0, 0, 0, 0.00401606425702811, 
0.012847965738758, 0, 0.05, 0, 0, 0, 0, 0, 0, 0, 0.00911854103343465, 
0.0113207547169811, 0.0410958904109589, 0, 0), CA111 = c(0, 0, 
0, 0, 0, 0, 0.00499643112062812, 0, 0.0333333333333333, 0, 0, 
0, 0, 0, 0, 0, 0.0060790273556231, 0.00754716981132075, 0.0273972602739726, 
0, 0), CA1111 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1113 = c(0, 0, 0, 0, 
0, 0, 0.000713775874375446, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0), CA1115 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1116 = c(0, 0, 0, 
0, 0, 0, 0, 0, 0.0166666666666667, 0, 0, 0, 0, 0, 0, 0, 0.00303951367781155, 
0.00377358490566038, 0, 0, 0), CA1117 = c(0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0136986301369863, 0, 0), CA112 = c(0, 
0, 0, 0, 0, 0, 0.00285510349750178, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0), CA1122 = c(0, 0, 0, 0, 0, 0, 0.00142755174875089, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = "data.frame", row.names = c("ND5", 
"ND6", "ND8/ND10", "ND3", "ND7", "ND2", "ND1", "ND4", "KB3/KB4/KB5", 
"KB1", "KB2", "IS1", "KB9", "KB7/KB8", "KB6", "IS4", "IS3", "IS2", 
"KB12/KB14", "KB13", "IS5"))

The function clique_function is supposed to loop over the cliques in clique_list and, for each one, select from object_list the rows for its three nodes and the column for a given variable type (such as CA1).

clique_function <- function(clique_list, type, object_list) {
  for (i in 1:ncol(clique_list)) {
    clique_object <- subset(object_list, row.names(object_list) %in% clique_list[, i],
                            colnames(object_list) == type)
  }
  return(clique_object)
}

The expected output, clique_object, is a subset of object_list showing the occurrence of a chosen object type (type) for all the cliques in clique_list.

For example:

clique_object <- structure(list(cliques = c("ND1", "IS1", "IS3", "ND2", "IS1", 
"IS3", "ND1", "IS4", "IS3", "ND1", "IS1", "IS3", "ND1", "IS1", 
"IS8", "ND3", "IS1", "IS3", "ND1", "IS1", "IS3"), CA1 = c(0.0007137759, 
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541, 
0.0007137759, 0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 
0.009118541, 0.0007137759, 0.0047664442, 0.009118541, 0.0007137759, 
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541
)), class = "data.frame", row.names = c(NA, -21L))

The function works properly if I put print(clique_object) in place of return(clique_object): I then get the full output from the function looping over the cliques. With return(clique_object), however, I only get the result for one clique in clique_list.

I would like the function to output the full result as a data frame.

Thank you.

CodePudding user response：

If you change your function like this:

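clique_function <- function(clique_list, type, object_list) 
{
  # seq_len() is safe even if clique_list has zero columns
  lapply(seq_len(ncol(clique_list)), function(i) {
    # rows: nodes of clique i; column: the requested type
    subset(object_list, row.names(object_list) %in% clique_list[, i],
           colnames(object_list) == type)
  })
}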

then it will return a list of dataframes like this:

[[1]]
          CA1
ND1 0.2141328
IS1 0.1677788
IS3 0.2158055

[[2]]
           CA1
IS1 0.16777884
IS3 0.21580547
IS2 0.07924528

[[3]]
          CA1
ND2 0.1285141
ND1 0.2141328
IS1 0.1677788
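
The original loop behaves differently because it assigns every subset to the same name, so only one subset is left by the time the function returns, whereas lapply collects one result per clique. A minimal sketch of the two patterns, using made-up numbers rather than the data above (the function names overwrite and collect are hypothetical):

```r
# Toy stand-in for the per-clique subsets (hypothetical values, not the real data)
values <- c(10, 20, 30)

# Pattern 1: assigning to the same name replaces the previous result on every pass
overwrite <- function(x) {
  for (i in seq_along(x)) {
    out <- x[i] * 2
  }
  out  # only the value assigned in the final pass is left
}

# Pattern 2: store each pass's result in its own list slot
collect <- function(x) {
  out <- vector("list", length(x))
  for (i in seq_along(x)) {
    out[[i]] <- x[i] * 2
  }
  out  # one element per pass
}

overwrite(values)        # a single number
length(collect(values))  # 3, one result per input
```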

You can then choose how you want to combine these frames. For example, you could combine them like this:

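library(dplyr)  # provides bind_rows() and re-exports tibble()
res <- clique_function(clique_list, "CA1", object_list)
bind_rows(lapply(seq_along(res), function(x) tibble("clique"=x, "nodes"=rownames(res[[x]]), res[[x]])))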
# A tibble: 9 x 3
  clique nodes    CA1
   <int> <chr>  <dbl>
1      1 ND1   0.214 
2      1 IS1   0.168 
3      1 IS3   0.216 
4      2 IS1   0.168 
5      2 IS3   0.216 
6      2 IS2   0.0792
7      3 ND2   0.129 
8      3 ND1   0.214 
9      3 IS1   0.168

However, I don't know what output structure you want. In practice, I would adjust clique_function further so that it returns the desired final structure in one call.
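
As a rough sketch of what such a one-call version might look like in base R: the helper name clique_table and the toy inputs below are assumptions made for illustration, not part of the original post. It looks nodes up with match() so that rows come back in clique order and a node missing from the row names yields NA rather than being silently dropped.

```r
# Hypothetical helper (not from the original post): returns one long data
# frame with a clique index, the node name, and the chosen column.
clique_table <- function(clique_list, type, object_list) {
  per_clique <- lapply(seq_len(ncol(clique_list)), function(i) {
    nodes <- clique_list[, i]
    # exact lookup by row name; unmatched nodes give NA
    idx <- match(nodes, rownames(object_list))
    data.frame(
      clique = i,
      node   = nodes,
      value  = object_list[idx, type],
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, per_clique)
  names(out)[names(out) == "value"] <- type  # name the value column after `type`
  out
}

# Small made-up inputs so the sketch runs on its own
toy_cliques <- matrix(c("A", "B", "C", "B", "C", "D"), nrow = 3)
toy_objects <- data.frame(CA1 = c(0.1, 0.2, 0.3, 0.4),
                          row.names = c("A", "B", "C", "D"))

res <- clique_table(toy_cliques, "CA1", toy_objects)
res
```

With the data from the question, the call would be clique_table(clique_list, "CA1", object_list).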

CodePudding user response：

You could fill a clique_object list and use rbind to put the results together in a data.frame:

clique_function <- function(clique_list, type, object_list) 
{
  clique_object <- list()  # one slot per clique
  for (i in seq_len(ncol(clique_list))) {
    clique_object[[i]] <- subset(object_list, row.names(object_list) %in% clique_list[, i],
                                 colnames(object_list) == type)
  }
  # stack the per-clique data frames into a single data frame
  return(do.call(rbind, clique_object))
}

clique_function(clique_list,"CA1",object_list)
            CA1
ND1  0.21413276
IS1  0.16777884
IS3  0.21580547
IS11 0.16777884
IS31 0.21580547
IS2  0.07924528
ND2  0.12851406
ND11 0.21413276
IS12 0.16777884
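
Note the row names IS11, IS31, ND11 and IS12 in this output: a data frame cannot have duplicated row names, so nodes that occur in several cliques appear to get a numeric suffix when the pieces are bound together, much as make.unique(x, sep = "") would produce. A small sketch of that renaming, and of keeping the original node names in an ordinary column instead; the CA1 values here are made up, and only the node names come from clique_list:

```r
# Node names in the order the three cliques list them (from clique_list above)
nodes <- c("ND1", "IS1", "IS3", "IS1", "IS3", "IS2", "ND2", "ND1", "IS1")

# make.unique() appends a counter to repeated names
unique_names <- make.unique(nodes, sep = "")
unique_names

# Assumption: a tiny stand-in for the rbind() result, with made-up values
combined <- data.frame(CA1 = seq_along(nodes) / 10,
                       row.names = unique_names)

# Keeping the original names in a column avoids relying on altered row names
combined$node <- nodes
rownames(combined) <- NULL
combined
```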