Home > Mobile >  Using a dataframe to input values to a function and aggregating the outputs
Using a dataframe to input values to a function and aggregating the outputs

Time:11-01

I'm very new to R and completely lost.

I've have a function that will aggregate my data, but I can't work out how to add the results to some new dataframe and how to loop an input to get the results?

My input looks like this

input = structure(list(V1 = c("Team_2022", "Team_2022", "Team_2022"), V2 = c("Frank", "Mary", "John"), V3 = c("Sydney", "Sydney", "Sydney"), V4 = c(55, 76, 14)), row.names = c(NA, -3L), class = c("data.table", "data.frame"))

And the function I am using is:

result = function(data, combination)
{
    purrr::map_df(lapply(combination,function(x){paste(x,collapse = '|')}), function(x){
    df = data[grepl(x,data$V2),] %>% group_by(V3) %>% summarize(V1= paste(V1[1]), V2= paste(V2,collapse = ' '), V4= sum(V4))
    data = df[,c(2,3,1,4)]
    return(data)
  }
  )
}

If use this function one at a time (see below) and then bind the results, I can get my desired output:

team1 = list(c("Mary","Frank"))
team2 = list(c("Mary", "John"))

output1 = result(input, team1)
output2 = result(input, team2)
output =  rbind(output1, output2)

But I'd rather use a loop or feed into my function something like the following :

teams = structure(list(V1 = c("team1", "team2"), V2 = c("Mary   Frank","Mary   John")), class = "data.frame", row.names = c(NA, -2L))

and end up with the same:

output = structure(list(V1 = c("Team_2022", "Team_2022"), V2 = c("Frank Mary", 
"Mary John"), V3 = c("Sydney", "Sydney"), V4 = c(131, 90)), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame"))

Can anyone suggest how I can create a loop that could do this (and whether it would be easier to change the structure of the input variable to do this)?

CodePudding user response:

You may try

apply(teams, 1, function(x) {
  y <- list(str_split(x[2], ' \\  ', simplify = T))
  res_fun(input, y)
}) %>% do.call("rbind", .)

  V1        V2         V3        V4
  <chr>     <chr>      <chr>  <dbl>
1 Team_2022 Frank Mary Sydney   131
2 Team_2022 Mary John  Sydney    90
  • Related