Piping over a list, subsetting, and calculating a function of my own

Time:11-14

I have a dataset with these three columns, plus some additional columns:

structure(list(from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 
1), to = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8), time = c(1521823032, 
1521827196, 1521827196, 1522678358, 1522701516, 1522701993, 1522702123, 
1522769399, 1522780956, 1522794468, 1522794468, 1522794468, 1522794468, 
1522859524)), class = "data.frame", row.names = c(NA, -14L))

I need code that takes every index below a given number (e.g. 5) and, for each one, does the following: subset the dataset to the rows where the index appears in either the "from" or the "to" column, then apply a function to that subset (e.g. the difference between the max and min of time). As a result I expect a data frame with the indices and the results of the calculation.

This is what I have, but it does not work.

library(dplyr)

dur <- function(x) max(x) - min(x)  # The function to calculate the difference. In other cases I need to use other functions of my own

filternumber <- function(number, x) {  # A function to filter data x by the number in the two columns
  x <- x %>% subset(from == number | to == number)
  return(x)
}

lista <- unique(c(data$from, data$to))  # Creates a list with all the indexes in the data. I do this to avoid having non-existing indexes
lista <- lista[lista <= 5]  # Limit the list to 5. In my code this number would be an argument to a function

result <- lista %>% filternumber(., data) %>% select(time) %>% dur()  # I use select because I have many other columns in the data

The result in this case should be a data frame with 1036492 for 1, 967272 for 3 and 92475 for 4.

I've also tried putting filternumber(., data) %>% select(time) %>% dur() inside mutate, but that does not work either.

CodePudding user response:

Perhaps you are looking for something like this:

library(purrr)
library(dplyr)

index <- c(1, 3, 4)
names(index) <- index

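# df is the data frame from the question (called `data` there);
# dur() is the function defined in the question
index %>% 
  map_dfr(~ df %>% 
            filter(from == .x | to == .x) %>% 
            summarize(result = dur(time)),
          .id = "index")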

This returns

  index  result
1     1 1036492
2     3  967272
3     4   92475
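
Since the question mentions that the limit would be a function argument and that other summary functions may be swapped in, the same idea can be wrapped in a small helper. This is only a sketch in base R (no extra packages); the name summarise_by_index and its arguments limit and f are made up for illustration and do not come from the question.

```r
# Data from the question
data <- data.frame(
  from = c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1),
  to   = c(8, 3, 8, 54, 3, 4, 1, 6, 7, 1, 4, 3, 8, 8),
  time = c(1521823032, 1521827196, 1521827196, 1522678358, 1522701516,
           1522701993, 1522702123, 1522769399, 1522780956, 1522794468,
           1522794468, 1522794468, 1522794468, 1522859524)
)

# Summary function from the question
dur <- function(x) max(x) - min(x)

# Hypothetical helper: for every index strictly below `limit` that occurs
# in `from` or `to`, apply `f` to the `time` values of the matching rows.
summarise_by_index <- function(x, limit, f) {
  idx <- sort(unique(c(x$from, x$to)))  # only indices that actually occur
  idx <- idx[idx < limit]
  result <- vapply(
    idx,
    function(i) f(x$time[x$from == i | x$to == i]),  # one scalar index at a time
    numeric(1)
  )
  data.frame(index = idx, result = result)
}

res <- summarise_by_index(data, limit = 5, f = dur)
print(res)
```

With limit = 5 this keeps the indices 1, 3 and 4 and gives the same three results as above. Any function that reduces a numeric vector to a single number can be passed as f.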

CodePudding user response:

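filternumber compares with ==, which is elementwise. When the whole vector lista is piped in, it is compared against the columns position by position (with recycling) instead of each index being tested in turn. Loop over the indices instead: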

library(dplyr)
library(purrr)
map_dbl(lista, ~ filternumber(.x, data) %>%
  select(time) %>%
  dur())
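[1] 1036492  967272   92475       0

The trailing 0 belongs to index 5, which lista <= 5 keeps and which matches only a single row. Use lista < 5 to get just the three expected values.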
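
To see why the elementwise comparison matters, here is a small base R illustration using the from column and the four indices from the question. The variable names elementwise, scalar and any_index are just for this example.

```r
from  <- c(1, 8, 3, 3, 8, 1, 4, 5, 8, 3, 1, 8, 4, 1)
lista <- c(1, 3, 4, 5)

# Vector on the right-hand side: R compares position by position and
# recycles lista (1, 3, 4, 5, 1, 3, ...). It also warns here, because
# 14 is not a multiple of 4, so the warning is suppressed for the demo.
elementwise <- suppressWarnings(from == lista)

# Scalar on the right-hand side: every row is tested against the same index.
scalar <- from == 1

# %in% answers a different question: does the row match ANY of the indices?
any_index <- from %in% lista

which(elementwise)  # rows that happen to line up with the recycled vector
which(scalar)       # rows where from is 1
sum(any_index)      # number of rows whose from is one of the indices
```

The first comparison picks rows by coincidence of position, which is not a per-index subset. Calling the filter once per index, as map_dbl does, avoids that.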