R tidy pipe within a function to be used with mutate

Time:08-31

I am trying to write a function that extracts the text surrounding a keyword. If the keyword occurs more than once, the surroundings of each occurrence are combined in the final output. The current version works on a single string with two keyword instances, but it does NOT work when used with mutate in a tidy pipe. To check that mutate operates on each string rather than concatenating the whole column into a single character vector, I wrote a simple test function, "first_letter", and it works as expected.

    library(dplyr)  # provides %>% and mutate()

    submarine <- 'We all live in a yellow submarine'
    yesterday <- 'Yesterday all my troubles seemed so far away, all of them'
    my_data <- data.frame(text=c(submarine, yesterday))
    pat <- "all"
        
    first_letter <- function(x){
      fl_res <- substr(x,1,1)
      return(fl_res)
    }
    
    my_data_fl <- my_data %>% dplyr::mutate(first=first_letter(text))

    # Target function: works with a single string but not within mutate.
    # loc is a data.frame generated within the function.
    # pat is the keyword.
    # I tried to replace x with .data, but it does not help.

    term_surr <- function(x, pat, before_char=5, after_char=15){
      loc <- x %>% 
        stringr::str_locate_all(pat) %>% 
        as.data.frame() %>%
        tibble::as_tibble() %>% 
        dplyr::mutate(from = start - before_char) %>% 
        dplyr::mutate(to = end + after_char) %>% 
        dplyr::select(from, to) %>%
        dplyr::mutate(surr = stringr::str_sub(x, .data$from, .data$to))

      res_txt <- purrr::map_chr(loc$surr, ~paste(.x, sep = ". ")) %>% stringi::stri_paste(collapse=' ... ')
      return(res_txt)
    }

    # FUNCTIONAL with text input as a single string
    # surr <- term_surr(yesterday, pat=pat)

    # NOT FUNCTIONAL with a data frame column
    # my_data_surr <- my_data %>% mutate(surr = term_surr(text, pat=pat))

If there is a tutorial on using tidy/dplyr pipes within functions, please share a link. I would be grateful for any suggestions about the code above.
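One way to see the difference between the two calls is to inspect what `stringr::str_locate_all` returns in each case. The sketch below is only a diagnostic, and the names `texts`, `one`, `many`, and `err` are made up for illustration. For a single string the result is a list holding one matrix, which `as.data.frame()` can convert. For a column it is a list with one matrix per row, and the matrices have different numbers of rows here.

```r
library(stringr)

submarine <- 'We all live in a yellow submarine'
yesterday <- 'Yesterday all my troubles seemed so far away, all of them'
texts <- c(submarine, yesterday)   # what mutate() passes in as `text`

one  <- str_locate_all(yesterday, "all")  # list of length 1
many <- str_locate_all(texts, "all")      # list with one matrix per string

length(one)
length(many)
vapply(many, nrow, integer(1))            # number of matches per string

# Converting the multi-element list fails because the row counts differ;
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(as.data.frame(many), error = function(e) conditionMessage(e))
err
```

This suggests that mutate hands the function the whole column at once, so the function itself has to handle a vector of strings.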

CodePudding user response:

The complete code would be:

    library(tidyverse)

    submarine <- 'We all live in a yellow submarine'
    yesterday <- 'Yesterday all my troubles seemed so far away, all of them'
    my_data <- data.frame(text=c(submarine, yesterday))
    pat <- "all"

    res <- my_data %>%
      # one data frame of from/to positions per row of text
      mutate(surr = str_locate_all(text, pat) %>% 
               map(~ as.data.frame(.x) %>% 
                     transmute(from = start - 2, to = end + 15))) %>% 
      # one row per match, with columns surr.from and surr.to
      unnest(surr, names_sep = ".") %>% 
      mutate(surr = str_sub(text, surr.from, surr.to)) %>% 
      # collapse the matches of each text back into a single string
      group_by(text) %>% 
      summarise(surr = str_c(surr, collapse = " ... "))
    # summarise() already returns the columns text and surr, so the result
    # does not need to be bound back to my_data and renamed (funs() is deprecated).
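If you would rather keep a function that can be called inside mutate, one option is to write a helper for a single string and apply it to each element of the column with `purrr::map_chr`. The following is a minimal sketch, not the only way to do it. The names `term_surr_one` and `term_surr_vec` are hypothetical. Clamping `from` to 1 is an added assumption, made because `str_sub` counts negative positions from the end of the string, and returning `NA` when there is no match is likewise a choice made here.

```r
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical helper: handles exactly one string.
term_surr_one <- function(x, pat, before_char = 5, after_char = 15) {
  if (is.na(x)) return(NA_character_)
  loc <- str_locate_all(x, pat)[[1]]        # matrix with columns start, end
  if (nrow(loc) == 0) return(NA_character_) # assumption: NA when no match
  # Assumption: clamp to 1 so a match near the start does not yield a
  # non-positive index, which str_sub() would not read as "from the start".
  from <- pmax(loc[, "start"] - before_char, 1)
  to   <- loc[, "end"] + after_char
  paste(str_sub(x, from, to), collapse = " ... ")
}

# Hypothetical vectorised wrapper: one result per element of x.
term_surr_vec <- function(x, pat, ...) {
  map_chr(as.character(x), term_surr_one, pat = pat, ...)
}

submarine <- 'We all live in a yellow submarine'
yesterday <- 'Yesterday all my troubles seemed so far away, all of them'
my_data <- data.frame(text = c(submarine, yesterday))

my_data_surr <- my_data %>%
  mutate(surr = term_surr_vec(text, pat = "all"))
```

Because the wrapper returns exactly one string per input element, the rows stay in their original order and no grouping or re-joining is needed. `dplyr::rowwise()` before mutate would be another way to call a single-string function row by row.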