Filter a dataframe by passing "type" as input to columns in R

Time:10-07

I have a dataframe like this

    id <- c(5738180,51845,167774,517814,1344920,517833,51844)
    measurement <- c("Length","Breadth","Breadth","Length","Length","Length","Breadth")
    map_flag <- c(0,1,1,0,0,0,0)
    cap_flag <- c(1,0,0,1,1,0,0)
    df.sample <- data.frame(id,measurement,map_flag,cap_flag) 
    

df.sample

       id measurement map_flag cap_flag
  5738180      Length        0        1
    51845     Breadth        1        0
   167774     Breadth        1        0
   517814      Length        0        1
  1344920      Length        0        1
   517833      Length        0        0
    51844     Breadth        0        0

I am trying to create a function that takes a df and a type as inputs and returns the df filtered according to that type.

If I pass in

  • type = "map", it should keep the rows where map_flag == 1 and return a df
  • type = "cap", it should keep the rows where cap_flag == 1 and return a df
  • type = "Length", it should keep the rows where measurement == "Length" and return a df

Desired outputs

type = "map"

       id measurement map_flag cap_flag
    51845     Breadth        1        0
   167774     Breadth        1        0

type = "cap"

       id measurement map_flag cap_flag
  5738180      Length        0        1
   517814      Length        0        1
  1344920      Length        0        1

type = "Length"

       id measurement map_flag cap_flag
  5738180      Length        0        1
   517814      Length        0        1
  1344920      Length        0        1
   517833      Length        0        0

I am trying to do it this way but not getting what I wanted

    testFun <- function (df, type) {
      df <- df %>%
        {if (type==starts_with(map) filter(map_flag==1) }
      return(df)
    }

I'd really appreciate it if someone can point me in the right direction.

CodePudding user response:

If there are only 3 possible type inputs, I would suggest going with @Park's answer, since it is easy to understand and does what is required.

Here is another option where you don't need to list out a condition for every input. It checks whether a column named <type>_flag is present in the data: if so, it filters on that column being 1; otherwise it filters on measurement == type.

library(dplyr)

testFun <- function (df, type) {
  col <- paste0(type, '_flag')
  if(col %in% colnames(df)) {
    res <- df %>% filter(.data[[col]] == 1)
  } else {
    res <- df %>% filter(measurement == type)
  }
  return(res)
}

testFun(df.sample, "map")
#      id measurement map_flag cap_flag
#1  51845     Breadth        1        0
#2 167774     Breadth        1        0

testFun(df.sample, "cap")
#       id measurement map_flag cap_flag
#1 5738180      Length        0        1
#2  517814      Length        0        1
#3 1344920      Length        0        1

testFun(df.sample, "Length")
#       id measurement map_flag cap_flag
#1 5738180      Length        0        1
#2  517814      Length        0        1
#3 1344920      Length        0        1
#4  517833      Length        0        0

CodePudding user response:

You may try

library(dplyr)

func <- function(df, type = c("map", "cap", "Length")){
  if (type == "map"){
    df %>% filter(map_flag == 1)
  } else if (type == "cap") {
    df %>% filter(cap_flag == 1)
  } else if (type == "Length") {
    df %>% filter(measurement == "Length")
  } else {
    stop("select type among map, cap, and Length")
  }
}

func(df.sample, "map")
      id measurement map_flag cap_flag
1  51845     Breadth        1        0
2 167774     Breadth        1        0
func(df.sample, "cap")
       id measurement map_flag cap_flag
1 5738180      Length        0        1
2  517814      Length        0        1
3 1344920      Length        0        1
func(df.sample, "Length")
       id measurement map_flag cap_flag
1 5738180      Length        0        1
2  517814      Length        0        1
3 1344920      Length        0        1
4  517833      Length        0        0

An invalid type gives an error like this:

func(df.sample, "ma")
Error in func(df.sample, "ma") : select type among map, cap, and Length
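
Since the signature already lists the allowed values as the default for type, one possible variation is to let base R's match.arg() do the validation. The following is only a sketch under assumptions: the name filter_by_type is made up for illustration, and it uses base subsetting so it runs without dplyr. Note one difference from the function above: match.arg() does partial matching, so "ma" would be accepted as "map" instead of raising an error.

```r
# Sample data from the question
id <- c(5738180, 51845, 167774, 517814, 1344920, 517833, 51844)
measurement <- c("Length", "Breadth", "Breadth", "Length", "Length", "Length", "Breadth")
map_flag <- c(0, 1, 1, 0, 0, 0, 0)
cap_flag <- c(1, 0, 0, 1, 1, 0, 0)
df.sample <- data.frame(id, measurement, map_flag, cap_flag)

# Hypothetical variant; the name filter_by_type is not from the thread
filter_by_type <- function(df, type = c("map", "cap", "Length")) {
  # match.arg() returns the first choice when `type` is missing and
  # stops with an error when `type` matches none of the choices
  type <- match.arg(type)
  # Build a logical row selector for the chosen type
  keep <- switch(type,
    map    = df$map_flag == 1,
    cap    = df$cap_flag == 1,
    Length = df$measurement == "Length"
  )
  df[keep, , drop = FALSE]
}

filter_by_type(df.sample, "cap")   # the three rows with cap_flag == 1
filter_by_type(df.sample)          # no type given: falls back to "map"
```

Whether the partial matching is welcome depends on how strict you want the interface to be; if you need exact matches only, the explicit if/else chain above is the safer choice.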

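An updated version that builds the column name from type:

func <- function(df, type = c("map", "cap", "Length")){
  # Build the flag column name, e.g. "map" -> "map_flag"
  key <- paste0(type, "_flag")
  # filter() errors if the column does not exist; try() captures that error
  res1 <- try(df %>% filter(!!as.symbol(key) == 1), silent = TRUE)
  # Fallback used when there is no matching flag column
  res2 <- df %>% filter(measurement == "Length")
  if (inherits(res1, "try-error")) res2 else res1
}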

CodePudding user response:

This solution only needs base R and works with data.frame or data.table input:

testFun <- function(df, type) {
    
    key <- sprintf('%s_flag', type)

    if(key %in% names(df)) {
        return(df[df[[key]] == 1,])
    } else {
        # For other `type`s fall back to this - I think this is bad!
        return(df[df$measurement == 'Length',])    
    }

}

However, from a design point of view I would much prefer this:

testFun <- function(df, type) {
    if(type == 'map') {
        return(df[df$map_flag == 1,])
    }
    if(type == 'cap') {
        return(df[df$cap_flag == 1,])
    }
    if(type == 'Length') {
        return(df[df$measurement == 'Length',])
    }
    stop(sprintf('Invalid type: %s', type))
}

or to require type to be a column in the input data.frame.
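
To illustrate that last option, here is a minimal sketch in which the caller names the column and the value directly. The function name filter_by_column and its column/value parameters are assumptions made for this example, not something from the question.

```r
# Sample data from the question
id <- c(5738180, 51845, 167774, 517814, 1344920, 517833, 51844)
measurement <- c("Length", "Breadth", "Breadth", "Length", "Length", "Length", "Breadth")
map_flag <- c(0, 1, 1, 0, 0, 0, 0)
cap_flag <- c(1, 0, 0, 1, 1, 0, 0)
df.sample <- data.frame(id, measurement, map_flag, cap_flag)

# Hypothetical helper: `column` must be an existing column name
filter_by_column <- function(df, column, value) {
  if (!column %in% names(df)) {
    stop(sprintf('Unknown column: %s', column))
  }
  # Keep the rows where the chosen column equals `value`
  df[df[[column]] == value, , drop = FALSE]
}

filter_by_column(df.sample, 'map_flag', 1)           # ids 51845 and 167774
filter_by_column(df.sample, 'measurement', 'Length') # the four Length rows
```

This avoids any guessing about what a given type means: a misspelled column name fails loudly instead of silently falling back to another filter.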
