R - Passing unquoted alpha-numeric string 123-ARGS-1A1-123456 as argument in function

Time:04-13

There must be 50 different questions on passing unquoted strings as arguments, but none seem to involve alpha-numeric strings with hyphens, which appear to be a complicating factor.

I have a large list of text files, named with hyphenated alpha-numeric strings. All of them start with ###-XXXX but can then continue with -###-#### or -XXX-XX#### or similar variations, which is to say that the naming convention in the database that creates the .txt files is not fixed. I'm trying to write a function that allows the operator to enter the alpha-numeric part number as an unquoted string, without appending the ".txt" file extension.

partData <- function(partNum) { 
dataRaw <- read.table(paste0(partNum,".txt") , 
                    header = TRUE,
                    sep = ",",
                    fileEncoding = "utf-16",
                    quote = ";"
                    )
}

This code works as long as the input is enclosed in quotes. To reduce the chance of input error, I've been trying to remove that requirement, so that the operator can simply type the part number. bquote(), deparse(), substitute() and similar all return a string with extraneous spaces and stripped leading zeros

bquote(123-ARGS-111-123456)
123 - ARGS - 111 - 123456

bquote(123-ARGS-000-003456)
123 - ARGS - 0 - 3456

deparse(substitute(123-ARGS-000-123456))
[1] "123 - ARGS - 0 - 123456"

or they raise errors

bquote(123-ARGS-1A1-123456)
Error: unexpected symbol in "bquote(123-ARGS-1A1"

deparse(123-ARGS-000-003456)
Error in deparse(123 - ARGS - 0 - 3456) : object 'ARGS' not found

How can I code this unquoted input correctly?

CodePudding user response:

If we want to pass unquoted input, use deparse(substitute())

partData <- function(partNum) { 
   partNum <- deparse(substitute(partNum))
   dataRaw <- read.table(paste0(partNum,".txt") , 
                    header = TRUE,
                    sep = ",",
                    fileEncoding = "utf-16",
                    quote = ";"
                    )
   return(dataRaw)
 }

If the input starts with a digit (or is otherwise not a syntactically valid name), wrap it in backticks

partData(`123-ARGS-111-123456`)

Testing:

> partData <- function(partNum) { 
    partNum <- deparse(substitute(partNum))
  partNum
}
> partData(123-ARGS-111-123456)
[1] "123 - ARGS - 111 - 123456"
> partData(`123-ARGS-111-123456`)
[1] "123-ARGS-111-123456"
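
The difference between the two calls comes from the parser. Without backticks the argument is an arithmetic expression, and its numeric literals are normalized (so 000 becomes 0). With backticks it is a single symbol whose name is kept verbatim. A minimal sketch of this, where capture_arg is a hypothetical helper name used only for illustration:

```r
# capture_arg() is a hypothetical helper: it returns the text of its
# argument without evaluating it.
capture_arg <- function(x) {
  deparse(substitute(x))
}

# Unquoted: parsed as subtraction, so leading zeros are lost
capture_arg(123-ARGS-000-003456)
# [1] "123 - ARGS - 0 - 3456"

# Backticked: parsed as one symbol, so the text survives unchanged
capture_arg(`123-ARGS-000-003456`)
# [1] "123-ARGS-000-003456"

class(quote(123-ARGS-000-003456))    # "call"
class(quote(`123-ARGS-000-003456`))  # "name"
```

Because the zeros are dropped at parse time, no function called afterwards can recover them from the unquoted form.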

CodePudding user response:

I really think this is not a good use of non-standard evaluation to capture invalid symbol names. If you are really going to have to deal with the - operator, then you could actually create your own environment to evaluate the expression to treat - as a paste operation.

So you could have a helper function like

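fakepaste <- function(x, xquoted = substitute(x)) {
  # Private environment in which `-` pastes instead of subtracting
  ee <- new.env()
  mp <- function(a, b) {
    # Capture both operands unevaluated
    a <- substitute(a)
    b <- substitute(b)

    # Convert one operand to text
    ex <- function(x) {
      if (is.numeric(x) || is.character(x)) {
        as.character(x)
      } else if (is.symbol(x)) {
        deparse(x)
      } else {
        # Nested call: evaluate it where `-` still resolves to mp
        eval(x, parent.frame())
      }
    }

    paste0(ex(a), "-", ex(b))
  }
  environment(mp) <- ee
  ee$`-` <- mp
  eval(xquoted, ee)
}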

That would run like

fakepaste(123-ARGS-111-123456)
# [1] "123-ARGS-111-123456"

You do need to pass in an unevaluated expression, so if you need to wrap it in another function you would need to do something like

foo <- function(x) {
  fakepaste(xquoted=substitute(x))
}
foo(123-ARGS-111-123456)
# [1] "123-ARGS-111-123456"

This is all conditional on the value you are passing in being a valid R expression. For example this would never work

fakepaste(12-34ARGS)

because that cannot be parsed.
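
To see which part numbers fall into this category, one can ask the parser directly. The following is only a sketch; can_parse is a hypothetical helper name, and it checks syntax only, not whether the result would be what the operator typed.

```r
# can_parse() is a hypothetical helper: TRUE if `txt` is syntactically
# valid R code, FALSE if the parser rejects it.
can_parse <- function(txt) {
  tryCatch({
    parse(text = txt)
    TRUE
  }, error = function(e) FALSE)
}

can_parse("123-ARGS-111-123456")  # TRUE
can_parse("123-ARGS-1A1-123456")  # FALSE: 1A1 is a number followed by a symbol
can_parse("12-34ARGS")            # FALSE
```

Any part number for which this returns FALSE fails before the function is even called, so no amount of non-standard evaluation inside the function can rescue it.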

It would be much easier to write a function that allows proper character values or quoted symbol names.
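
A minimal sketch of such a function follows. The name part_file is hypothetical, and it only builds the file name rather than reading the file, so it can be run without any data present. It assumes that a bare symbol is always meant as a literal part number, which means a variable holding a part number cannot be passed this way.

```r
# part_file() is a hypothetical helper: accepts a quoted string or a
# backticked name and returns the matching .txt file name.
part_file <- function(partNum) {
  arg <- substitute(partNum)
  if (is.character(arg)) {
    name <- arg                 # "123-ARGS-1A1-123456"
  } else if (is.symbol(arg)) {
    name <- as.character(arg)   # `123-ARGS-1A1-123456`
  } else {
    # Anything else was parsed as an expression and may already be mangled
    stop("enclose the part number in quotes or backticks")
  }
  paste0(name, ".txt")
}

part_file("123-ARGS-1A1-123456")
# [1] "123-ARGS-1A1-123456.txt"
part_file(`123-ARGS-1A1-123456`)
# [1] "123-ARGS-1A1-123456.txt"
```

An unquoted call such as part_file(123-ARGS-111-123456) stops with the error message instead of silently producing a wrong file name.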

An alternative approach would be to collapse the abstract syntax tree generated by the parser. Here's perhaps a more direct alternative

collapse_ast <- function(ast) {
  if (is.symbol(ast)) {
    deparse1(ast)
  } else if (is.character(ast) || is.numeric(ast)) {
    as.character(ast)
  } else {
    ast <- as.list(ast)
    paste0(collapse_ast(ast[[2]]), collapse_ast(ast[[1]]), collapse_ast(ast[[3]]))
  }
}

collapse_ast(quote(123-ARGS-111-123456))
# [1] "123-ARGS-111-123456"

This code makes the strong assumption that all operators found are infix operators with exactly two operands.
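
That assumption can be made explicit so that anything else fails with a clear message. The variant below is a sketch, not the code above: collapse_ast_checked is a hypothetical name, and the length check is an added guard. It also shows that leading zeros are still lost, because the parser has already converted them to numbers.

```r
# collapse_ast_checked() is a hypothetical variant that rejects anything
# other than symbols, constants, and calls with exactly two arguments.
collapse_ast_checked <- function(ast) {
  if (is.symbol(ast)) {
    as.character(ast)
  } else if (is.character(ast) || is.numeric(ast)) {
    as.character(ast)
  } else if (is.call(ast) && length(ast) == 3L) {
    # ast[[1]] is the operator, ast[[2]] and ast[[3]] are its operands
    paste0(collapse_ast_checked(ast[[2]]),
           collapse_ast_checked(ast[[1]]),
           collapse_ast_checked(ast[[3]]))
  } else {
    stop("unsupported expression: ", paste(deparse(ast), collapse = " "))
  }
}

collapse_ast_checked(quote(123-ARGS-111-123456))
# [1] "123-ARGS-111-123456"
collapse_ast_checked(quote(123-ARGS-000-003456))
# [1] "123-ARGS-0-3456"
```

A unary minus such as quote(-ARGS) has only one operand and therefore raises the "unsupported expression" error.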
