Why is my use of stringr stripping out too many characters?

Time:08-04

When running the demo code below, the user drags and drops elements from the top menu into the section beneath it to build a data frame. The first data frame (output name choices), which appears in the upper right of the panel, renders fine. For example, if I drag in the elements Iron/Iron/Neon/Iron, in that order, choices renders correctly like this:

     choice decayCode
1 1. Iron A         1
2 2. Iron B         1
3 3. Neon A         1
4 4. Iron C         1

However, the second DF choices2, renders incorrectly as shown here (choices2 DF is simply a mirror of the above choices DF but stripping out a few characters as explained below):

  choice decayCode
1      A         1
2      B         1
3      A         1
4      C         1

I would like the choices2 DF to instead render as shown below, having stripped out the numeric prefix, ".", and space from each row of the "choice" column of the choices DF rendered above it while leaving in the name of each element followed by its sequential lettering suffix:

  choice decayCode
1 Iron A         1
2 Iron B         1
3 Neon A         1
4 Iron C         1

I must be doing something wrong in my use of character-stripping code using the stringr package in this line: dat %>% mutate(choice=stringr::str_sub(choice, -1)), where I tried stripping out the numeric/"."/space prefix of each listed element. What am I doing wrong here?

I'm open to character-stripping in base R; I prefer base R if it's possible since I'm trying to learn.

Demo code:

library(dplyr)
library(jsTreeR)
library(shiny)
library(stringr)

nodes <- list(
  list(
    text = "Pick elements from this list:",
    state = list(opened = TRUE),
    children = list(
      list(text = "Iron",type = "moveable"),
      list(text = "Neon",type = "moveable")
    )
  ),
  list(
    text = "Drag here to build DF:",
    type = "target",
    state = list(opened = TRUE)
  )
)

dnd <- list(
  always_copy = TRUE,
  inside_pos = "last", 
  is_draggable = JS(
    "function(node) {",
    "  return node[0].type === 'moveable';",
    "}"
  )
)

mytree <- jstree(
  nodes, 
  dragAndDrop = TRUE, dnd = dnd, 
  checkCallback = checkCallback,
  contextMenu = list(items = customMenu),
  types = list(moveable = list(), target = list())
)

script <- '
$(document).ready(function(){
  var LETTERS = ["A", "B", "C", "D"];
  var Visited = {};
  $("#mytree").on("copy_node.jstree", function(e, data){
    var oldid = data.original.id;
    var visited = Object.keys(Visited);
    if(visited.indexOf(oldid) === -1){
      Visited[oldid] = 0;
    }else{
      Visited[oldid]++;
    }
    var letter = LETTERS[Visited[oldid]];
    var node = data.node;
    var id = node.id;
    var index = $("#" + id).index() + 1;
    var text = index + ". " + node.text + " " + letter;
    Shiny.setInputValue("choice", text);
    var instance = data.new_instance;
    instance.rename_node(node, text);
  });
});
'

ui <- fluidPage(
  tags$div(class = "header", checked = NA,tags$p(tags$script(HTML(script)))),
  fluidRow(
    column(width = 4,jstreeOutput("mytree")),
    column(width = 8,fluidRow(verbatimTextOutput("choices"),verbatimTextOutput("choices2")))
  )
)

server <- function(input, output, session){
  output[["mytree"]] <- renderJstree(mytree)
  
  Choices <- reactiveVal(data.frame(choice = character(0), decayCode = numeric(0)))
  
  observeEvent(input[["choice"]], {Choices(rbind(Choices(), data.frame(choice = input[["choice"]], decayCode = 1)))} )

  output[["choices"]] <- renderPrint({Choices()})
  
  output[["choices2"]] <- renderPrint({
    dat <- Choices()[rep(row.names(Choices()), Choices()[,2]), 1:2]
    ifelse(nrow(dat) == 0, dat, row.names(dat) <- seq(1:nrow(dat)))
    dat <- dat %>% mutate(choice=stringr::str_sub(choice, -1))
    dat
  })

}

shinyApp(ui=ui, server=server)

CodePudding user response:

In str_sub(), negative start and end values “count backwards from the last character”, so with start = -1 you extract the substring that begins at the last character, which is just the trailing letter.
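To make the counting concrete, here is a small sketch of how negative positions behave in str_sub(). The input string is taken from the question; the variable names are only illustrative.

```r
library(stringr)

x <- "1. Iron A"  # 9 characters long

# start = -1 begins at the last character, so only "A" is returned
last_only <- str_sub(x, -1)

# start = -6 begins six characters from the end
last_six <- str_sub(x, -6)

# negative start and end can be combined to cut out a middle piece
middle <- str_sub(x, -6, -3)

print(last_only)  # "A"
print(last_six)   # "Iron A"
print(middle)     # "Iron"
```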

To remove a fixed number of characters from the start, use a positive start value instead:

stringr::str_sub("1. Iron A", 4)
#> [1] "Iron A"
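Since the question asks for base R where possible, the same fixed-position extraction can be sketched with substring(), which needs no extra package. The vector x below is a made-up stand-in for the choice column.

```r
# Illustrative stand-in for the "choice" column from the question
x <- c("1. Iron A", "2. Iron B", "3. Neon A")

# Base R counterpart of str_sub(x, 4): keep everything from the 4th character on
from_fourth <- substring(x, 4)

# Base R counterpart of str_sub(x, -1): keep only the last character
last_char <- substring(x, nchar(x))

print(from_fourth)  # "Iron A" "Iron B" "Neon A"
print(last_char)    # "A" "B" "A"
```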

CodePudding user response:

Another, more general solution is to use regular expressions. I think this is the better idea, since it also covers cases where the number to delete has more than one digit. In the regular expression I am using, ^ stands for the start of the string, [0-9]+ for one or more digits, \\. for a literal period, and the final space for a space. gsub() then simply deletes the characters that match that pattern.

library(dplyr)

df |>
  mutate(new = gsub("^[0-9]+\\. ", "", choice))

#     choice decayCode    new
#1 1. Iron A         1 Iron A
#2 2. Iron B         1 Iron B
#3 3. Neon A         1 Neon A
#4 4. Iron C         1 Iron C
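For a version without dplyr, the same pattern can be applied with base R's sub(). This is only a sketch: the data frame below is a hypothetical stand-in for what the Shiny app builds, with some multi-digit prefixes added to show why the pattern is safer than a fixed character count.

```r
# Hypothetical stand-in for the data frame built by the Shiny app
choices <- data.frame(
  choice = c("1. Iron A", "9. Neon A", "10. Iron B", "123. Neon B"),
  decayCode = 1
)

# Drop the leading "<digits>. " prefix, whatever its width.
# The pattern is anchored with ^, so at most one match per string exists
# and sub() is sufficient.
choices$choice <- sub("^[0-9]+\\. ", "", choices$choice)

print(choices)
```

In the question's server function, a call of the same shape, dat$choice <- sub("^[0-9]+\\. ", "", dat$choice), could stand in for the mutate() line.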