using tidyr separate function to split by \ backslash

Time:12-15

I would like to split the text in a column on '\' using the separate function in tidyr. Given this example data...

library(tidyr) 
df1 <- structure(list(Parent.objectId = 1:2, Attachment.path = c("photos_attachments\\photos_image-20220602-192146.jpg", 
    "photos_attachments\\photos_image-20220602-191635.jpg")), row.names = 1:2, class = "data.frame")

And I've tried multiple variations of this...

df2 <- df1 %>%
  separate(Attachment.path,c("a","b","c"),sep="\\",remove=FALSE,extra="drop",fill="right")

Which doesn't result in an error, but it doesn't split the string into two columns, likely because I'm not using the correct regular expression for the single backslash.

CodePudding user response:

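The backslash needs to be escaped twice: once for the R string literal and once for the regular expression, so the pattern is written as "\\\\".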

library(tidyr)
separate(df1, Attachment.path,c("a","b","c"),
        sep= "\\\\", remove=FALSE, extra="drop", fill="right")

According to ?separate, a character sep is interpreted as a regular expression:

sep - ... The default value is a regular expression that matches any sequence of non-alphanumeric values.
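
To see why four backslashes are needed, here is a minimal base R sketch (no tidyr required) that splits one of the example paths. The variable names path, parts and parts_fixed are only illustrative. It also shows strsplit with fixed = TRUE, which skips the regular expression step, so a single level of escaping is enough there.

```r
# One of the example paths; "\\" in R source is a single backslash character.
path <- "photos_attachments\\photos_image-20220602-192146.jpg"

nchar("\\")    # 1: the string literal holds one backslash
nchar("\\\\")  # 2: two backslashes, which a regex reads as one escaped backslash

# Regex split: the pattern string is \\ and matches a literal backslash.
parts <- strsplit(path, "\\\\")[[1]]

# Fixed (non-regex) split: only the string-literal escape is needed.
parts_fixed <- strsplit(path, "\\", fixed = TRUE)[[1]]

print(parts)
print(identical(parts, parts_fixed))
```

Both calls return the folder name and the file name as a two-element character vector, the same split that separate performs once sep is "\\\\".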

CodePudding user response:

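If the goal of splitting on \ is to get the folder and file names, try these two functions instead. Note that R treats \ as a path separator only on Windows, so the output below assumes a Windows session: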

#get filenames
basename(df1$Attachment.path)
# [1] "photos_image-20220602-192146.jpg" "photos_image-20220602-191635.jpg"

#get foldernames
basename(dirname(df1$Attachment.path))
# [1] "photos_attachments" "photos_attachments"
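
Since basename and dirname treat \ as a path separator only on Windows, a platform-independent variant is to convert the backslashes to forward slashes first. This is a sketch under that assumption; the names paths and normalized are illustrative and not part of the original answer.

```r
# Example paths from the question.
paths <- c("photos_attachments\\photos_image-20220602-192146.jpg",
           "photos_attachments\\photos_image-20220602-191635.jpg")

# Replace every literal backslash with a forward slash.
# fixed = TRUE treats the pattern as plain text, not as a regular expression.
normalized <- gsub("\\", "/", paths, fixed = TRUE)

# get filenames
files <- basename(normalized)

# get foldernames
folders <- basename(dirname(normalized))

print(files)
print(folders)
```

Forward slashes are accepted as path separators on every platform R runs on, so this gives the same file and folder names regardless of the operating system.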