Replace leading period (".") in list of pathname strings R

Time:11-11

I need to replace the first period (".") in each string of a list of pathnames. I know I should be able to do this with regex, but I can't find a good example.

Here is an example of the pathnames:

paths <- c("/Users/user/Study/data/ARaw_Sequences/16S_raw_sequences/RemovePrimer_Final.274571-3-2022_1.fq.gz", 
"/Users/user/Documents/R/Study/data/Raw_Sequences/16S_raw_sequences/RemovePrimer_Final.274575-15-2022_1.fq.gz")

I need to replace the period between Final and the id code 27... with an underscore, so it ends up like this:

RemovePrimer_Final_274575-15-2022_1.fq.gz

Answer:

basename() returns the filename from each file path, and sub() replaces only the first match, here the first '.' character.

paths <- c("/Users/user/Study/data/ARaw_Sequences/16S_raw_sequences/RemovePrimer_Final.274571-3-2022_1.fq.gz", 
           "/Users/user/Documents/R/Study/data/Raw_Sequences/16S_raw_sequences/RemovePrimer_Final.274575-15-2022_1.fq.gz")

sub("\\.", "_", basename(paths))
#> [1] "RemovePrimer_Final_274571-3-2022_1.fq.gz" 
#> [2] "RemovePrimer_Final_274575-15-2022_1.fq.gz"

Created on 2022-11-10 with reprex v2.0.2

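Edit: Removing basename() applies the same replacement to the full path. sub() then replaces the first '.' anywhere in the string, so this only gives the intended result when no directory name contains a period, as is the case here.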

paths <- c("/Users/user/Study/data/ARaw_Sequences/16S_raw_sequences/RemovePrimer_Final.274571-3-2022_1.fq.gz", 
           "/Users/user/Documents/R/Study/data/Raw_Sequences/16S_raw_sequences/RemovePrimer_Final.274575-15-2022_1.fq.gz")

sub("\\.", "_", paths)
#> [1] "/Users/user/Study/data/ARaw_Sequences/16S_raw_sequences/RemovePrimer_Final_274571-3-2022_1.fq.gz"            
#> [2] "/Users/user/Documents/R/Study/data/Raw_Sequences/16S_raw_sequences/RemovePrimer_Final_274575-15-2022_1.fq.gz"
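If a directory name might contain a period, one possible approach is to rewrite only the filename and then reattach the directory. The sketch below uses a hypothetical path (the "Study.v2" directory is made up for illustration and is not from the question) to show the difference.

```r
# Hypothetical path whose directory name contains a period (assumption for illustration)
p <- "/Users/user/Study.v2/data/RemovePrimer_Final.274571-3-2022_1.fq.gz"

# sub() on the whole path changes the directory name, not the filename
whole <- sub("\\.", "_", p)

# Rewrite only the filename, then reattach the untouched directory
rebuilt <- file.path(dirname(p), sub("\\.", "_", basename(p)))

print(whole)
print(rebuilt)
```

Here whole has "Study_v2" in the directory part and the filename unchanged, while rebuilt keeps the directory as it was and only changes "Final." to "Final_".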

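The double backslashes are required so that R looks for a literal . character. In an R string, "\\." is the regex \. (an escaped period). An unescaped "." in a regex matches any character, so sub(".", "_", x) would replace the first character of the string instead. Escaping the period turns off that wildcard behaviour and gives the explicit match we want.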
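A small comparison may make the difference concrete. This is a sketch on one of the filenames from the question; the variable names are only for illustration. It also shows fixed = TRUE, an argument of sub() that treats the pattern as a plain string so no escaping is needed, and gsub(), which would replace every period rather than just the first.

```r
x <- "RemovePrimer_Final.274571-3-2022_1.fq.gz"

# Unescaped "." matches any character, so the first character is replaced
unescaped <- sub(".", "_", x)

# "\\." is the regex \. and matches a literal period
escaped <- sub("\\.", "_", x)

# fixed = TRUE treats the pattern as a plain string instead of a regex
literal <- sub(".", "_", x, fixed = TRUE)

# gsub() replaces every period, not just the first one
all_dots <- gsub("\\.", "_", x)

print(c(unescaped, escaped, literal, all_dots))
```

Only escaped and literal give RemovePrimer_Final_274571-3-2022_1.fq.gz. The unescaped version changes the leading "R", and gsub() also rewrites the .fq.gz extension.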
