How to get the words within the first pair of single quotes in R using regex?

Time:10-05

For example, I have a string like this:

stringA = "'contentX' is not one of ['Illumina NovaSeq 6000', 'Other', 'Ion Torrent PGM', 'Illumina HiSeq X Ten', 'Illumina HiSeq 4000', 'Illumina NextSeq', 'Complete Genomics', 'Illumina Genome Analyzer II']"

I am not familiar with regex and am stuck trying to extract the words within the first pair of single quotes.

My attempt:

## do regex here
gsub("'(.*)'", "\\1", stringA) # not working

Expected output:

> "contentX"

CodePudding user response:

For your example, the pattern would be:

gsub("^'(.*?)'.*", "\\1", stringA)

https://regex101.com/r/bs3lwJ/1

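First, ^' asserts that we are at the beginning of the string and that the next character is a single quote. Then (.*?)' captures everything up to the next single quote in group 1. The trailing .* consumes the rest of the string, so replacing the whole match with \\1 leaves only the captured text.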
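If it helps to see the capture group on its own, here is a minimal sketch using base R's regexec() and regmatches(). The shortened stringA and the variable names m and first_quoted are only for illustration, and perl = TRUE is an assumption added here to make the lazy-quantifier behaviour explicit; it is not part of the pattern above.

```r
# Shortened version of the string from the question (illustration only)
stringA <- "'contentX' is not one of ['Illumina NovaSeq 6000', 'Other']"

# regexec() reports the full match plus each capture group;
# regmatches() turns those positions into the matched substrings.
# perl = TRUE is an assumption of this sketch, not required by the answer.
m <- regmatches(stringA, regexec("^'(.*?)'", stringA, perl = TRUE))

# m[[1]][1] is the full match (quotes included), m[[1]][2] is group 1
first_quoted <- m[[1]][2]
print(first_quoted)
```

This extracts the group directly instead of replacing everything around it, which can be easier to reason about when the rest of the string varies.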

Note that we need the ? in .*?; otherwise .* would be "greedy" and match all the way through to the last single quote in the string, rather than stopping at the next one.
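To make the greedy versus lazy difference concrete, the sketch below runs both variants on a shortened version of the string. The names greedy, lazy and negated are hypothetical. The negated character class [^']* is an alternative added here for comparison, not something the answer itself proposes. sub() is used instead of gsub() because the anchored pattern can only match once.

```r
# Shortened version of the string from the question (illustration only)
stringA <- "'contentX' is not one of ['Illumina NovaSeq 6000', 'Other']"

# Greedy: (.*) runs to the LAST single quote, so far too much is captured
greedy <- sub("^'(.*)'.*", "\\1", stringA)

# Lazy: (.*?) stops at the NEXT single quote
lazy <- sub("^'(.*?)'.*", "\\1", stringA)

# Alternative (an assumption of this sketch): [^']* cannot cross a quote,
# so no lazy quantifier is needed
negated <- sub("^'([^']*)'.*", "\\1", stringA)

print(greedy)   # still contains the later quoted items
print(lazy)     # "contentX"
print(negated)  # "contentX"
```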
