Trying to extract a substring from a larger string containing special characters

Time:11-29

I am trying to extract a substring from the following strings:

x <- "U1  ^Eucalyptus baxteri s.s.,Eucalyptus viminalis subsp. cygnetensis\\^tree\\7\\i;U2 Acacia melanoxylon,Banksia marginata\\tree\\7\\i;M1 ^Leucopogon parviflorus,Spyridium parvifolium,Leucopogon lanceolatus var. lanceolatus\\^shrub\\4\\r;M2 Xanthorrhoea minor subsp. lutea,Pteridium esculentum,Billardiera scandens\\fern,grass-tree,vine\\3\\i;G1 ^Veronica calycina,Brunonia australis,Deyeuxia quadriseta,Dianella revoluta var. revoluta s.l.,Dichelachne crinita\\rush,^forb,tussock grass,other grass,sedge\\2\\c;G2 Lagenophora stipitata,Luzula meridionalis,Lomandra nana,Pimelea humilis,Acrotriche serrulata\\rush,heath shrub,forb,vine,sedge\\1\\i"

I want to extract \7\i, the part that comes immediately after \^tree. However, I am having trouble using the sub() function, which is probably related to the special characters in the main string. Any help is appreciated.

CodePudding user response:

sub(".*?\\^tree([^;]+).*", "\\1", x)
[1] "\\7\\i"

Note that each double backslash in the printed result is just R's escaped representation of a single literal backslash, as cat() shows:

cat(sub(".*?\\^tree([^;]+).*", "\\1", x))
\7\i
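As a possible alternative to sub(), the same piece can be pulled out with regexpr() and regmatches(), which return only the matched text instead of rewriting the whole string. This is a minimal sketch, not part of the original answer: it assumes a shortened stand-in for x with the same \^tree\7\i; layout, and it relies on perl = TRUE so that \K can discard the "^tree" prefix from the reported match.

```r
# Shortened stand-in for the question's string (assumption: same layout,
# i.e. "\^tree" followed by backslash-separated fields ending at ";").
x <- "U1 ^Eucalyptus baxteri s.s.\\^tree\\7\\i;U2 Acacia melanoxylon\\tree\\7\\i"

# "\\^tree" matches the literal text ^tree (the caret must be escaped,
# otherwise it is an anchor). "\\K" drops everything matched so far from
# the result, so only the run of non-semicolon characters is returned.
m <- regmatches(x, regexpr("\\^tree\\K[^;]+", x, perl = TRUE))
cat(m, "\n")

# Split on the literal backslash to get the individual fields.
# fixed = TRUE treats "\\" as a plain one-character separator, not a regex.
fields <- strsplit(m, "\\", fixed = TRUE)[[1]]
fields <- fields[nzchar(fields)]  # drop the empty piece before the first backslash
fields
```

One difference worth knowing: if the pattern does not occur, regmatches() returns character(0), whereas sub() returns the input unchanged, so a failed match is easier to detect with this form.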