I am trying to extract a substring from the following string:
x <- "U1 ^Eucalyptus baxteri s.s.,Eucalyptus viminalis subsp. cygnetensis\\^tree\\7\\i;U2 Acacia melanoxylon,Banksia marginata\\tree\\7\\i;M1 ^Leucopogon parviflorus,Spyridium parvifolium,Leucopogon lanceolatus var. lanceolatus\\^shrub\\4\\r;M2 Xanthorrhoea minor subsp. lutea,Pteridium esculentum,Billardiera scandens\\fern,grass-tree,vine\\3\\i;G1 ^Veronica calycina,Brunonia australis,Deyeuxia quadriseta,Dianella revoluta var. revoluta s.l.,Dichelachne crinita\\rush,^forb,tussock grass,other grass,sedge\\2\\c;G2 Lagenophora stipitata,Luzula meridionalis,Lomandra nana,Pimelea humilis,Acrotriche serrulata\\rush,heath shrub,forb,vine,sedge\\1\\i"
I want to extract 7\i, the part that comes immediately after \^tree. However, I am having trouble using the sub() function, probably because of the special characters in the main string. Any help is appreciated.
Answer:
sub(".*?\\^tree([^;]+).*", "\\1", x)
[1] "\\7\\i"
Note that each double backslash in the printed result is just a single literal backslash; print() shows the escaped form, while cat() writes the raw characters:
cat(sub(".*?\\^tree([^;]+).*", "\\1", x))
\7\i
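One caveat: sub() returns its input unchanged when the pattern does not match, so a missing marker would silently give back the whole string. If that matters, a possible alternative is to find the marker literally with regexpr(..., fixed = TRUE), which avoids regex escaping for the ^ altogether. This is only a sketch: the string below is a shortened stand-in for the one in the question, and the names hit, pos, rest and result are illustrative.

```r
# Shortened stand-in for the string in the question (same escaping).
x <- "U1 ^Eucalyptus baxteri s.s.\\^tree\\7\\i;U2 Acacia melanoxylon\\tree\\7\\i"

# fixed = TRUE treats the pattern as plain text, so "^" is not special.
# "\\^tree" in R source is the six characters \^tree.
hit <- regexpr("\\^tree", x, fixed = TRUE)
pos <- as.integer(hit)
len <- attr(hit, "match.length")

if (pos == -1L) {
  result <- NA_character_            # marker not found
} else {
  rest   <- substring(x, pos + len)  # everything after the marker
  result <- sub(";.*", "", rest)     # keep only the text before the next ";"
}

print(result)   # [1] "\\7\\i"
nchar(result)   # 4, i.e. \ 7 \ i
```

The nchar() call confirms that the result holds four characters, so each backslash is stored once even though print() displays it doubled.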