I have a character vector that has the following structure:
[1] "Atp|Barcelona|Concentration(ng/mL)|8|FALSE"
I want to extract the third element (splitting on the | symbol) and then remove from that string everything from the ( symbol onward.
So I would get this character:
[1] "Concentration"
First I split on the | symbol, then take the third element of the resulting list.
To be able to use gsub, I convert it to character and then apply the gsub function, as follows.
y <- "Atp|Barcelona|Concentration(ng/mL)|8|FALSE"
y <- strsplit(y, "\\|")
y <- y[[1]][3]
y <- as.character(y)
gsub("(.*","",y)
However, this raises the following error:
invalid regular expression '(.*', reason 'Missing ')''
CodePudding user response:
You may use strsplit with unlist here:
x <- "Atp|Barcelona|Concentration(ng/mL)|8|FALSE"
output <- unlist(strsplit(x, "\\|"))[3]
output
[1] "Concentration(ng/mL)"
If some inputs might have fewer than two | separators, first check the length of the vector returned above before trying to access the third element.
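The length check described above could look something like the sketch below. The helper name get_field and its NA fallback are assumptions made for illustration, not part of the original answer; fixed = TRUE is used so that | is treated as a literal character rather than regex alternation.

```r
# Hypothetical helper: return the n-th separator-delimited field of a single
# string, or NA if the string has fewer than n fields.
get_field <- function(x, n, sep = "|") {
  # fixed = TRUE treats sep literally, so "|" needs no escaping
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  if (length(parts) >= n) parts[n] else NA_character_
}

third_ok      <- get_field("Atp|Barcelona|Concentration(ng/mL)|8|FALSE", 3)
third_missing <- get_field("Atp|Barcelona", 3)
```

With this sketch, third_ok holds the third field and third_missing is NA instead of silently indexing past the end of the vector.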
CodePudding user response:
First of all, you don't need y <- as.character(y), since the result is already of class "character".
Second, your problem lies in the pattern passed to gsub(): an unescaped ( opens a regex group, so the opening parenthesis has to be escaped as \\( to match it literally. Your full code should therefore be:
y <- "Atp|Barcelona|Concentration(ng/mL)|8|FALSE"
y <- strsplit(y, "\\|")
y <- y[[1]][3]
gsub("\\(.*","",y)
[1] "Concentration"
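For comparison, here is a short sketch of a few equivalent ways to drop everything from the first ( onward. The variable names are illustrative only, and sub() is used instead of gsub() on the assumption that only the first match matters, since .* already consumes the rest of the string.

```r
y <- "Concentration(ng/mL)"

# 1. Escape the parenthesis with a double backslash, as in the answer above.
via_escape <- sub("\\(.*", "", y)

# 2. Put the parenthesis in a character class, where it is taken literally.
via_class <- sub("[(].*", "", y)

# 3. Avoid regex entirely: split on a literal "(" and keep the first piece.
via_split <- strsplit(y, "(", fixed = TRUE)[[1]][1]
```

All three give the same string for this input; the third avoids regular expressions altogether, which may be easier to read when the delimiter is a regex metacharacter.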