I'm sorry, because I feel like versions of this question have been asked many times, but I simply cannot find code from other examples that works in this case. I have a column where all the information I want is stored between two sets of "%%", and I want to extract the text between those two delimiters and put it into a new column, in this case called df$empty.
This is a long column, but in all cases I just want the information between the two sets of "%%". Is there a way to do this across the whole column?
To be specific, in this example I want a new column that contains "information", "wanted".
empty <- c('NA', 'NA')
information <- c('notimportant%%information%%morenotimportant', 'ignorethis%%wanted%%notthiseither')
df <- data.frame(information, empty)
Answer:
In this case you can do:
df$empty <- sapply(strsplit(df$information, '%%'), '[', 2)
#                                   information       empty
# 1 notimportant%%information%%morenotimportant information
# 2           ignorethis%%wanted%%notthiseither      wanted
That is, split the text on '%%' and take the second element of each resulting vector.
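To make the intermediate step visible, here is a minimal sketch that rebuilds the example data and shows what strsplit() returns before the second element is picked out. The name `parts` is introduced only for illustration, and `fixed = TRUE` is an optional addition that treats '%%' as a literal string rather than a regular expression.

```r
information <- c('notimportant%%information%%morenotimportant',
                 'ignorethis%%wanted%%notthiseither')
df <- data.frame(information, stringsAsFactors = FALSE)

# strsplit() returns a list with one character vector per row
parts <- strsplit(df$information, '%%', fixed = TRUE)
# parts[[1]] is c("notimportant", "information", "morenotimportant")

# '[' with index 2 extracts the second piece of each vector
df$empty <- sapply(parts, '[', 2)
df$empty
# [1] "information" "wanted"
```

If a row contains no '%%' at all, its vector has only one piece, so asking for the second element gives NA for that row.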
Or you can get the same result using sub():
df$empty <- sub('.*%%(.+)%%.*', '\\1', df$information)
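A sub() pattern of this kind can be read piece by piece. The sketch below assumes each string contains exactly one pair of '%%' delimiters, as in the question. The strings with three delimiters and the names `result`, `extra` and `first` are hypothetical, added only to show how greedy and lazy matching differ.

```r
information <- c('notimportant%%information%%morenotimportant',
                 'ignorethis%%wanted%%notthiseither')

# .*     anything before the opening delimiter
# %%     the opening delimiter
# (.+)   capture group 1: the text to keep
# %%     the closing delimiter
# .*     anything after it
# '\\1' replaces the whole match with capture group 1
result <- sub('.*%%(.+)%%.*', '\\1', information)
result
# [1] "information" "wanted"

# Hypothetical input with three delimiters: the greedy leading '.*'
# makes the pattern keep the text between the last two ("c").
extra <- sub('.*%%(.+)%%.*', '\\1', 'a%%b%%c%%d')

# Lazy quantifiers (with perl = TRUE) keep the text between the first two ("b").
first <- sub('^.*?%%(.*?)%%.*$', '\\1', 'a%%b%%c%%d', perl = TRUE)
```

Note that sub() returns a string unchanged when the pattern does not match, whereas the strsplit() approach gives NA for such rows.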