I have some columns in a data frame that look like this:
df <- data.frame(act=c("DEC S/N, de 21/06/2006",
"DEC S/N, de 05/06/2006",
"DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"), adj=NA)
I would like to copy everything after the first ; (MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012) in the column 'act' into the column 'adj', ideally removing the space that would be left at the start of the cut-off string. All other cells, that is, those where the string in column 'act' does not contain a ;, should be left as NA in column 'adj'.
CodePudding user response:
Here I'll use ifelse() with grepl() to check whether act contains a ";", then use a regex to capture everything after the first ";" and write it into the adj column. Rows without a ";" keep their existing adj value (NA).
library(dplyr)
df %>% mutate(adj = ifelse(grepl(";", act),
                           # lazy match up to the first "; ", then capture the rest
                           gsub("^.+?(?<=;) (.+?)$", "\\1", act, perl = TRUE),
                           adj))
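For comparison, the same conditional fill can be sketched in base R without dplyr or lookbehinds. This is only an illustration on the sample data from the question; the helper name has_sep is made up for the example, and the pattern assumes that "everything after the first ;" means the text following the first semicolon with any leading whitespace dropped.

```r
df <- data.frame(act = c("DEC S/N, de 21/06/2006",
                         "DEC S/N, de 05/06/2006",
                         "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"),
                 adj = NA)

# TRUE for rows whose 'act' contains a literal ";" (has_sep is a hypothetical helper name)
has_sep <- grepl(";", df$act, fixed = TRUE)

# sub() replaces only the first match: everything up to and including
# the first ";" plus any whitespace that follows it
df$adj <- ifelse(has_sep,
                 sub("^[^;]*;\\s*", "", df$act),
                 NA_character_)

df$adj
```

Because [^;]* cannot cross a semicolon, the match always stops at the first ";", so later separators stay in the result.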
CodePudding user response:
Using str_match from stringr:
df <- data.frame(act=c("DEC S/N, de 21/06/2006",
"DEC S/N, de 05/06/2006",
"DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"), adj=NA)
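library(dplyr)
library(stringr)
# \\s* drops the space after the first ";"; rows without a ";" give NA
df %>% mutate(adj = str_match(act, "[^;]*;\\s*(.*)")[, 2])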
CodePudding user response:
Using stringr::str_extract:
df$adj <- stringr::str_extract(df$act, '(?<=;\\s)(.*)')
df$adj
#[1] NA NA "MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012"
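Note that the lookbehind above expects exactly one whitespace character right after the semicolon. If some rows might have no space, or several, after the ";", a regex-free variant in base R could look like the sketch below. The vector act and its last two elements are invented here purely to illustrate that case; they are not from the question.

```r
# Sample strings; the last two are hypothetical edge cases (no space / extra spaces)
act <- c("DEC S/N, de 21/06/2006",
         "DEC S/N, de 21/06/2006; MP 542, de 12/08/2011; LEI 12.678, de 25/06/2012",
         "A;B",
         "A;   B; C")

# Position of the first literal ";" in each string, or -1 if there is none
pos <- regexpr(";", act, fixed = TRUE)

# Take everything after that position and strip surrounding whitespace;
# strings without a ";" become NA
adj <- ifelse(pos > 0,
              trimws(substring(act, pos + 1)),
              NA_character_)

adj
```

trimws() also strips trailing whitespace, which may or may not be wanted; pass which = "left" to trim only the start.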
CodePudding user response:
Using extract from tidyr:
library(dplyr)
library(tidyr)
df %>%
  extract(act, 'adj', '; (.*)', remove = FALSE)
or even try:
df %>%
  separate(act, c('act1', 'adj'), '; ',
           extra = 'merge', fill = 'right', remove = FALSE)