I have a vector of strings:
A <- c("Hello world", "Green 44", "Hot Beer", "Bip 6t")
I want to add an asterisk (*) at the beginning and at the end of every first word like this:
"*Hello* world", "*Green* 44", "*Hot* Beer", "*Bip* 6t"
It makes sense to use str_replace() from stringr.
However, I am struggling with regex to match the first word of each string.
The best I have managed so far is:
str_replace(A, "^([A-Z])", "*\\1*")
"*H*ello world", "*G*reen 44", "*H*ot Beer", "*B*ip 6t"
I expected this to be a straightforward task, but I am not getting along with regex.
Thanks!
CodePudding user response:
You were almost there
str_replace(A, "(^.*) ", "*\\1* ")
#> [1] "*Hello* world" "*Green* 44" "*Hot* Beer" "*Bip* 6t"
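One caveat worth illustrating: .* is greedy, so the pattern above wraps everything up to the last space, not just the first word. That makes no difference for the sample vector, where each string contains a single space. The sketch below uses base R sub() so that it runs without stringr, and the three-word input is a made-up example, not taken from the question.

```r
# Hypothetical three-word input to show the greedy behaviour of ".*"
x <- "Hot cold Beer"

# Greedy: ".*" runs up to the LAST space, so two words get wrapped
greedy <- sub("(^.*) ", "*\\1* ", x)

# Non-whitespace run anchored at the start: only the first word is wrapped
first_only <- sub("^(\\S+)", "*\\1*", x)

print(greedy)      # "*Hot cold* Beer"
print(first_only)  # "*Hot* cold Beer"
```

For strings with exactly one space, both patterns give the same result.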
CodePudding user response:
You can use
sub("([[:alpha:]]+)", "*\\1*", A)
## => [1] "*Hello* world" "*Green* 44" "*Hot* Beer" "*Bip* 6t"
The stringr equivalent is
library(stringr)
stringr::str_replace(A, "([[:alpha:]]+)", "*\\1*")
stringr::str_replace(A, "(\\p{L}+)", "*\\1*")
The ([[:alpha:]]+) regex matches and captures one or more letters into Group 1, and the *\1* replacement replaces the match with * + the Group 1 value + *.
Note that sub finds and replaces the first match only, so only the first word is affected in each element of the character vector.
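To make the "first match only" point concrete, here is a small sketch comparing sub() with gsub() on one of the sample strings. The variable names are illustrative only.

```r
x <- "Hot Beer"

# sub() stops after the first match, so only the first word is wrapped
first_match <- sub("([[:alpha:]]+)", "*\\1*", x)

# gsub() replaces every match, so every run of letters is wrapped
all_matches <- gsub("([[:alpha:]]+)", "*\\1*", x)

print(first_match)  # "*Hot* Beer"
print(all_matches)  # "*Hot* *Beer*"
```

In stringr, the same distinction exists between str_replace() and str_replace_all().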
Notes
- If you plan to wrap the word exactly at the start of a string (not just the "first word"), add ^ at the start of the pattern, e.g. sub("^([[:alpha:]]+)", "*\\1*", A)
- If the word is a chunk of non-whitespace chars, use \S instead of [[:alpha:]] or \p{L}, e.g. sub("^(\\S+)", "*\\1*", A)
- If the word is any chunk of letters, digits, or underscores, you can use \w, i.e. sub("^(\\w+)", "*\\1*", A)
- If the word is any chunk of letters or digits but not underscores, you can use [[:alnum:]], i.e. sub("^([[:alnum:]]+)", "*\\1*", A)
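The differences between these variants are easiest to see on an input that mixes letters, digits, an underscore, and a hyphen. The following is a minimal sketch; the test strings and variable names are invented for illustration and do not come from the question.

```r
# Hypothetical input whose first "word" mixes several character types
x <- "a1_b-c rest"

patterns <- c(
  alpha    = "^([[:alpha:]]+)",  # letters only
  alnum    = "^([[:alnum:]]+)",  # letters and digits
  word     = "^(\\w+)",          # letters, digits, underscores
  nonspace = "^(\\S+)"           # any non-whitespace characters
)

# Apply each pattern to the same string; names are carried over from `patterns`
res <- vapply(patterns, function(p) sub(p, "*\\1*", x), character(1))
print(res)
# alpha:    "*a*1_b-c rest"
# alnum:    "*a1*_b-c rest"
# word:     "*a1_b*-c rest"
# nonspace: "*a1_b-c* rest"

# Effect of the ^ anchor when the string does not start with a letter
unanchored <- sub("([[:alpha:]]+)", "*\\1*", "44 Green")   # "44 *Green*"
anchored   <- sub("^([[:alpha:]]+)", "*\\1*", "44 Green")  # "44 Green" (no match)
```

Which variant is appropriate depends on what counts as a "word" in the data at hand.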