I know how to remove everything before a space in a character string, but I'd like to keep the first initial of a name. How can I do this so that the result is "F LastName"? Thanks.
full_name <- "FirstName LastName"
sub(".*? ", "", full_name) # Removes everything up to and including the first space.
CodePudding user response:
Using sub() with capture groups, we can try:
full_name <- "FirstName LastName"
output <- sub("([A-Z])\\w* (.*)", "\\1 \\2", full_name)
output
[1] "F LastName"
CodePudding user response:
A non-regex way:
spl <- strsplit(full_name, " ")[[1]]
paste(substr(spl[1], 1, 1), spl[2])
#[1] "F LastName"
CodePudding user response:
Another way using sub.
sub("(.)[^ ]*", "\\1", full_name)
#[1] "F LastName"
(.) matches the first character and stores it in \\1, [^ ]* matches everything up to (but not including) the next space, and \\1 in the replacement inserts what (.) captured.
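To see how that pattern behaves on a whole vector, here is a small sketch. The sample names (including the three-word and one-word cases) are made up for illustration and are not from the question.

```r
# Hypothetical sample names, chosen to cover a few edge cases.
people <- c("FirstName LastName", "Ada Lovelace", "John Ronald Tolkien", "Cher")

# sub() is vectorised: keep the first character of each string and
# drop the remaining non-space characters of its first word.
initials <- sub("(.)[^ ]*", "\\1", people)
print(initials)
# Elements: "F LastName", "A Lovelace", "J Ronald Tolkien", "C"
```

Only the first word is shortened, so middle names are kept in full, and a one-word name is reduced to its initial.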
Or with a lookbehind.
sub("(?<=.)[^ ]*", "", full_name, perl=TRUE)
#[1] "F LastName"
(?<=.) looks behind to check that the match is preceded by some character (.) but does not consume it, so the first character stays in the string.
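Both sub() variants assume the string starts with the first name. A minimal sketch of what happens with a leading space, and of trimming it first with trimws(); the padded input is a made-up example and the variable names are placeholders:

```r
# Hypothetical input with one leading space.
padded <- " FirstName LastName"

# Capture-group variant: (.) grabs the leading space, [^ ]* matches nothing,
# so the string comes back unchanged.
capture_raw <- sub("(.)[^ ]*", "\\1", padded)

# Lookbehind variant: the leading space satisfies the lookbehind,
# so the whole first name is removed.
lookbehind_raw <- sub("(?<=.)[^ ]*", "", padded, perl = TRUE)

# Trimming surrounding whitespace first gives the intended result.
trimmed <- sub("(?<=.)[^ ]*", "", trimws(padded), perl = TRUE)

print(capture_raw)     # " FirstName LastName"
print(lookbehind_raw)  # "  LastName"
print(trimmed)         # "F LastName"
```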
Benchmark
full_name <- rep("FirstName LastName", 1e5)
bench::mark(
  GKi1 = sub("(.)[^ ]*", "\\1", full_name),
  GKi2 = sub("(?<=.)[^ ]*", "", full_name, perl=TRUE),
  "Tim Biegeleisen" = sub("([A-Z])\\w* (.*)", "\\1 \\2", full_name),
  Mael = {
    spl <- strsplit(full_name, " ") # Changed so that it works on vectors
    sapply(spl, \(spl) paste(substr(spl[1], 1, 1), spl[2]))
  }
)
#  expression           min   median itr/s…¹ mem_al…² gc/se…³ n_itr  n_gc total…⁴
#  <bch:expr>      <bch:tm> <bch:tm>   <dbl> <bch:by>   <dbl> <int> <dbl> <bch:t>
#1 GKi1              22.6ms   22.7ms   44.0   781.3KB    0       22     0   500ms
#2 GKi2              14.6ms   14.8ms   67.1   781.3KB    1.97    34     1   507ms
#3 Tim Biegeleisen   47.7ms   47.8ms   20.9   781.3KB    0       11     0   526ms
#4 Mael               430ms  446.8ms    2.24   4.06MB   32.5      2    29   894ms
In this benchmark, sub("(?<=.)[^ ]*", "", full_name, perl=TRUE) is the fastest of the four, with a median of 14.8ms for 1e5 strings.