Format author's name with stringr

Time:11-20

I would like to format a string with authors' names.

Daenerys Targaryen to TARGARYEN, D.

George R. R. Martin to MARTIN, G. R. R.

Luís Inácio Lula da Silva to SILVA, L. I. L. da

The pattern is LAST, 1st. 2nd. 3rd. ....

It would be awesome if it's possible to format multiple names in one string, like

Daenerys Targaryen, Luís Inácio Lula da Silva to TARGARYEN, D.; SILVA, L. I. L. da

CodePudding user response:

A solution using gsub() with capture groups. The "\\U...\\E" escapes in the replacement (available only with perl = TRUE) upper-case the last name.

library(magrittr)

x <- c("Daenerys Targaryen, George R. R. Martin, Luís Inácio Lula da Silva")

x %>%
  strsplit(", ") %>%
  unlist() %>% 
  gsub("(.*?) (\\w+$)", "\\U\\2\\E, \\1", ., perl = TRUE) %>%
  gsub(" ([A-Z])\\w*\\.?", " \\1.", .) %>%
  paste(collapse = "; ")

# [1] "TARGARYEN, D.; MARTIN, G. R. R.; SILVA, L. I. L. da"
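The same two substitutions can be wrapped in a function so that several comma-separated strings are handled in one call. This is only a sketch: the name format_authors is hypothetical and not part of the original answer, and it assumes names are separated by exactly ", ". How \\w and [A-Z] treat accented letters can depend on the locale and regex engine, so the example sticks to ASCII names.

```r
# Hypothetical helper (not from the original answer): applies the two gsub()
# steps to each comma-separated string in a character vector.
format_authors <- function(x) {
  vapply(strsplit(x, ", ", fixed = TRUE), function(authors) {
    # Step 1: move the last word to the front and upper-case it.
    # \\U...\\E case conversion is only available with perl = TRUE.
    out <- gsub("(.*?) (\\w+$)", "\\U\\2\\E, \\1", authors, perl = TRUE)
    # Step 2: reduce each capitalized word that follows a space to its initial.
    # Lower-case particles such as "da" do not match and are left as they are.
    out <- gsub(" ([A-Z])\\w*\\.?", " \\1.", out)
    # Join the formatted names of one input string with "; ".
    paste(out, collapse = "; ")
  }, character(1))
}

format_authors("Daenerys Targaryen, George R. R. Martin")
# [1] "TARGARYEN, D.; MARTIN, G. R. R."
```

A single-word name contains no space, so the first pattern does not match and the name is returned unchanged.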

CodePudding user response:

Here is a base R function that processes a vector of name strings and returns the expected result.

fun <- function(x) {
  y <- strsplit(x, " ")
  sapply(y, \(s) {
    if(any(nchar(s) == 0L))
      s <- s[nchar(s) > 0L]
    if(all(nchar(s))) {
      n <- length(s)
      out <- character(n)
      out[1L] <- toupper(s[n])
      if(n > 1L)
        out[1L] <- paste0(out[1L], ",")
      first <- substr(s[seq.int(n)[-n]], 1L, 1L)
      i <- first == toupper(first)
      out[-1L][i] <- paste0(first[i], ".")
      out[-1L][!i] <- s[-n][!i]  # index the given names only, so lengths match
      paste(out, collapse = " ")
    } else ""
  })
}

x <- c("Daenerys Targaryen",
       "George R. R. Martin",
       "Luís Inácio Lula da Silva")

fun(x)
#> [1] "TARGARYEN, D."      "MARTIN, G. R. R."   "SILVA, L. I. L. da"

Created on 2022-11-19 with reprex v2.0.2


Edit

To process a string with several names and output one string, do it in two steps.

y <- c("Daenerys Targaryen, Luís Inácio Lula da Silva")
ll <- lapply(strsplit(y, ", "), fun)
do.call(\(x) paste(x, collapse = "; "), ll)
#> [1] "TARGARYEN, D.; SILVA, L. I. L. da"

