I'm having trouble applying the solutions suggested in answers to many similar questions. See the sample df below.
structure(list(FirstName = c("Albus Percival Wulfric Brian Dumbledore",
"Harry James Potter", "Tom Marvollo Riddle", "Lord Voldemort"
), Email = c("[email protected]", "[email protected]", "[email protected]",
"[email protected]"), ClassSection = c("HeadMaster", "Student", "Dark Lord in training",
"Dark Lord")), row.names = c(NA, -4L), spec = structure(list(
cols = list(FirstName = structure(list(), class = c("collector_character",
"collector")), Email = structure(list(), class = c("collector_character",
"collector")), ClassSection = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = ","), class = "col_spec"), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
I want to create a new column in which the first and last names are united. For this, I first tried separate(FirstName, sep = " ", into = c("First", "Middle", "Last")). However, names with more than three words lose their extra elements, so I can't combine the pieces reliably.
Next, I tried df %>% mutate(First = str_split(FirstName, pattern = " ")). This gives a list column. I want a way to extract the first and the last element from each entry of this column.
# A tibble: 4 x 4
FirstName Email ClassSection First
<chr> <chr> <chr> <list>
1 Albus Percival Wulfric Brian Dumbledore [email protected] HeadMaster <chr [5]>
2 Harry James Potter [email protected] Student <chr [3]>
3 Tom Marvollo Riddle [email protected] Dark Lord in training <chr [3]>
4 Lord Voldemort [email protected] Dark Lord <chr [2]>
I looked at various answers where tail(First, n = 1) and dplyr's last(First) were suggested. However, these don't give me the right answer. I also tried unnest_wider(First), but it has the same problem as separate(FirstName): I get multiple columns, and it doesn't work for names that have only two words or more than three.
I'd like to stay within the dplyr (tidyverse) workflow. Is there a way to combine the first and last elements of each vector into a new column?
CodePudding user response:
Do you mean something like this?
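library(dplyr)
library(stringr)

df %>%
  mutate(
    # Split each name on spaces, then paste the first and last pieces together;
    # unique() avoids repeating the word when a name has only one piece
    FirstLast = sapply(str_split(FirstName, pattern = " "),
                       \(z) paste(z[unique(c(1, length(z)))], collapse = ""))
  )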
# # A tibble: 4 × 4
# FirstName Email ClassSection FirstLast
# <chr> <chr> <chr> <chr>
# 1 Albus Percival Wulfric Brian Dumbledore [email protected] HeadMaster AlbusDumbledore
# 2 Harry James Potter [email protected] Student HarryPotter
# 3 Tom Marvollo Riddle [email protected] Dark Lord in training TomRiddle
# 4 Lord Voldemort [email protected] Dark Lord LordVoldemort
or much more simply
df %>%
  mutate(FirstLast = sub(" .* ", "", FirstName))
# # A tibble: 4 × 4
# FirstName Email ClassSection FirstLast
# <chr> <chr> <chr> <chr>
# 1 Albus Percival Wulfric Brian Dumbledore [email protected] HeadMaster AlbusDumbledore
# 2 Harry James Potter [email protected] Student HarryPotter
# 3 Tom Marvollo Riddle [email protected] Dark Lord in training TomRiddle
# 4 Lord Voldemort [email protected] Dark Lord Lord Voldemort
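Row 4 in the output above keeps its space because the pattern " .* " needs two spaces to match, so two-word names are returned unchanged. A minimal base R sketch that treats two-word names the same way as longer ones, assuming words are separated by whitespace; the helper name first_last is hypothetical and only for illustration:

```r
# Hypothetical helper: keep only the first and last whitespace-separated words.
first_last <- function(x) {
  # \\1 captures the first word, \\2 the last word; everything between
  # them (including the separating whitespace) is dropped.
  # Strings without any whitespace do not match and come back unchanged.
  sub("^(\\S+).*\\s(\\S+)$", "\\1\\2", x)
}

first_last(c("Lord Voldemort", "Harry James Potter"))
```

Inside the pipeline this could be used as mutate(FirstLast = first_last(FirstName)). Changing the replacement to "\\1 \\2" would keep a space between the two names.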
CodePudding user response:
We may use extract
library(tidyr)
extract(df, FirstName, into = c("First", "Last"),
        "^(\\S+)\\s*.*\\s+(\\S+)$", remove = FALSE)
-output
# A tibble: 4 × 5
FirstName First Last Email ClassSection
<chr> <chr> <chr> <chr> <chr>
1 Albus Percival Wulfric Brian Dumbledore Albus Dumbledore [email protected] HeadMaster
2 Harry James Potter Harry Potter [email protected] Student
3 Tom Marvollo Riddle Tom Riddle [email protected] Dark Lord in training
4 Lord Voldemort Lord Voldemort [email protected] Dark Lord
Or to extract from the list
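library(purrr)
library(dplyr)
library(stringr)  # str_split()
library(tidyr)    # unnest_wider()
df %>%
  mutate(First = str_split(FirstName, pattern = " "), .after = FirstName) %>%
  # Replace each character vector with a one-row tibble of its first and last element
  mutate(First = map(First, ~ tibble(First = first(.x),
                                     Last = last(.x)))) %>%
  unnest_wider(First)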
-output
# A tibble: 4 × 5
FirstName First Last Email ClassSection
<chr> <chr> <chr> <chr> <chr>
1 Albus Percival Wulfric Brian Dumbledore Albus Dumbledore [email protected] HeadMaster
2 Harry James Potter Harry Potter [email protected] Student
3 Tom Marvollo Riddle Tom Riddle [email protected] Dark Lord in training
4 Lord Voldemort Lord Voldemort [email protected] Dark Lord
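If a single combined column is the goal, another option that stays in the tidyverse is stringr's word(), which picks words by position and accepts negative positions counted from the end. This is only a sketch: the small df below is a stand-in for the question's data, and the column name FirstLast and the space separator are assumptions.

```r
library(dplyr)
library(stringr)

# Stand-in for the question's data; only the name column is needed here.
df <- tibble(FirstName = c("Albus Percival Wulfric Brian Dumbledore",
                           "Harry James Potter",
                           "Tom Marvollo Riddle",
                           "Lord Voldemort"))

result <- df %>%
  # word(x, 1) is the first space-separated word, word(x, -1) the last one
  mutate(FirstLast = str_c(word(FirstName, 1), word(FirstName, -1), sep = " "))

result
```

Using sep = "" instead would give the joined form without a space, such as AlbusDumbledore.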