Hi i do have a Table for example:
House,Name1,Email1@xyz.com
Flat,Name2;Name3,Email2@xyz.com;Email3@xyz.com
Mobile Home,Name4,Email4@xyz.com
Camper-Van,Name5;Name6;Name7;Name8,Email5@xyz.com;Email6@xyz.com;Email7@xyz.com;Email8@xyz.com
and i do need:
House,Name1,Email1@xyz.com
Flat,Name2,Email2@xyz.com
Flat,Name3,Email3@xyz.com
Mobile Home,Name4,Email4@xyz.com
Camper-Van,Name5,Emil5@xyz.com
Camper-Van,Name6,Email6@xyz.com
Camper-Van,Name7,Email7@xyz.com
Camper-Van,Name8,Email8@xyz.com
the problem is, the ammount of names and emails for one kind of housing is unknown. i did generate three list:
Housing:
House
Flat
Campervan
Names:
Name1
Name2
Name3
Name4
Name5
Name6
Name7
Name8
Email:
Email1@xyz.com
Email2@xyz.com
...
Email8@xyz.com
But i am stuck how to repeat House and Flat and Campervan as much as there are names or emails (both always excact the same ammount) for each category in Column 1. This would make all List match each other in length. If i was abled to this i could just generate the Information i do need. any Help is appreciated.
ATTENTION: names and Email adress are not the same so for example Name1 is hans his email might be [email protected] by numbering names and emails i did try to show that emails and names are kind of sorted and can not be enlistetd randomly
CodePudding user response:
library(tidyverse)
example_text <-"House,Name1,Email@1
Flat,Name2;Name3,Email@2;Email@3
Mobile Home,Name4,Email@4
Camper-Van,Name5;Name6;Name7;Name8,Email@5;Email@6;Email@7;Email@8
"
example_text %>%
read_lines() %>%
map(~ {
# the first words until a delimiter
house <- .x %>% str_extract("^[^;,] ")
elements <- .x %>% str_remove(house) %>% str_split("[,;]") %>% simplify() %>% discard(~ .x == "")
# Everything with an @ symbol betwwen two demiliters (, or ;)
Emails <- elements %>% keep(~ .x %>% str_detect("@"))
# Everything which is not one of the above
Names <- elements %>% setdiff(Emails)
tibble(
House = house,
Emails = Emails,
Names = Names
)
}) %>%
reduce(bind_rows)
#> # A tibble: 8 x 3
#> House Emails Names
#> <chr> <chr> <chr>
#> 1 House Email@1 Name1
#> 2 Flat Email@2 Name2
#> 3 Flat Email@3 Name3
#> 4 Mobile Home Email@4 Name4
#> 5 Camper-Van Email@5 Name5
#> 6 Camper-Van Email@6 Name6
#> 7 Camper-Van Email@7 Name7
#> 8 Camper-Van Email@8 Name8
Created on 2021-11-24 by the reprex package (v2.0.1)
CodePudding user response:
With the data in a data.table (convert using setDT()
), using data.table joins and the data.table tstrsplit()
function:
library(data.table)
# Data for the demo (please provide this yourself in future questions)
dt1 <-
data.table(type = c("House", "Flat", "Mobile", "Camper-van"),
name = c("Name1", "Name2;Name3", "Name4", "Name5;Name6;Name7;Name8"),
mail = c("Email1", "Email2;Email3", "Email4", "Email5;Email6;Email7;Email8"))
# solution
dt1[, c("type" = list(type), tstrsplit(name, ";"))][, melt(.SD, id.vars="type")][!is.na(value), .(.I, type, "name" = value)][
dt1[, c("type" = list(type), tstrsplit(mail, ";"))][, melt(.SD, id.vars="type")][!is.na(value), .(.I, "mail" = value)], on="I"][, -c("I")]