I have data like this:
df<-structure(list(username = c("dan.amy", "dan.amy", "dan.amy",
"stupidski", "stupidski", "stupidski", "cbum", "cbum"), Department = c("Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute Pediatric Hematology Oncology",
"Cancer Institute Pediatric Hematology Oncology", "Cancer Institute Pediatric Hematology Oncology",
"Cancer Institute GynOnc", "Cancer Institute GynOnc"), `Access Control` = c("Yes",
"Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes"), `Organizational Unit` = c("Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute", "Cancer Institute",
"Cancer Institute", "Cancer Institute", "Cancer Institute"),
Management_Group.y = c("Cancer Institute - Pediatric Hematology/Oncology-LCI",
"Cancer Institute - Cancer Institute", "Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI",
"Cancer Institute - Cancer Institute", "Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI",
"Cancer Institute - Pediatric Hematology/Oncology-LCI", "Cancer Institute - Cancer Institute",
"Cancer Institute - Pediatric Hematology/Oncology-LCI")), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
As you can see, each username has several rows that are nearly identical: the same username, usually the same department, and (in this sample, though not necessarily in the real data) the same access control and organizational unit, but a different management group on each row. I'd like to add one more row for each person, filled in as follows:
I.e., for each username there would be a new row with "Generic Research Department" under Department, "General Research" under Organizational Unit, and "General Research - Generic Research Department" under Management_Group.y. Access Control would always be "Yes".
I've thought about ways to do this and was thinking maybe I'd create a new one-row data frame with those values and then "join" it? But I figured there has to be a simpler way.
CodePudding user response:
Here is a dplyr way:
library(dplyr)
df %>%
  # keep one row per username
  distinct(username) %>%
  # fill in the fixed values for the new row
  mutate(Department = "Generic Research Department",
         `Access Control` = "Yes",
         `Organizational Unit` = "General Research",
         Management_Group.y = paste(`Organizational Unit`, Department, sep = " - ")) %>%
  # append the new rows to the original data
  bind_rows(df, .) %>%
  # sort so that each user's rows sit together
  arrange(username)
   username  Department                                     `Access Control` `Organizational Unit` Management_Group.y
   <chr>     <chr>                                          <chr>            <chr>                 <chr>
 1 cbum      Cancer Institute GynOnc                        Yes              Cancer Institute      Cancer Institute - Cancer Institute
 2 cbum      Cancer Institute GynOnc                        Yes              Cancer Institute      Cancer Institute - Pediatric Hematology/Oncology-LCI
 3 cbum      Generic Research Department                    Yes              General Research      General Research - Generic Research Department
 4 dan.amy   Cancer Institute                               Yes              Cancer Institute      Cancer Institute - Pediatric Hematology/Oncology-LCI
 5 dan.amy   Cancer Institute                               Yes              Cancer Institute      Cancer Institute - Cancer Institute
 6 dan.amy   Cancer Institute                               Yes              Cancer Institute      Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI
 7 dan.amy   Generic Research Department                    Yes              General Research      General Research - Generic Research Department
 8 stupidski Cancer Institute Pediatric Hematology Oncology Yes              Cancer Institute      Cancer Institute - Cancer Institute
 9 stupidski Cancer Institute Pediatric Hematology Oncology Yes              Cancer Institute      Cancer Institute - Pediatric Hemophilia/Thrombosis Center - LCI
10 stupidski Cancer Institute Pediatric Hematology Oncology Yes              Cancer Institute      Cancer Institute - Pediatric Hematology/Oncology-LCI
11 stupidski Generic Research Department                    Yes              General Research      General Research - Generic Research Department
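For comparison, the one-row "join" idea from the question can also be done in base R. This is only a rough sketch, not a replacement for the dplyr approach: the small `df` below is a cut-down stand-in for the real data (same column names, fewer rows), and `generic`, `users`, `new_rows` and `out` are names made up for illustration. It relies on `merge()` returning the Cartesian product when `by = NULL`.

```r
# Cut-down stand-in for the question's data (assumption: same column names)
df <- data.frame(
  username = c("dan.amy", "dan.amy", "cbum"),
  Department = c("Cancer Institute", "Cancer Institute", "Cancer Institute GynOnc"),
  `Access Control` = "Yes",
  `Organizational Unit` = "Cancer Institute",
  Management_Group.y = c("Cancer Institute - Cancer Institute",
                         "Cancer Institute - Pediatric Hematology/Oncology-LCI",
                         "Cancer Institute - Cancer Institute"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One-row template holding the values that every added row shares
generic <- data.frame(
  Department = "Generic Research Department",
  `Access Control` = "Yes",
  `Organizational Unit` = "General Research",
  Management_Group.y = "General Research - Generic Research Department",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Cross join: pair each distinct username with the template row
users <- unique(df["username"])
new_rows <- merge(users, generic, by = NULL)

# Append the new rows (columns put in df's order), then sort by username.
# order() leaves ties in their original order, so each generic row
# ends up after that user's existing rows.
out <- rbind(df, new_rows[names(df)])
out <- out[order(out$username), ]
rownames(out) <- NULL
out
```

Whether this is simpler than the dplyr pipeline is a matter of taste; it mainly shows that the template-row idea works without extra packages.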