I have a dataset that I am trying to tidy up using different approaches. As a first step, I want to merge every two rows in each column into a single row, as shown in the desired output. How can I do this in R the tidy way?
Sample Data
Date = c("SB",
"1/4/2021",
"HC/SB",
"1/5/2021",
"NC",
"1/6/2021",
"HC",
"1/13/2021")
Date_Approved = c(" ",
"1/4/2021",
" ",
"1/8/2021",
" ",
"1/12/2021",
" ",
"1/15/2021")
SR = c(" ",
"1A",
" ",
"1B",
" ",
"1C",
" ",
"1D")
Permit = c(" ",
"AAA",
" ",
"BBB",
" ",
"CCC",
" ",
"DDD")
Owner_Agent = c("Joe",
"Joey",
"Ross",
"Chandler",
"Monica",
"Rachel",
"Ed",
"Eddy")
Address = c("1111 W. Broward Boulevard",
"Plantation, 33333",
"2222 N 23 Avenue",
"Hollywood, FL 33322",
"3333 Taylor Street",
"Hollywood, 33311",
"44444 NW 19th St",
"Pembroke Pines, 33300")
Desired Output
Date            Date_Approved  SR  Permit  Owner_Agent     Address
SB 1/4/2021     1/4/2021       1A  AAA     Joe, Joey       1111 W. Broward Boulevard Plantation, 33333
HC/SB 1/5/2021  1/8/2021       1B  BBB     Ross, Chandler  2222 N 23 Avenue Hollywood, FL 33322
NC 1/6/2021     1/12/2021      1C  CCC     Monica, Rachel  3333 Taylor Street Hollywood, 33311
HC 1/13/2021    1/15/2021      1D  DDD     Ed, Eddy        44444 NW 19th St Pembroke Pines, 33300
I have looked at this and this, but using group_by messes up the df.
Code
library(tidyverse)
df = data.frame(Date,
                Date_Approved,
                SR,
                Permit,
                Owner_Agent,
                Address)
# Tidy up the df
df = df %>%
Answer:
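You can create a row identifier that gives each consecutive pair of rows the same id, group by that id, and use summarize(across()) to paste the two values in each column together, as below. Note that rep(1:(n()/2), each=2) assumes the data frame has an even number of rows.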
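df %>%
  # rows 1-2 get id 1, rows 3-4 get id 2, and so on
  mutate(id=rep(1:(n()/2), each=2)) %>%
  group_by(id) %>%
  # join each pair with a space, then trim the padding left by the blank cells
  summarize(across(Date:Address, ~trimws(paste0(.x, collapse=" "))))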
Output:
# A tibble: 4 × 7
id Date Date_Approved SR Permit Owner_Agent Address
<int> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 SB 1/4/2021 1/4/2021 1A AAA Joe Joey 1111 W. Broward Boulevard Plantation, 33333
2 2 HC/SB 1/5/2021 1/8/2021 1B BBB Ross Chandler 2222 N 23 Avenue Hollywood, FL 33322
3 3 NC 1/6/2021 1/12/2021 1C CCC Monica Rachel 3333 Taylor Street Hollywood, 33311
4 4 HC 1/13/2021 1/15/2021 1D DDD Ed Eddy 44444 NW 19th St Pembroke Pines, 33300
Input:
structure(list(Date = c("SB", "1/4/2021", "HC/SB", "1/5/2021",
"NC", "1/6/2021", "HC", "1/13/2021"), Date_Approved = c(" ",
"1/4/2021", " ", "1/8/2021", " ", "1/12/2021", " ", "1/15/2021"
), SR = c(" ", "1A", " ", "1B", " ", "1C", " ", "1D"), Permit = c(" ",
"AAA", " ", "BBB", " ", "CCC", " ", "DDD"), Owner_Agent = c("Joe",
"Joey", "Ross", "Chandler", "Monica", "Rachel", "Ed", "Eddy"),
Address = c("1111 W. Broward Boulevard", "Plantation, 33333",
"2222 N 23 Avenue", "Hollywood, FL 33322", "3333 Taylor Street",
"Hollywood, 33311", "44444 NW 19th St", "Pembroke Pines, 33300"
)), class = "data.frame", row.names = c(NA, -8L))
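The output above joins every pair with a space, whereas the desired output in the question separates the two Owner_Agent names with a comma. A minimal sketch of one way to handle that is below. It is a variation on the answer, not part of it, and it makes two assumptions: the data frame is a trimmed-down, two-pair version of the sample data, and the pair id is built with ceiling(row_number() / 2) instead of rep(), so that an odd number of rows leaves the last row in a group of its own rather than raising an error. The name merged is only illustrative.

```r
library(dplyr)

# Trimmed-down version of the sample data (illustrative only)
df <- data.frame(
  Date          = c("SB", "1/4/2021", "HC/SB", "1/5/2021"),
  Date_Approved = c(" ", "1/4/2021", " ", "1/8/2021"),
  Owner_Agent   = c("Joe", "Joey", "Ross", "Chandler")
)

merged <- df %>%
  # rows 1-2 get id 1, rows 3-4 get id 2, and so on
  mutate(id = ceiling(row_number() / 2)) %>%
  group_by(id) %>%
  summarize(
    # space-separated columns, trimmed to drop padding from the blank cells
    across(c(Date, Date_Approved), ~ trimws(paste(.x, collapse = " "))),
    # the two names in each pair are joined with a comma instead
    Owner_Agent = paste(Owner_Agent, collapse = ", ")
  )

print(merged)
```

On the full data set, the remaining space-separated columns (SR, Permit, Address) would be added to the across() selection in the same way.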