Home > Back-end >  Reshaping table into wide format with unique rows
Reshaping table into wide format with unique rows

Time:03-10

I am using R to analyze zipcode and council district data in NYC to determine which zip codes are represented by a council district. I have a table with zips and districts that is currently in long format with two columns.

ZIPCODE CouncilDistrict
11436 28
11213 36
11213 41
11213 35
11212 41
11212 37
11212 42

But I want the table to be in a wide format. Essentially, since some zip codes can exist in multiple districts, I want each row to represent a unique zipcode and have columns telling me which council district(s) that zipcode is in.

ZIPCODE 1 2 3
11436 28 NA NA
11213 36 41 35
11212 41 37 42

I have tried all of the following functions with no luck:

testing7 <- dcast(testing, ZIPCODE ~ CounDist)
testing33 <- reshape(testing, idvar = "ZIPCODE", timevar = "ZIPCODE", direction = "wide")
testing33 <- testing %>%
  rownames_to_column() %>% 
  pivot_longer(!rowname, names_to = "ZIPCODE", values_to = "CounDist")%>% 
  pivot_wider(names_from = "rowname", values_from = "CounDist")

The closest I have gotten is a 1 x 558 table with each column representing a zip code and the each cell being a list of each district that zip code is in.

How do I get my table into this wide format?

CodePudding user response:

One option to achieve your desired result would be to group by ZIPCODE and add an id column before applying pivot_wider:

library(tidyr)
library(dplyr)

testing %>%
  group_by(ZIPCODE) %>% 
  mutate(id = row_number()) %>% 
  pivot_wider(names_from = "id", values_from = "CouncilDistrict")
#> # A tibble: 3 × 4
#> # Groups:   ZIPCODE [3]
#>   ZIPCODE   `1`   `2`   `3`
#>     <int> <int> <int> <int>
#> 1   11436    28    NA    NA
#> 2   11213    36    41    35
#> 3   11212    41    37    42

DATA

testing <- data.frame(
          ZIPCODE = c(11436L, 11213L, 11213L, 11213L, 11212L, 11212L, 11212L),
  CouncilDistrict = c(28L, 36L, 41L, 35L, 41L, 37L, 42L)
)
  • Related