Home > database >  Unnest or move rows to columns?
Unnest or move rows to columns?

Time:04-25

This is just one of those things that I can't figure out how to word in order to search for a solution to my problem. I have some election data for Democratic and Republican candidates. The data is contained in 2 rows per county with one of those rows corresponding to one of the two candidates.

I need a data frame with one row per county and I need to create a new column out of the second row for each county. I've tried to un-nest the dataframe, but that doesn't work. I've seen something about using un-nest and mutate together, but I can't figure that out. Transposing the dataframe didn't help either. I've also tried to ungroup without success.

# Load Michigan 2020 by-county election data
# Data: https://mielections.us/election/results/DATA/2020GEN_MI_CENR_BY_COUNTY.xls
election <- read.csv("2020GEN_MI_CENR_BY_COUNTY.txt", sep = "\t", header = TRUE)
# Remove unnecessary columns
election <- within(election, rm('ElectionDate','OfficeCode.Text.','DistrictCode.Text.','StatusCode','CountyCode','OfficeDescription','PartyOrder','PartyName','CandidateID','CandidateFirstName','CandidateMiddleName','CandidateFormerName','WriteIn.W..Uncommitted.Z.','Recount...','Nominated.N..Elected.E.'))

# Remove offices other than POTUS
election <- election[-c(167:2186),]

# Keep only DEM and REP parties
election <- election %>% 
     filter(PartyDescription == "Democratic" |
            PartyDescription == "Republican")

[Here's what it looks like now...

I'd like it to look like this:

enter image description here

CodePudding user response:

dplyr

library(dplyr)
library(tidyr) # pivot_wider
election %>%
  select(CountyName, PartyDescription, CandidateLastName, CandidateVotes) %>%
  slice(-(167:2186)) %>%
  filter(PartyDescription %in% c("Democratic", "Republican")) %>%
  pivot_wider(CountyName, names_from = CandidateLastName, values_from = CandidateVotes)
# # A tibble: 83 x 25
#    CountyName Biden Trump Richer LaFave Cambensy Wagner Metsa Markkanen Lipton Strayhorn Carlone Frederick Bernstein Diggs Hubbard Meyers Mosallam Vassar `O'Keefe` Schuitmaker Dewaelsche Stancato Gates  Land
#    <chr>      <int> <int>  <int>  <int>    <int>  <int> <int>     <int>  <int>     <int>   <int>     <int>     <int> <int>   <int>  <int>    <int>  <int>     <int>       <int>      <int>    <int> <int> <int>
#  1 ALCONA      2142  4848     NA     NA       NA     NA    NA        NA   1812      1748    4186      4209      1818  1738    4332   4114     1696   1770      4273        4187       1682     1733  4163  4223
#  2 ALGER       2053  3014     NA     NA     2321   2634    NA        NA   1857      1773    2438      2470      1795  1767    2558   2414     1757   1769      2538        2444       1755     1757  2458  2481
#  3 ALLEGAN    24449 41392     NA     NA       NA     NA    NA        NA  20831     19627   37681     38036     20043 19640   38805  37375    18820  19486     37877       39052      19081    19039 37322 38883
#  4 ALPENA      6000 10686     NA     NA       NA     NA    NA        NA   5146      4882    8845      8995      5151  4873    9369   8744     4865   4935      9212        8948       4816     4923  9069  9154
#  5 ANTRIM      5960  9748     NA     NA       NA     NA    NA        NA   5042      4798    8828      8886      4901  4797    9108   8737     4686   4810      9079        8867       4679     4781  8868  9080
#  6 ARENAC      2774  5928     NA     NA       NA     NA    NA        NA   2374      2320    4626      4768      2396  2224    4833   4584     2215   2243      5025        4638       2185     2276  4713  4829
#  7 BARAGA      1478  2512     NA     NA       NA     NA  1413      2517   1267      1212    2057      2078      1269  1233    2122   2003     1219   1243      2090        2056       1226     1228  2072  2074
#  8 BARRY      11797 23471     NA     NA       NA     NA    NA        NA   9794      9280   20254     20570      9466  9215   20885  20265     9060   9324     21016       20901       8967     9121 20346 21064
#  9 BAY        26151 33125     NA     NA       NA     NA    NA        NA  23209     22385   26021     26418     23497 22050   27283  25593    21757  22225     27422       25795      21808    21999 26167 26741
# 10 BENZIE      5480  6601     NA     NA       NA     NA    NA        NA   4704      4482    5741      5822      4584  4479    6017   5681     4379   4449      5979        5756       4392     4353  5704  5870
# # ... with 73 more rows

CodePudding user response:

@r2evans had the right idea, but slicing the data before filtering lost a lot of the voting data. I hadn't realized that before.

# Load Michigan 2020 by-county election data
# Data: https://mielections.us/election/results/DATA/2020GEN_MI_CENR_BY_COUNTY.xls
election <- read.csv("2020GEN_MI_CENR_BY_COUNTY.txt", sep = "\t", header = TRUE)

# That's an ugly dataset...let's make it better
election <- election[-c(1:5,7:9,11,13:15,17:19)]
election <- election %>%
  filter(CandidateLastName %in% c("Biden", "Trump")) %>% 
  select(CountyName, PartyDescription, CandidateLastName, CandidateVotes) %>%
  pivot_wider(CountyName, names_from = CandidateLastName, values_from = CandidateVotes)
  • Related