Home > Software design >  How to split data to make it another column
How to split data to make it another column

Time:11-13

I am working on a voting data and trying to show most voted two parties. Problem is I want them in columns as, "first_party", "first_votes", "second_party", "second_votes". But I cannot transform my data with grouping or slicing.

d_Buyuksehir_2019 %\>%
group_by(city)%\>%
top_n(2)%\>%
group_by(city, party)
no  il     party   votes
1   Adana   CHP    656704   
1   Adana   MHP     523751  
2   Ankara  CHP     1662209 
2   Ankara  AK PARTİ  1538410   
3   Antalya CHP      714302 
3   Antalya AK PARTİ 652882 
4   Aydin  CHP   368791 
4   Aydin   AK PARTİ 299056 
5   Balikesi̇r  AK PARTİ 375539 
5   Balikesi̇rİ İYİ PARTİ 365333    

I expected it like


example <- data.frame(
    city = c("Adana","Ankara","Antalya","Balikesir"),
                      first_party = c("chp", "chp", "chp", "akp"),
                      first_votes = c(650000,160000,714000,375000),
                      second_party = c("mhp","akp","akp","iyi"),
    second_votes = c(523000,1530000,600000,365000))
example

city   first_party votes  second_party  second_party_votes
Adana     chp   650000    mhp          523000
Ankara    chp   160000    akp          1530000
Antalya   chp   714000    akp          600000
Balikesir akp   375000    iyi          365000

CodePudding user response:

The pivot_wider() function from the tidyr package will work.

The first step is to rank the first and second for each city and then pivot

df <- read.table(header =TRUE, text = "no  il     party   votes
1   Adana   CHP    656704   
1   Adana   MHP     523751  
2   Ankara  CHP     1662209 
2   Ankara  AKP  1538410   
3   Antalya CHP      714302 
3   Antalya AKP 652882 
4   Aydin  CHP   368791 
4   Aydin   AKP 299056 
5   Balikesi̇r AKP 375539 
5   Balikesi̇r IYI  365333    ")

library(tidyr)
#add the rank of 1 or 2 to each city
#assumes the list is sorted by city and votes
df$rank <- c("first", "second")

pivot_wider(df, id_cols = c("no", "il"), 
            names_from = c("rank"), values_from = c("party", "votes"), 
            names_glue = "{rank}_{.value}", names_sort = TRUE)

# A tibble: 5 × 6
     no il        first_party second_party first_votes second_votes
  <int> <chr>     <chr>       <chr>              <int>        <int>
1     1 Adana     CHP         MHP               656704       523751
2     2 Ankara    CHP         AKP              1662209      1538410
3     3 Antalya   CHP         AKP               714302       652882
4     4 Aydin     CHP         AKP               368791       299056
5     5 Balikesi̇r AKP         IYI               375539       365333
  • Related