pivot_longer repeat entries instead of counting them

Time:03-04

I want to use pivot_longer to create a data frame that is easy to visualize. My code looks something like this:

library(tidyverse)

relig_income %>%
  pivot_longer(!religion, names_to = "income", values_to = "count")

The output looks like this (the values in the count column are made up for illustration):

   religion income             count
   <chr>    <chr>              <dbl>
 1 Agnostic <$10k                 3
 2 Agnostic $10-20k               2
   ....

However, for my purposes, it would be much more useful if the output looked like this:

   religion income             
   <chr>    <chr>              
 1 Agnostic <$10k                
 2 Agnostic <$10k                
 3 Agnostic <$10k                
 4 Agnostic $10-20k               
 5 Agnostic $10-20k               
   ...               

In other words, only two columns should remain, and each religion/income row should be repeated as many times as the value in its count column. Is there an option in pivot_longer, or another R function, that conveniently transforms the data frame this way?

Any help is much appreciated!

CodePudding user response:

You can simply use uncount from the tidyr package, which repeats each row according to the value in count and then drops that column:

library(tidyverse)

relig_income %>%
  pivot_longer(!religion, names_to = "income", values_to = "count") %>%
  uncount(count)

# A tibble: 35,556 x 2
   religion income
   <chr>    <chr> 
 1 Agnostic <$10k 
 2 Agnostic <$10k 
 3 Agnostic <$10k 
 4 Agnostic <$10k 
 5 Agnostic <$10k 
 6 Agnostic <$10k 
 7 Agnostic <$10k 
 8 Agnostic <$10k 
 9 Agnostic <$10k 
10 Agnostic <$10k 
# ... with 35,546 more rows
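
To make the behavior easier to check by eye, here is a minimal sketch on a small made-up table. The `toy` data and its values are assumptions for illustration, not part of relig_income. It also shows two details of uncount that may matter here: rows with a count of 0 are dropped, and the optional `.id` argument adds a running index for each repeated row.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data standing in for the pivoted relig_income table
toy <- tibble(
  religion = c("Agnostic", "Agnostic", "Atheist"),
  income   = c("<$10k", "$10-20k", "<$10k"),
  count    = c(3, 2, 0)
)

# Repeat each row `count` times; the count column is removed by default
expanded <- uncount(toy, count)

# Same expansion, plus a column numbering the copies within each original row
expanded_id <- uncount(toy, count, .id = "copy")
```

With this toy input, `expanded` has five rows and two columns; the Atheist row disappears because its count is 0.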

CodePudding user response:

You can group_by religion and income, so that each group holds exactly one row (one religion/income combination). Then use slice with rep to repeat that row as many times as the value in count, and finally remove the count column.

library(tidyverse)

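relig_income %>%
  pivot_longer(!religion, names_to = "income", values_to = "count") %>%
  group_by(religion, income) %>%
  # each group contains a single row, so this repeats it `count` times
  slice(rep(1:n(), each = count)) %>%
  # drop the grouping so the result is a plain tibble
  ungroup() %>%
  select(-count)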
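
The same row-repetition idea can also be sketched without grouping, by indexing rows in base R. This is an alternative illustration, not the answerer's method; the `toy` data frame is a made-up stand-in for the pivoted table.

```r
# Hypothetical toy data standing in for the pivoted relig_income table
toy <- data.frame(
  religion = c("Agnostic", "Agnostic", "Atheist"),
  income   = c("<$10k", "$10-20k", "<$10k"),
  count    = c(3, 2, 0)
)

# Repeat each row index as many times as its count (a count of 0 drops the row)
idx <- rep(seq_len(nrow(toy)), times = toy$count)

# Pick the repeated rows and keep only the two columns of interest
expanded_base <- toy[idx, c("religion", "income")]
rownames(expanded_base) <- NULL
```

Here `idx` is built once for the whole table, so no group_by is needed.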