Home > other >  How best to do this pivot operation in R
How best to do this pivot operation in R

Time:01-26

Below is the sample data and the desired outcome. This is a much simplified version of the actual data set. In the actual data set, there are 20 years and 4 quarters apiece. Looking to have each unique company entry listed once and the employment data series running from beginning to end from left to right. In the event that there is no data for Vision Inc in 2019 quarter 3, then I would want it to return a O and not an NA.

 library(tidyverse)
 library(dplyr)

 legalname <- c("Vision Inc.","Expedia","Strong Enterprise","Vision Inc.","Expedia","Strong Enterprise")
 year <- c(2019,2019,2019,2019,2019,2019)
 quarter <- c(1,1,1,2,2,2)
 cnty <- c(031,029,027,031,029,027)
 naics <- c(345110,356110,362110,345110,356110,345110)
 mnth1emp <- c (11,13,15,15,17,20)
 mnth2emp <- c(12,14,15,16,18,22)
 mnth3emp <-c(13,15,15,17,21,29)

 employers <- data.frame(legalname,year,quarter,naics,mnth1emp,mnth2emp,mnth3emp)

Desired Outcome

 legalname       cnty      naics     2019m1      2019m2   2019m3   2019m4   2019m5   2019m6
 Vision Inc        031      345110      11            12      13       15       16        17
 Expedia           029      356110      13            14      15       17       18        21  

CodePudding user response:

I first pivot to a long form, then arrange by legalname and year(just to double-check that they are in numerical order). Then, I create a unique month series for each year for each company. Then, I drop quarter and pivot back to wide form and put name and year together, and finally replace NA with 0. Here, I'm assuming that you want each unique naics on it's own row.

library(tidyverse)

employers %>% 
  pivot_longer(starts_with("mnth")) %>% 
  arrange(legalname, year) %>% 
  group_by(legalname, year, naics) %>% 
  mutate(name = paste0("m", 1:n())) %>% 
  select(-quarter) %>% 
  pivot_wider(names_from = c("year", "name"), names_sep = "", values_from = "value") %>% 
  mutate(across(everything(), ~replace_na(.,0)))

Output

  legalname          naics `2019m1` `2019m2` `2019m3` `2019m4` `2019m5` `2019m6`
  <chr>              <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
1 Expedia           356110       13       14       15       17       18       21
2 Strong Enterprise 362110       15       15       15        0        0        0
3 Strong Enterprise 345110        0        0        0       20       22       29
4 Vision Inc.       345110       11       12       13       15       16       17

CodePudding user response:

Does this work for you?

First pivot longer to get the months and values in a quarter; and then pivot wider to get the wide format you want.

employers %>%
  filter(legalname != "Strong Enterprise") %>%
  pivot_longer(mnth1emp:mnth3emp, names_to = "mnth", values_to = "value") %>% 
  mutate(month_in_quarter = as.numeric(str_extract(mnth, "\\d")),
         month =str_c("m", month_in_quarter    3*(quarter - 1)))  %>% 
  select(-c(month_in_quarter, mnth)) %>%
   pivot_wider(c(legalname,cnty, naics), names_from = c(year, month),
              values_from = value,
values_fill = 0)

values_fill will fill NAs with 0s.

  •  Tags:  
  • Related