Home > other >  Mutate to create two number columns from one number column based on a character value
Mutate to create two number columns from one number column based on a character value

Time:12-31

My data "x" looks like this:

date   type  cost
20-01  supp  5
20-02  supp  10
20-03  supp  5
20-01  svcs  2
20-02  svcs  4
20-03  svcs  8

I want to create a cost for each type in order to plot a multiple time series, which I can do by creating two separate time series, but what I'd like to do is create:

bydate <- aggregate(cbind(supp, svcs)~date, data=y, FUN=sum)

With my data "y" looking like this:

date   type  supp  svcs
20-01  supp  5     0
20-02  supp  10    0
20-03  supp  5     0
20-01  svcs  0     2
20-02  svcs  0     4
20-03  svcs  0     8

In this way I hope to create

df <- bydate %>% 
  select(date, supp, svcs) %>% 
  gather(key = "variable", value = "value", -date)

Is the mutate function the way to do this?

CodePudding user response:

We have to create index variables before pivoting, and then pivot_wider with the valueS_fill argument set to 0.

library(tidyr)
library(dplyr)

df %>%
        mutate(index = row_number(),
               type2 = type) %>%
        pivot_wider(names_from = type2, values_from = cost, values_fill = 0) %>%
        select(-index)
  

# A tibble: 6 × 4
  date  type   supp  svcs
  <chr> <chr> <dbl> <dbl>
1 20-01 supp      5     0
2 20-02 supp     10     0
3 20-03 supp      5     0
4 20-01 svcs      0     2
5 20-02 svcs      0     4
6 20-03 svcs      0     8

CodePudding user response:

Here is a version using bind_cols

library(dplyr)
library(tidyr)

x %>% 
  dplyr::mutate(id = row_number()) %>% 
  pivot_wider(
    names_from = type,
    values_from = cost, 
    values_fill = 0
  ) %>% 
  bind_cols(type = x$type) %>% 
  select(date, type, everything(), -id)
  date  type   supp  svcs
  <chr> <chr> <int> <int>
1 20-01 supp      5     0
2 20-02 supp     10     0
3 20-03 supp      5     0
4 20-01 svcs      0     2
5 20-02 svcs      0     4
6 20-03 svcs      0     8

CodePudding user response:

For this kind of problem we do not necessarily need data rectangling. An alternative is to use purrr::map_dfc inside dplyr::mutate together with purrr::set_names:

library(dplyr)
library(purrr)

df %>% 
  mutate(map_dfc(set_names(unique(type)), 
              ~ ifelse(.x == type, cost, 0))
  )
#>    date type cost supp scvs
#> 1 20-01 supp    5    5    0
#> 2 20-02 supp   10   10    0
#> 3 20-03 supp    5    5    0
#> 4 20-01 scvs    2    0    2
#> 5 20-02 scvs    4    0    4
#> 6 20-03 scvs    8    0    8

To simplify this and similar problems I have a package on Github. In this case we could use over together with dist_values:

library(dplyr)
library(dplyover) # https://github.com/TimTeaFan/dplyover

df %>% 
  mutate(over(dist_values(type), 
              ~ ifelse(.x == type, cost, 0))
  )

#>    date type cost scvs supp
#> 1 20-01 supp    5    0    5
#> 2 20-02 supp   10    0   10
#> 3 20-03 supp    5    0    5
#> 4 20-01 scvs    2    2    0
#> 5 20-02 scvs    4    4    0
#> 6 20-03 scvs    8    8    0

Created on 2021-12-30 by the reprex package (v0.3.0)

data:

df <- data.frame(
  date = rep(c("20-01", "20-02", "20-03"), 2),
  type = rep(c("supp", "scvs"), each = 3),
  cost = c(5, 10, 5, 2, 4, 8)
  )

Created on 2021-12-30 by the reprex package (v0.3.0)

  • Related