I'm just going to chalk this up to my ignorance, but sometimes the pivot_* functions drive me crazy.
I have a tibble:
# A tibble: 12 x 3
year term estimate
<dbl> <chr> <dbl>
1 2018 intercept -29.8
2 2018 daysuntilelection 8.27
3 2019 intercept -50.6
4 2019 daysuntilelection 7.40
5 2020 intercept -31.6
6 2020 daysuntilelection 6.55
7 2021 intercept -19.0
8 2021 daysuntilelection 4.60
9 2022 intercept -10.7
10 2022 daysuntilelection 6.41
11 2023 intercept 120
12 2023 daysuntilelection 0
that I would like to flip to:
# A tibble: 6 x 3
year intercept daysuntilelection
<dbl> <dbl> <dbl>
1 2018 -29.8 8.27
2 2019 -50.6 7.40
3 2020 -31.6 6.55
4 2021 -19.0 4.60
5 2022 -10.7 6.41
6 2023 120 0
Normally pivot_wider
should be able to do this as x %>% pivot_wider(!year, names_from = "term", values_from = "estimate")
but instead it returns a two-column tibble with lists and a bunch of warning.
# A tibble: 1 x 2
intercept daysuntilelection
<list> <list>
1 <dbl [6]> <dbl [6]>
Warning message:
Values from `estimate` are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = {summary_fun}` to summarise duplicates.
* Use the following dplyr code to identify duplicates.
{data} %>%
dplyr::group_by(term) %>%
dplyr::summarise(n = dplyr::n(), .groups = "drop") %>%
dplyr::filter(n > 1L)
Where do I go wrong here? Help!
CodePudding user response:
Next to the solutions offered in the comments, data.table
's dcast
is a very fast implementation to pivot your data. If the pivot_ functions drive you crazy, maybe this is a nice alternative for you:
x <- read.table(text = "
1 2018 intercept -29.8
2 2018 daysuntilelection 8.27
3 2019 intercept -50.6
4 2019 daysuntilelection 7.40
5 2020 intercept -31.6
6 2020 daysuntilelection 6.55
7 2021 intercept -19.0
8 2021 daysuntilelection 4.60
9 2022 intercept -10.7
10 2022 daysuntilelection 6.41
11 2023 intercept 120
12 2023 daysuntilelection 0")
names(x) <- c("id", "year", "term", "estimate")
library(data.table)
dcast(as.data.table(x), year ~ term)
#> Using 'estimate' as value column. Use 'value.var' to override
#> year daysuntilelection intercept
#> 1: 2018 8.27 -29.8
#> 2: 2019 7.40 -50.6
#> 3: 2020 6.55 -31.6
#> 4: 2021 4.60 -19.0
#> 5: 2022 6.41 -10.7
#> 6: 2023 0.00 120.0
CodePudding user response:
DATA
df <- read.table(text = "
1 2018 intercept -29.8
2 2018 daysuntilelection 8.27
3 2019 intercept -50.6
4 2019 daysuntilelection 7.40
5 2020 intercept -31.6
6 2020 daysuntilelection 6.55
7 2021 intercept -19.0
8 2021 daysuntilelection 4.60
9 2022 intercept -10.7
10 2022 daysuntilelection 6.41
11 2023 intercept 120
12 2023 daysuntilelection 0")
CODE
library(tidyverse)
df %>%
pivot_wider(names_from = V3,values_from = V4 , values_fill = 0) %>%
group_by(V2) %>%
summarise_all(sum,na.rm=T)
OUTPUT
V2 V1 intercept daysuntilelection
<int> <int> <dbl> <dbl>
1 2018 3 -29.8 8.27
2 2019 7 -50.6 7.4
3 2020 11 -31.6 6.55
4 2021 15 -19 4.6
5 2022 19 -10.7 6.41
6 2023 23 120 0