I've the following dataset
Pet Shop | Year | Item | Price |
---|---|---|---|
A | 2021 | dog | 300 |
A | 2021 | dog | 250 |
A | 2021 | fish | 20 |
A | 2020 | turtle | 50 |
A | 2020 | dog | 250 |
A | 2020 | cat | 280 |
A | 2019 | rabbit | 180 |
A | 2019 | cat | 165 |
A | 2019 | cat | 270 |
B | 2021 | dog | 350 |
B | 2021 | fish | 80 |
B | 2021 | fish | 70 |
B | 2020 | cat | 220 |
B | 2020 | turtle | 90 |
B | 2020 | turtle | 80 |
B | 2020 | fish | 55 |
B | 2019 | fish | 75 |
C | 2021 | dog | 280 |
C | 2020 | cat | 260 |
C | 2020 | cat | 270 |
C | 2019 | fish | 65 |
C | 2019 | cat | 270 |
The code for the data is as follows
Pet_Shop = c(rep("A",9), rep("B",8), rep("C",5))
Item = c("Dog","Dog","Fish","Turtle","Dog","Cat","Rabbit","Cat","Cat","Dog","Fish","Fish","Cat","Turtle","Turtle","Fish","Fish","Dog","Cat","Cat","Fish","Cat")
Price = c(300,250,20,50,250,280,180,165,270,350,80,70,220,90,80,55,75,280,260,270,65,270)
Data = data.frame(Pet_Shop, Item, Price)
Does anyone here know how I can use pivot_wider or spread (or any other method) to achieve the following table? It groups the Shop by year and takes the average of the similar item of the same shop for the year. I've issues incorporating the year.
Pet Shop | Year | dog | fish | turtle | cat | rabbit |
---|---|---|---|---|---|---|
A | 2021 | Average(300,250) = 275 | 20 | NA | NA | NA |
A | 2020 | 250 | NA | 50 | 280 | NA |
A | 2019 | NA | NA | NA | 217.5 | NA |
B | 2021 | 350 | 75 | NA | NA | NA |
B | 2020 | NA | 55 | 85 | 220 | NA |
B | 2019 | NA | 75 | NA | NA | NA |
C | 2021 | 280 | NA | NA | NA | NA |
C | 2020 | NA | NA | NA | 265 | NA |
C | 2019 | NA | 60 | NA | 270 | NA |
CodePudding user response:
In pivot_wider
you may pass a function (values_fn
) to be applied to each combination of Pet_Shop
and Year
.
result <- tidyr::pivot_wider(Data, names_from = Item,
values_from = Price, values_fn = mean)
result
# Pet_Shop Year dog fish turtle cat rabbit
# <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A 2021 275 20 NA NA NA
#2 A 2020 250 NA 50 280 NA
#3 A 2019 NA NA NA 218. 180
#4 B 2021 350 75 NA NA NA
#5 B 2020 NA 55 85 220 NA
#6 B 2019 NA 75 NA NA NA
#7 C 2021 280 NA NA NA NA
#8 C 2020 NA NA NA 265 NA
#9 C 2019 NA 65 NA 270 NA
The same can also be done with data.table
dcast
-
library(data.table)
dcast(setDT(Data), Pet_Shop Year ~ Item,
value.var = "Price", fun.aggregate = mean)