I'm trying to create box plots of revenue for each region, but I cannot figure out how to do it.
Here is my head(df2):
> head(df2)
store city region province size revenue units cost gross_profit promo_units energy_units regularBars_units
1 105 BROCKVILLE ONTARIO ON 496 984.70 470.46 590.73 393.97 210.23 72.13 38.63
2 117 BURLINGTON ONTARIO ON 875 2629.32 1131.38 1621.58 1007.74 401.46 192.77 75.04
3 122 BURLINGTON ONTARIO ON 691 2786.73 1229.46 1709.45 1077.27 450.04 240.48 93.73
4 123 BURLINGTON ONTARIO ON 763 2834.49 1257.63 1719.61 1114.88 476.83 194.21 99.44
5 182 DON MILLS ONTARIO ON 784 4118.36 1949.50 2485.83 1632.53 664.71 199.73 175.48
7 186 NORTH YORK ONTARIO ON 966 8195.26 3695.46 5069.99 3125.27 1143.33 419.19 271.58
gum_units bagpegCandy_units isotonics_units singleServePotato_units takeHomePotato_units kingBars_units flatWater_units
1 29.29 13.38 20.69 18.60 7.71 17.87 56.54
2 55.85 42.15 87.62 36.44 33.46 47.44 98.42
3 64.27 29.85 105.65 47.96 19.90 45.21 130.27
4 73.25 54.15 118.19 39.67 22.10 45.33 132.77
5 145.81 68.06 109.35 85.71 42.33 79.81 204.06
7 212.42 153.90 166.37 130.79 136.79 114.50 328.63
psd591Ml_units
1 39.71
2 38.73
3 47.31
4 39.87
5 50.29
7 112.38
Only the region and revenue columns matter here: I want one box plot of revenue per region.
Here is my str(df2):
> str(df2)
'data.frame': 755 obs. of 20 variables:
$ store : int 105 117 122 123 182 186 194 227 233 236 ...
$ city : chr "BROCKVILLE" "BURLINGTON" "BURLINGTON" "BURLINGTON" ...
$ region : chr "ONTARIO" "ONTARIO" "ONTARIO" "ONTARIO" ...
$ province : chr "ON" "ON" "ON" "ON" ...
$ size : int 496 875 691 763 784 966 710 973 967 1001 ...
$ revenue : num 985 2629 2787 2834 4118 ...
$ units : num 470 1131 1229 1258 1950 ...
$ cost : num 591 1622 1709 1720 2486 ...
$ gross_profit : num 394 1008 1077 1115 1633 ...
$ promo_units : num 210 401 450 477 665 ...
$ energy_units : num 72.1 192.8 240.5 194.2 199.7 ...
$ regularBars_units : num 38.6 75 93.7 99.4 175.5 ...
$ gum_units : num 29.3 55.9 64.3 73.2 145.8 ...
$ bagpegCandy_units : num 13.4 42.1 29.9 54.1 68.1 ...
$ isotonics_units : num 20.7 87.6 105.7 118.2 109.3 ...
$ singleServePotato_units: num 18.6 36.4 48 39.7 85.7 ...
$ takeHomePotato_units : num 7.71 33.46 19.9 22.1 42.33 ...
$ kingBars_units : num 17.9 47.4 45.2 45.3 79.8 ...
$ flatWater_units : num 56.5 98.4 130.3 132.8 204.1 ...
$ psd591Ml_units : num 39.7 38.7 47.3 39.9 50.3 ...
- attr(*, "na.action")= 'omit' Named int [1:16] 6 169 173 177 182 191 193 195 196 198 ...
..- attr(*, "names")= chr [1:16] "6" "169" "173" "177" ...
CodePudding user response:
Have you tried
boxplot(revenue ~ region, data = df2)
?
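To make the formula interface concrete, here is a minimal sketch in base R. The `toy` data frame, its values, and the second region name are invented for illustration and are not taken from the question's data. Each box summarises the spread of store-level revenue within a region, which is probably what is wanted here rather than a single total per region.

```r
# Hypothetical toy data standing in for df2; all values are made up.
toy <- data.frame(
  region  = c("ONTARIO", "ONTARIO", "ONTARIO", "QUEBEC", "QUEBEC", "QUEBEC"),
  revenue = c(1000, 2000, 3000, 1500, 2500, 6500)
)

# One box per region; the formula reads "revenue grouped by region".
boxplot(revenue ~ region, data = toy,
        xlab = "Region", ylab = "Revenue per store")

# plot = FALSE skips drawing and returns the summaries behind the boxes.
bp <- boxplot(revenue ~ region, data = toy, plot = FALSE)
bp$names  # group labels, in the order the boxes are drawn
bp$stats  # one column per region: whisker, hinge, median, hinge, whisker
```

A character column such as region works as the grouping variable without converting it to a factor first.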
CodePudding user response:
In ggplot2:
library(ggplot2)
df2 |> ggplot(aes(region, revenue)) + geom_boxplot()
More info and examples here: https://ggplot2.tidyverse.org/reference/geom_boxplot.html