I have a data frame like the one below:
Date | cat | cam | reg | per |
---|---|---|---|---|
22-01-05 | A | 60 | 120 | 50 |
22-01-05 | B | 20 | 100 | 20 |
22-01-08 | A | 30 | 150 | 20 |
22-01-08 | B | 30 | 100 | 30 |
But I want something like this:
Date | cam | reg | per |
---|---|---|---|
22-01-05 | 80 | 220 | 14.5 |
22-01-08 | 60 | 250 | 24 |
How can I get this using R?
CodePudding user response:
I am not sure why your expected per values are like that, but maybe you want the following:
# Sample data from the question
df <- data.frame(Date = c("22-01-05", "22-01-05", "22-01-08", "22-01-08"),
                 cat = c("A", "B", "A", "B"),
                 cam = c(60,20,30,30),
                 reg = c(120,100,150,100),
                 per = c(50,20,20,30))
library(dplyr)
df %>%
  group_by(Date) %>%
  # summarise() evaluates its arguments in order,
  # so per is computed from the summed cam and reg
  summarise(cam = sum(cam),
            reg = sum(reg),
            per = cam/reg)
#> # A tibble: 2 × 4
#> Date cam reg per
#> <chr> <dbl> <dbl> <dbl>
#> 1 22-01-05 80 220 0.364
#> 2 22-01-08 60 250 0.24
Created on 2022-07-07 by the reprex package (v2.0.1)
CodePudding user response:
You can try this, but I don't know how to get the expected per value of 14.5 (the 24 is reproduced):
library(dplyr)
aggregate(cbind(cam, reg) ~ Date,df,sum) %>% mutate(per = 100*(cam/reg))
A data.frame: 2 × 4
Date cam reg per
<chr> <dbl> <dbl> <dbl>
22-01-05 80 220 36.36364
22-01-08 60 250 24.00000
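If you would rather not load dplyr just for the pipe and mutate, the same result can be sketched in base R alone. This is a minimal sketch, assuming the sample df from the question; the name out is only illustrative, and transform() stands in for mutate().

```r
# Sample data from the question
df <- data.frame(Date = c("22-01-05", "22-01-05", "22-01-08", "22-01-08"),
                 cat = c("A", "B", "A", "B"),
                 cam = c(60, 20, 30, 30),
                 reg = c(120, 100, 150, 100),
                 per = c(50, 20, 20, 30))

# Sum cam and reg within each Date
sums <- aggregate(cbind(cam, reg) ~ Date, data = df, FUN = sum)

# Add the percentage column computed from the summed values
out <- transform(sums, per = 100 * cam / reg)
print(out)
```

This gives the same four columns (Date, cam, reg, per) as the dplyr version, with per computed from the summed cam and reg.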
CodePudding user response:
Using only the dplyr package (which is part of the tidyverse), just do:
df %>%
  group_by(Date) %>%
  summarise(cam = sum(cam),
            reg = sum(reg),
            per = 100*(cam/reg))
Date cam reg per
<chr> <int> <int> <dbl>
1 22-01-05 80 220 36.4
2 22-01-08 60 250 24
The nice thing about this syntax is that you can modify it and add further summary columns, using sum but also mean, median, etc., in a very clean and structured way.
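As a rough illustration of adding more summaries, here is a sketch that assumes the sample df from the question. The column names cam_mean and reg_median are made up for this example. Note the order: summarise() evaluates its arguments in sequence, so the per-group mean and median are computed before cam and reg are overwritten by their sums.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(Date = c("22-01-05", "22-01-05", "22-01-08", "22-01-08"),
                 cat = c("A", "B", "A", "B"),
                 cam = c(60, 20, 30, 30),
                 reg = c(120, 100, 150, 100),
                 per = c(50, 20, 20, 30))

res <- df %>%
  group_by(Date) %>%
  summarise(cam_mean   = mean(cam),      # hypothetical extra column
            reg_median = median(reg),    # hypothetical extra column
            cam        = sum(cam),       # from here on, cam is the group sum
            reg        = sum(reg),       # likewise for reg
            per        = 100 * cam / reg)
print(res)
```

If cam = sum(cam) were placed first, a later mean(cam) would see the already-summed value rather than the original rows, so it is safer to give the extra columns new names and list them before the sums.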