If this is my dataset (the NA positions follow from the expected output below):

Id  Date        time  Col1    Col2    Col3
1   1/15/2001   1     -1.24   -0.23   -8.517
1   4/12/2001   2     -4.35    0.45   -1.924
2   10/18/2018  1     -6.11   -1.90   -4.465
2   10/18/2018  1     -3.18    5.69   -5.46
3   02/12/2017  2     -2.03    NA      6.656
3   02/12/2017  2      NA      3.78    1.747
3   29/07/2019  3     -7.418  -5.843  -3.785
My goal is to keep only rows that are unique based on Id, Date, and time. For rows with the same Id, Date, and time, take the average of Col1, Col2, and Col3 (ignoring missing values). The final dataset should look like this:
Id  Date        time  Col1    Col2    Col3
1   1/15/2001   1     -1.24   -0.23   -8.51
1   4/12/2001   2     -4.35    0.45   -1.92
2   10/18/2018  1     -4.64    1.89   -4.96
3   02/12/2017  2     -2.03    3.78    4.20
3   29/07/2019  3     -7.418  -5.843  -3.785
Thanks in advance for any suggestions.
CodePudding user response:
First group by Id, Date, and time, then take the mean of each Col column with across(), passing na.rm = TRUE so the missing values are ignored:
library(dplyr)

df %>%
  group_by(Id, Date, time) %>%
  summarise(across(starts_with("Col"), ~ mean(.x, na.rm = TRUE)))
     Id Date       time  Col1  Col2  Col3
  <int> <chr>     <int> <dbl> <dbl> <dbl>
1     1 1/15/2001     1 -1.24 -0.23 -8.52
2     1 4/12/2001     2 -4.35  0.45 -1.92
3     2 10/18/2018    1 -4.65  1.90 -4.96
4     3 02/12/2017    2 -2.03  3.78  4.20
5     3 29/07/2019    3 -7.42 -5.84 -3.78
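Note that summarise() drops only the last grouping level, so the result above is still grouped by Id and Date. If you want an ungrouped result, a small variation (assuming the same df) is:

```r
library(dplyr)

df %>%
  group_by(Id, Date, time) %>%
  # .groups = "drop" returns a plain ungrouped tibble
  summarise(across(starts_with("Col"), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")
```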
CodePudding user response:
If you want to use base R:
> aggregate(. ~ Id + Date + time, dat, mean, na.rm = TRUE, na.action = NULL)
Id Date time Col1 Col2 Col3
1 1 1/15/2001 1 -1.240 -0.230 -8.5170
2 2 10/18/2018 1 -4.645 1.895 -4.9625
3 3 02/12/2017 2 -2.030 3.780 4.2015
4 1 4/12/2001 2 -4.350 0.450 -1.9240
5 3 29/07/2019 3 -7.418 -5.843 -3.7850
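For completeness, a reproducible sketch of the sample data (the NA positions are an assumption inferred from the expected output) that makes either answer runnable. Note that aggregate() sorts the result by the formula terms, so the row order differs from the input:

```r
# Sample data reconstructed from the question; NA positions are assumptions
dat <- data.frame(
  Id   = c(1, 1, 2, 2, 3, 3, 3),
  Date = c("1/15/2001", "4/12/2001", "10/18/2018", "10/18/2018",
           "02/12/2017", "02/12/2017", "29/07/2019"),
  time = c(1, 2, 1, 1, 2, 2, 3),
  Col1 = c(-1.24, -4.35, -6.11, -3.18, -2.03, NA, -7.418),
  Col2 = c(-0.23, 0.45, -1.90, 5.69, NA, 3.78, -5.843),
  Col3 = c(-8.517, -1.924, -4.465, -5.46, 6.656, 1.747, -3.785)
)

# na.action = NULL keeps rows with NAs; mean(na.rm = TRUE) then skips them
aggregate(. ~ Id + Date + time, dat, mean, na.rm = TRUE, na.action = NULL)
```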