I am looking to merge all rows corresponding to the same year and location into one that represents the average.
Let's say this is my data frame
data<-data.frame(year=c(2000,2000,2000,2000,2001,2001,2001,2001,2002,2002,2002,2002),
location=c(1,1,2,2,1,1,2,2,1,1,2,2),
x=c(1,2,3,4,5,6,7,8,9,10,11,12))
I want to merge all rows that are representative of the same year (f.e. 2000) and the same location (f.e. 1) into one. The x-value of this new row should be the average of the x-values of the merged rows.
Unfortunately I have no idea how to do this and haven't been able to find a way in documentation or online.
CodePudding user response:
Looking for somethin like this...?
library(dplyr)
data %>%
group_by(year, location) %>%
summarise(x_mean = mean(x)) %>%
ungroup()
`summarise()` has grouped output by 'year'. You can override using the `.groups` argument.
# A tibble: 6 × 3
year location x_mean
<dbl> <dbl> <dbl>
1 2000 1 1.5
2 2000 2 3.5
3 2001 1 5.5
4 2001 2 7.5
5 2002 1 9.5
6 2002 2 11.5
CodePudding user response:
Here is a base R solution
aggregate(data, list(data$year, data$location), mean)[, -c(1:2)]
year location x
1 2000 1 1.5
2 2001 1 5.5
3 2002 1 9.5
4 2000 2 3.5
5 2001 2 7.5
6 2002 2 11.5
or dplyr
library(dplyr)
data %>%
group_by(year, location) %>%
summarize(x = mean(x), .groups = "drop")
# A tibble: 6 × 3
year location x
<dbl> <dbl> <dbl>
1 2000 1 1.5
2 2000 2 3.5
3 2001 1 5.5
4 2001 2 7.5
5 2002 1 9.5
6 2002 2 11.5