Averaging multiple named rows into an averaged value row-CodePudding

I am looking to find a way to average common-names in a certain column, and based off of that similarity, to average the rest of the values into a common row. I looked into aggregate, but I think I am missing how to do this in a large dataset, since it isnt working to how I want it to work.

Say i have DF1

ID	Location	Value2	Value3	Value4
First	2	4	3	3
Second	1	2	5	2
Second	2	4	5	6
Third	6	1	8	8
Third	4	7	5	4
Fourth	1	4	5	1

I would want to average any duplicate values

What is the best way to do that when there is a large number of columns, and this is only a partial number of them? Any help would be appreciated. Thanks!

ID <- c("First", "Second", "Second", "Third", "Third", "Fourth")
Location <- c(2,1,2,6,4,1)
Value2 <- c(4,2,4,1,7,4)
Value3 <- c(3,5,5,8,5,5)
Value4 <- c(3,2,6,8,4,1)
DF <- data.frame(ID, Location, Value2, Value3, Value4)

CodePudding user response：

You can try this using aggregate

aggregate(df[-1], list(df$ID), mean)
  Group.1 Location Value2 Value3 Value4
1   First      2.0      4    3.0      3
2  Fourth      1.0      4    5.0      1
3  Second      1.5      3    5.0      4
4   Third      5.0      4    6.5      6

With dplyr

library(dplyr)

df %>%
  group_by(ID) %>%
  summarise(across(starts_with(c("Lo","Val")),mean))
# A tibble: 4 × 5
  ID     Location Value2 Value3 Value4
  <chr>     <dbl>  <dbl>  <dbl>  <dbl>
1 First       2        4    3        3
2 Fourth      1        4    5        1
3 Second      1.5      3    5        4
4 Third       5        4    6.5      6

Data

df <- structure(list(ID = c("First", "Second", "Second", "Third", "Third", 
"Fourth"), Location = c(2L, 1L, 2L, 6L, 4L, 1L), Value2 = c(4L, 
2L, 4L, 1L, 7L, 4L), Value3 = c(3L, 5L, 5L, 8L, 5L, 5L), Value4 = c(3L, 
2L, 6L, 8L, 4L, 1L)), class = "data.frame", row.names = c(NA, 
-6L))