Based on the data below how can I average the columns InIndividuals
, OutIndividuals
, InAGI
and OutAGI
by FIPS
? I am using FIPS
because there are several counties with the same name. So the column Key
is no longer needed as it's a concatenate of FIPS
and Year
.
Desired output schema/columns:
County State FIPS AvgInIndividuals AvgOutIndividuals AvgInAGI AvgOutAGI
Sample data:
df = structure(list(FIPS = c(12001L, 8001L, 16001L, 12001L,
8001L, 16001L), State = c("FL", "CO", "ID", "FL",
"CO", "ID"), County = c("Alachua County", "Adams County",
"Ada County", "Alachua County", "Adams County", "Ada County"),
InIndividuals = c(433L, 30L, 16L, 381L, 42L, 21L), OutIndividuals = c(426L, 33L, 12L, 382L, 47L, 25L), InAGI = c(111L, 222L, 333L, 444L, 555L, 666L), NetMigration = c(7L, -3L, 4L, -1L, -5L, -4L), OutAGI = c(570L, 246L, 135L, 123L, 456L, 789L), FiscalYear = c("2011 - 2012",
"2011 - 2012", "2011 - 2012", "2011 - 2012", "2010 - 2011",
"2010 - 2011"), Year = c(2012L, 2012L, 2012L, 2011L, 2011L,
2011L), Key = c(120012012L, 80012012L, 160012012L, 120012011L,
80012011L, 160012011L)), class = "data.frame", row.names = c(NA,
-6L))
CodePudding user response:
Try
library(dplyr)
df1 %>%
group_by(Key, County = Origin_Place,
FIPS = Origin_FIPS, State = Origin_StateName) %>%
summarise(across(matches('(Individuals|AGI)$'), mean,
na.rm = TRUE, .names = "Avg{.col}"), .groups = "drop")
-output
# A tibble: 6 × 8
Key County FIPS State AvgInIndividuals AvgOutIndividuals AvgInAGI AvgOutAGI
<int> <chr> <int> <chr> <dbl> <dbl> <dbl> <dbl>
1 80012011 Adams County 8001 CO 42 47 555 456
2 80012012 Adams County 8001 CO 30 33 222 246
3 120012011 Alachua County 12001 FL 381 382 444 123
4 120012012 Alachua County 12001 FL 433 426 111 570
5 160012011 Ada County 16001 ID 21 25 666 789
6 160012012 Ada County 16001 ID 16 12 333 135
CodePudding user response:
aggregate(cbind(InIndividuals, OutIndividuals, InAGI, OutAGI) ~ Key Origin_Place Origin_StateName Origin_FIPS,
data = df,
mean) |>
`colnames<-`(c('Key', 'County', 'State', 'FIPS', 'AvgOutIndividuals', 'AvgInIndividuals', 'AvgInAGI', 'AvgOutAGI'))