I'm not great at loops, but I'm trying to get better at working through them. I am using tidycensus to pull in a few variables across several years (the dummy data in the example below is representative). For a given set of selected variables (dv_acs), I want to pull the matching entries from the comprehensive codebook that load_variables provides for every year, and then full_join them. In most cases the information will be the same from year to year, but I want the complete set so I can double-check it and note any discrepancies.
Here is the setup, which is working:
library(tidycensus)
library(dplyr)
#getting codebook for all ACS years for every single variable possible
for(x in c(2009:2020)) {
filename <- paste0("v", x)
assign(filename, (load_variables(x, "acs5", cache = TRUE)))
}
#selecting and recoding variables to pull in
dv_acs = c(
hus = "B25002_001",
husocc = "B25002_002",
husvac = "B25002_003"
)
This accomplishes what I want one year at a time; from there I could just full_join the results piece by piece:
#creating a codebook a year at a time for variables I'm interested in
codebook <- v2009 %>%
filter(name %in% dv_acs) %>%
mutate(id = names(dv_acs), .before = 1)
colnames(codebook) = c("id", "name", "label_2009", "concept_2009")
codebook2 <- v2010 %>%
filter(name %in% dv_acs) %>%
mutate(id = names(dv_acs), .before = 1)
colnames(codebook2) = c("id", "name", "label_2010", "concept_2010")
codebook <- full_join(codebook, codebook2, by=c("id", "name"))
And here is where I try, and fail, to write a loop that builds the codebook for my specific variables across all years in one go:
#creating a loop to pull in and join a codebook for all years
for(x in c(2009:2010)){
codebook <- data.frame(matrix(ncol = 2, nrow = 0)) #create a master file I can join the files to as they load in through the loop
colnames(codebook) <- c("id", "name") #giving right label names
filename <- paste0("v", x) #this is where I'm starting to have trouble; this saves as a value, and I can't then use it to call the dataframe
temp <- filename %>% (name %in% dv_acs) %>%
mutate(id = names(dv_acs), .before = 1)
colnames(temp) <- c("id", "name", paste0("label_", x), paste0("concept_", x))
codebook <- full_join(codebook, temp, by=c("id", "name"))
}
Reported error is: "Error in name %in% dv_acs : object 'name' not found"
CodePudding user response:
It is better not to create these objects in the global environment; they could be stored in a list instead. Here, since the objects already exist, their values can be retrieved into a named list with mget:
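As a minimal illustration of what mget does, here is a sketch with two toy data frames. The names d2009 and d2010 and their contents are made up for this example and are not real ACS output. It also shows why piping the string from paste0() in the question does not work: a string is only the name of the object, and get() or mget() is needed to fetch the object itself.

```r
# Toy stand-ins for per-year codebooks (hypothetical data, not real ACS output)
d2009 <- data.frame(name = c("B25002_001", "B25002_002"),
                    label = c("Total", "Occupied"))
d2010 <- data.frame(name = c("B25002_001", "B25002_002"),
                    label = c("Total", "Occupied"))

# paste0() only builds character strings; these are names, not data frames
obj_names <- paste0("d", 2009:2010)
class(obj_names)   # "character"

# get() fetches one object by name; mget() fetches several as a named list
one <- get("d2009")
lst <- mget(obj_names)

names(lst)         # "d2009" "d2010"
class(lst$d2009)   # "data.frame"
```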
library(stringr)
library(purrr)
library(dplyr)
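# collect the existing v2009 ... v2020 objects into a named list
out <- mget(str_c("v", 2009:2020)) %>%
  imap(~ {
    # .y is the list name, e.g. "v2009"; build "label2009", "concept2009"
    nm <- str_c(c("label", "concept"), str_remove(.y, "v"))
    .x %>%
      select(-any_of("geography")) %>%
      filter(name %in% dv_acs) %>%
      # match ids by variable code rather than relying on row order
      mutate(id = names(dv_acs)[match(name, dv_acs)], .before = 1) %>%
      rename_with(~ nm, c("label", "concept"))
  }) %>%
  # join the yearly tables one after another on the shared key columns
  reduce(full_join, by = c("id", "name"))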
-output
> out
# A tibble: 3 × 26
id name label…¹ conce…² label…³ conce…⁴ label…⁵ conce…⁶ label…⁷ conce…⁸ label…⁹ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 hus B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
2 huso… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
3 husv… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
# … with 5 more variables: concept2018 <chr>, label2019 <chr>, concept2019 <chr>, label2020 <chr>, concept2020 <chr>, and abbreviated variable names ¹label2009,
# ²concept2009, ³label2010, ⁴concept2010, ⁵label2011, ⁶concept2011, ⁷label2012, ⁸concept2012, ⁹label2013, ˟concept2013, ˟label2014, ˟concept2014, ˟label2015,
# ˟concept2015, ˟label2016, ˟concept2016, ˟label2017, ˟concept2017, ˟label2018
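If you would rather keep the for loop from the question, the sketch below shows one way it could be repaired. It is only a sketch: the toy() helper and the codebooks list are hypothetical stand-ins for the load_variables() results, with made-up labels, so that the example runs without calling the Census API.

```r
library(dplyr)

# Hypothetical stand-in for one year's codebook (made-up labels)
toy <- function() {
  tibble(
    name    = c("B25002_001", "B25002_002", "B25002_003", "B25003_001"),
    label   = c("Total", "Occupied", "Vacant", "Other"),
    concept = c("OCCUPANCY STATUS", "OCCUPANCY STATUS",
                "OCCUPANCY STATUS", "TENURE")
  )
}
codebooks <- list(v2009 = toy(), v2010 = toy())

dv_acs <- c(hus = "B25002_001", husocc = "B25002_002", husvac = "B25002_003")

# Create the master table once, BEFORE the loop, from the variables of interest
codebook <- tibble(id = names(dv_acs), name = unname(dv_acs))

for (x in 2009:2010) {
  temp <- codebooks[[paste0("v", x)]] %>%   # look the data frame up by name
    filter(name %in% dv_acs) %>%
    select(name, label, concept) %>%
    # append the year: label -> label_2009, concept -> concept_2009
    rename_with(~ paste0(.x, "_", x), all_of(c("label", "concept")))
  # the master table already holds every variable, so a left join keeps
  # exactly one row per variable
  codebook <- left_join(codebook, temp, by = "name")
}

names(codebook)
# "id" "name" "label_2009" "concept_2009" "label_2010" "concept_2010"
```

Three things differ from the loop in the question. The master table is created once, outside the loop, instead of being reset on every iteration. The data frame is looked up by its name instead of piping the string itself; with the real objects, get(paste0("v", x)) or the mget list above would take the place of the toy list. Finally, the filter() call is restored: in `filename %>% (name %in% dv_acs)` the expression `name %in% dv_acs` is evaluated outside any data frame, which is why R reports that object 'name' is not found.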
If we want to build everything inside the pipeline, without having to create objects in the global env:
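out <- map(2009:2020, ~ {
  # .x is the year; build "label_2009", "concept_2009", ...
  nm <- str_c(c("label", "concept"), "_", .x)
  load_variables(.x, "acs5") %>%
    select(-any_of("geography")) %>%
    filter(name %in% dv_acs) %>%
    # match ids by variable code rather than relying on row order
    mutate(id = names(dv_acs)[match(name, dv_acs)], .before = 1) %>%
    rename_with(~ nm, c("label", "concept"))
}) %>%
  # join the yearly tables one after another on the shared key columns
  reduce(full_join, by = c("id", "name"))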
-output
> out
# A tibble: 3 × 26
id name label…¹ conce…² label…³ conce…⁴ label…⁵ conce…⁶ label…⁷ conce…⁸ label…⁹ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟ conce…˟ label…˟
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 hus B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
2 huso… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
3 husv… B250… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima… OCCUPA… Estima…
# … with 5 more variables: concept_2018 <chr>, label_2019 <chr>, concept_2019 <chr>, label_2020 <chr>, concept_2020 <chr>, and abbreviated variable names
# ¹label_2009, ²concept_2009, ³label_2010, ⁴concept_2010, ⁵label_2011, ⁶concept_2011, ⁷label_2012, ⁸concept_2012, ⁹label_2013, ˟concept_2013, ˟label_2014,
# ˟concept_2014, ˟label_2015, ˟concept_2015, ˟label_2016, ˟concept_2016, ˟label_2017, ˟concept_2017, ˟label_2018
# ℹ Use `colnames()` to see all variable names
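Since the goal is to note discrepancies between years, one possible follow-up is to stack the yearly codebooks in long form and count the distinct labels per variable. This is a rough sketch on made-up data: the cb list and its labels are invented, and the 2010 label for B25002_003 is altered on purpose so that there is something to flag.

```r
library(dplyr)

# Hypothetical per-year codebooks; the 2010 label for B25002_003 is
# deliberately different so the check has something to find
cb <- list(
  `2009` = tibble(name  = c("B25002_001", "B25002_003"),
                  label = c("Estimate!!Total", "Estimate!!Total!!Vacant")),
  `2010` = tibble(name  = c("B25002_001", "B25002_003"),
                  label = c("Estimate!!Total", "Estimate!!Total:!!Vacant"))
)

# Stack the years into one long table, then keep the variables whose
# label is not identical in every year
changed <- bind_rows(cb, .id = "year") %>%
  group_by(name) %>%
  summarise(n_labels = n_distinct(label), .groups = "drop") %>%
  filter(n_labels > 1)

changed$name   # "B25002_003"
```

The same idea applies to the concept column; any variable with more than one distinct value across years is worth checking by hand.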