Does purrr::map change an object type?

I've noticed something strange while doing some regression analysis. When I estimate a regression on its own, then estimate that same regression inside a purrr::map call and extract the element, the two resulting objects are not identical. My question is why this is the case, or whether it SHOULD be the case.

The main reason I ask is that some packages have trouble pulling information from estimations extracted from purrr::map, but not from ones I estimate individually. Here is a small example with some nonsensical regressions:

library(fixest)
library(tidyverse)

## creating a formula for a regression example
formula <- as.formula(paste0(
  "mpg", "~",
  paste("cyl", collapse = " "),
  paste("|"), paste(c("gear", "carb"), collapse = " ")))

## estimating the regression and saving the result
mtcars_formula <- feols(formula, cluster = "gear", data = mtcars)

## estimating the same regression twice, but using map
mtcars_list_map <- map(list("gear", "gear"), ~ feols(formula, cluster = ., data = mtcars))

## extracting the first element of the list
is_identical_1 <- mtcars_list_map %>% 
  pluck(1)


## THESE ARE NOT IDENTICAL
identical(mtcars_formula, is_identical_1)

I am tagging this with fixest package as well, only because this may be package specific...

CodePudding user response：

The differences largely come down to differences in environment. For example, the third element of these lists (i.e. of mtcars_formula and is_identical_1) is the formula mpg~cyl, and in fact mtcars_formula[[3]] == is_identical_1[[3]] returns TRUE. However, the two formulas are associated with different environments:

> mtcars_formula[[3]] == is_identical_1[[3]]
[1] TRUE
> environment(mtcars_formula[[3]])
<environment: 0x560a2490ef40>
> environment(is_identical_1[[3]])
<environment: 0x560a2554d810>
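
The same effect can be reproduced without fixest or purrr. The following is a minimal base R sketch, not the fixest internals: make_formula is a hypothetical helper introduced only for illustration, standing in for any code that builds a formula inside a function call.

```r
# Hypothetical helper: the formula is created inside a function call,
# so it captures that call's environment rather than the caller's.
make_formula <- function() mpg ~ cyl

f_direct <- mpg ~ cyl        # created at top level
f_inside <- make_formula()   # created inside a function

# The deparsed text of the two formulas is the same ...
same_text <- identical(deparse(f_direct), deparse(f_inside))

# ... but identical() also compares attributes, including the environment.
same_object <- identical(f_direct, f_inside)

# Pointing both formulas at the same environment removes the difference.
environment(f_inside) <- environment(f_direct)
same_after_align <- identical(f_direct, f_inside)

print(c(same_text = same_text,
        same_object = same_object,
        same_after_align = same_after_align))
```

So a FALSE from identical() on two fitted models does not by itself mean the estimates differ; it can simply reflect where each formula was created.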

Whether you consider these differences "trivial" depends on your use case, but you can check which elements differ like this:

## collect every element that differs between the two fitted objects
differences <- list()
for (i in seq_along(mtcars_formula)) {
  if (!identical(mtcars_formula[[i]], is_identical_1[[i]])) {
    differences[[names(mtcars_formula)[i]]] <- list(mtcars_formula[[i]], is_identical_1[[i]])
  }
}

One element that is indeed different is the reported call (the 4th element):

> mtcars_formula[[4]] == is_identical_1[[4]]
[1] FALSE
> c(mtcars_formula[[4]], is_identical_1[[4]])
[[1]]
feols(fml = formula, data = mtcars, cluster = "gear")

[[2]]
feols(fml = formula, data = mtcars, cluster = .)

This may have something to do with the error you report in the comments above, associated with fwildclusterboot::boottest(). Note that the call stored in the object created using map() records cluster = . instead of cluster = "gear".
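
To see why the stored call contains `.` rather than "gear", here is a small base R sketch. It assumes the call is recorded the way match.call() records it, i.e. as unevaluated argument expressions; fit_toy is a hypothetical stand-in for feols, and lapply with an argument named `.` stands in for the purrr lambda.

```r
# Hypothetical stand-in for a model-fitting function that stores its call.
fit_toy <- function(cluster) {
  list(call = match.call(), cluster = cluster)
}

# Called directly: the argument expression is the literal string "gear".
direct <- fit_toy(cluster = "gear")

# Called through an anonymous function: the argument expression is the
# symbol `.`, even though it evaluates to "gear".
mapped <- lapply(list("gear"), function(.) fit_toy(cluster = .))[[1]]

print(direct$call)
print(mapped$call)

# The evaluated values agree; only the recorded expressions differ.
print(identical(direct$cluster, mapped$cluster))
```

Any downstream code that re-evaluates the stored call outside the anonymous function has no `.` to look up, which would be one plausible source of such errors.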

One way to get around this would be to do something like this:

mtcars_list_map <- map(list("gear", "gear"), function(x) {
  # create the model
  model = feols(formula, cluster = x, data = mtcars)
  # manipulate the call object
  model$call$cluster=x
  # return the model
  model
})
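
The effect of patching the call can be checked on a toy function before trying it on real models. This is only a sketch under the same assumption as above (the call is stored via match.call()); fit_toy is hypothetical, and whether a given downstream package reads the cluster argument from the call is not something this verifies.

```r
# Hypothetical stand-in for a model-fitting function that stores its call.
fit_toy <- function(cluster) {
  list(call = match.call(), cluster = cluster)
}

patched <- lapply(list("gear", "carb"), function(x) {
  model <- fit_toy(cluster = x)
  # Overwrite the recorded symbol `x` with the value it held.
  model$call$cluster <- x
  model
})

# Each stored call now carries the literal cluster name.
calls <- vapply(patched, function(m) deparse(m$call), character(1))
print(calls)
```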