I'm working on a project where I need to find the distance between a set of behaviors measured in 3-dimensional space and a pre-identified point in that same space. I wrote a function that calculates the distance between the point and a single behavior, and it works when I apply it to one behavior. However, I need to apply it to ~750 behaviors in a larger data frame, so I am hoping to nest the larger behaviors data frame by term and then apply the function to each nested data frame using map_dbl. When I try this, I keep getting the error:
Error: Problem with `mutate()` column `distance`.
ℹ `distance = map_dbl(data, calc_distance_from_beh)`.
x Join columns must be present in data.
x Problem with `dim`.
ℹ The error occurred in row 1.
It seems that when map_dbl is applied to the nested data frames, it isn't able to access the "dim" column to join on, and I'm not sure why.
I've included a reproducible example below with just two behaviors.
Reproducible example:
library(dplyr)  # tibble(), nest_by(), mutate(), left_join()
library(purrr)  # map_dbl()

behaviors <- tibble(term = rep(c("abandon", "abet"), each = 3),
                    estimate = c(-3.31, -0.08, -0.11, 0.03, 0.34, -0.18),
                    dim = c("E", "P", "A", "E", "P", "A"))

optimal_behavior <- tibble(actor = "civil_engineer",
                           object = "civil_engineer",
                           opt_beh = c(1.905645, 0.9960085, -0.17772678),
                           dim = c("E", "P", "A"))

# Sum of squared differences between one behavior and the optimal behavior
calc_distance_from_beh <- function(nested_df){
  optimal_behavior <- as_tibble(optimal_behavior)
  nested_df <- as_tibble(nested_df)
  # Match the E, P and A dimensions of the two data frames
  df_for_calculations <- left_join(optimal_behavior, nested_df, by = "dim")
  df_for_calculations %>%
    mutate(dist = (estimate - opt_beh)^2) %>%
    summarise(total_dist = sum(dist)) %>%
    pull()
}

# This is the call that produces the error
behaviors_distance <- behaviors %>%
  nest_by(term) %>%
  mutate(distance = map_dbl(data, calc_distance_from_beh))
CodePudding user response:
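If the value column is named estimate, just call ungroup() after nest_by(). nest_by() returns a rowwise data frame, so inside mutate() the data column refers to the single tibble for the current row rather than the whole list-column; map_dbl() then iterates over that tibble's columns instead of over the nested data frames, which is why the join cannot find dim.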
library(purrr)
library(dplyr)
behaviors %>%
  nest_by(term) %>%
  ungroup() %>%
  mutate(distance = map_dbl(data, calc_distance_from_beh))
# A tibble: 2 × 3
  term                  data distance
  <chr>   <list<tibble[,2]>>    <dbl>
1 abandon            [3 × 2]    28.4
2 abet               [3 × 2]     3.95
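To see why the rowwise grouping matters, here is a small sketch using the question's behaviors data. It only inspects the nested object; the name one_row is introduced for illustration and is not part of the original code.

```r
library(dplyr)
library(purrr)

behaviors <- tibble(term = rep(c("abandon", "abet"), each = 3),
                    estimate = c(-3.31, -0.08, -0.11, 0.03, 0.34, -0.18),
                    dim = c("E", "P", "A", "E", "P", "A"))

nested <- behaviors %>% nest_by(term)

# nest_by() returns a rowwise data frame
is_rowwise <- inherits(nested, "rowwise_df")

# Under rowwise, `data` inside mutate() is the tibble for the current row.
# `one_row` (hypothetical name) stands in for that value for the first row.
one_row <- nested$data[[1]]

# Mapping over a tibble visits its columns, not its rows, so the function
# would receive the `estimate` vector and then the `dim` vector.
column_classes <- map_chr(one_row, class)
print(is_rowwise)
print(column_classes)
```

Each element handed to the function is therefore a bare vector, and after as_tibble() it has no dim column to join on. Calling ungroup() first makes data the full list-column again, so map_dbl() visits one nested tibble per term.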
Or, instead of map, we may directly apply the function in mutate, as it is rowwise:
behaviors %>%
  nest_by(term) %>%
  mutate(distance = calc_distance_from_beh(data)) %>%
  ungroup()
-output
# A tibble: 2 × 3
  term                  data distance
  <chr>   <list<tibble[,2]>>    <dbl>
1 abandon            [3 × 2]    28.4
2 abet               [3 × 2]     3.95
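As a possible alternative that avoids nesting altogether (a different technique from the two approaches above, not something the original code does), the optimal behavior could be joined onto the long data frame once and the squared differences summed per term. This is a minimal sketch assuming the same behaviors and optimal_behavior tibbles as in the question; the name distances is introduced here for illustration.

```r
library(dplyr)

behaviors <- tibble(term = rep(c("abandon", "abet"), each = 3),
                    estimate = c(-3.31, -0.08, -0.11, 0.03, 0.34, -0.18),
                    dim = c("E", "P", "A", "E", "P", "A"))

optimal_behavior <- tibble(actor = "civil_engineer",
                           object = "civil_engineer",
                           opt_beh = c(1.905645, 0.9960085, -0.17772678),
                           dim = c("E", "P", "A"))

# Join once on `dim`, then sum the squared differences within each term.
# Like calc_distance_from_beh(), this gives the sum of squares (no sqrt).
distances <- behaviors %>%
  left_join(optimal_behavior, by = "dim") %>%
  group_by(term) %>%
  summarise(distance = sum((estimate - opt_beh)^2))

print(distances)
```

With ~750 behaviors this performs a single join rather than one join per term, and it should give the same distance values as the nested versions.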