`vec_as_location2_result()`: Error in loop, R

Time:01-06

I have written a function that prints descriptive statistics and draws a histogram with a normal density curve for one column of a data frame. I now want to use a loop to run it for every column in the data frame, but I get the warnings and error below:

Warning: numerical expression has 99 elements: only the first used
Warning: numerical expression has 99 elements: only the first used
Error in vec_as_location2_result():
! Can't extract columns past the end.
ℹ Location 849 doesn't exist.
ℹ There are only 25 columns.
Backtrace:

  1. global normality_test(kensington_data_plus_consumption, i)
  2. dplyr:::pull.data.frame(...)
  3. tidyselect::vars_pull(names(.data), !!enquo(var))
  4. tidyselect:::pull_as_location2(loc, n, vars)
  5. vctrs::num_as_location2(i, n = n, negative = "ignore", arg = "var")
  6. vctrs:::vec_as_location2_result(...)
Error in vec_as_location2_result(i, n = n, names = NULL, negative = negative, :

Here is the code for the function and the loop. Libraries used: tidyverse (which includes ggplot2) and pastecs.

test_data <- data.frame (a = c("E01002852", "E01002853", "E01002854", "E01002855", "E01002856", "E01002857", "E01002858"),
                        b = c(998, 715, 523, 755, 694, 510, 661),
                        c = c(2645303, 1844769, 1371527, 1853285, 2017993, 1492991, 1937841),
                        d = c(2659.604, 2580.096, 2622.423, 2907.771, 2927.434, 2931.681, 3357.934),
                        e = c(2004.55, 2121.30, 2100.10, 1942.30, 2285.55, 2103.50, 1999.20),
                        f = c(706, 319, 309, 644, 404, 443, 567)
)

normality_test <- function(data, col) {
  col <- data %>% pull({{col}})
  col_stat <-
   stat.desc(col,
      basic = FALSE,
      desc = FALSE,
      norm = TRUE
    )
   print(col_stat)
   
  data %>%
  ggplot(
    aes(
      x = col
    )
  ) +
    geom_histogram(
      aes(
        y = ..density..
      ),
      binwidth = 15
    ) +
    stat_function(
      fun = dnorm,
      args = list(
        mean = col %>% mean(),
        sd = col %>% sd()
      ),
      colour = "red", size = 1
    )
  }

for (i in test_data$b:test_data$f) {
   normality_test(test_data, i)
  }

CodePudding user response:

You are initiating your loop incorrectly. If you just run:

test_data$b:test_data$f

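You will see that it produces warnings rather than a set of columns. The : operator uses only the first element of each vector, so the expression evaluates to 998:706, a run of integers. pull() then treats each integer as a column position, which is why it fails with "Can't extract columns past the end".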

Warning messages:
1: In test_data$b:test_data$f :
  numerical expression has 7 elements: only the first used
2: In test_data$b:test_data$f :
  numerical expression has 7 elements: only the first used
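To make the effect of the colon operator concrete, here is a small base-R sketch. The short vectors b and f are stand-ins made up from the first three values of test_data$b and test_data$f, and it assumes R's default behaviour of warning (not failing) when : is given vectors.

```r
# Stand-in vectors: the first three values of test_data$b and test_data$f
b <- c(998, 715, 523)
f <- c(706, 319, 309)

# `:` expects two single numbers; given vectors it warns and uses b[1] and f[1]
idx <- suppressWarnings(b:f)

head(idx)    # 998 997 996 995 994 993
length(idx)  # 293, i.e. every integer from 998 down to 706
```

Each of these integers would be passed to the function as a column position, and none of them is a valid position in a data frame with six columns.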

You can first define your columns, then run the loop:

wantcols <- c("b", "c", "d", "e", "f")

for (i in wantcols) {
  normality_test(test_data, i)
}

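In this case, i takes the value of each element of wantcols in turn. Note that a ggplot object is not drawn automatically inside a for loop, so if the histograms do not appear, wrap the call in print(), i.e. print(normality_test(test_data, i)).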

Alternatively, as the comments mention, you could accomplish this more simply with lapply:

lapply(wantcols, function(x) normality_test(test_data, x))
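Unlike the for loop, lapply returns a list with one element per column, which is handy if you want to keep the results. A minimal base-R sketch of that pattern follows; col_summary() is a hypothetical stand-in for normality_test() so the example runs without ggplot2 or pastecs, and the data frame is a cut-down copy of test_data.

```r
# Cut-down copy of test_data
test_data <- data.frame(
  a = c("E01002852", "E01002853", "E01002854"),
  b = c(998, 715, 523),
  f = c(706, 319, 309)
)
wantcols <- c("b", "f")

# Hypothetical stand-in for normality_test(): takes a column name as a string
col_summary <- function(data, col) {
  x <- data[[col]]                # extract the column by name
  c(mean = mean(x), sd = sd(x))
}

# One list element per column, then labelled with the column names
results <- lapply(wantcols, function(x) col_summary(test_data, x))
names(results) <- wantcols

results$b   # named vector with the mean and sd of column b
```

Naming the list afterwards lets you look up a column's result by name instead of by position.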

Also, if you want every column in your data except the first, you can define the columns more easily:

wantcols <- names(test_data)[-1]
# [1] "b" "c" "d" "e" "f"
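If the non-numeric columns are not always in the first position, another option is to pick the columns by type rather than by position. This is only a sketch on a cut-down copy of test_data, and numcols is a name introduced here for illustration.

```r
# Cut-down copy of test_data: one character column, two numeric columns
test_data <- data.frame(
  a = c("E01002852", "E01002853"),
  b = c(998, 715),
  c = c(2645303, 1844769)
)

# Keep only the names of columns for which is.numeric() returns TRUE
numcols <- names(test_data)[vapply(test_data, is.numeric, logical(1))]
numcols
# [1] "b" "c"
```

vapply is used instead of sapply so that the result is guaranteed to be a logical vector with one value per column.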