Home > Software engineering >  R loop over nominal list and integers
R loop over nominal list and integers

Time:01-10

I have a dataset where I have been able to loop over different test values with dpois. For simplicity's sake, I have used an average of 4 events per month and I wanted to know what is the likelihood of n or more events, given the average. Here is what I have managed to make work:

MonthlyAverage <- 4
cnt <- c(0:10)
for (i in cnt) {
    CountProb <- ppois(cnt,MonthlyAverage,lower.tail=FALSE)
}
dfProb <- data.frame(cnt,CountProb)

I am interested in investigating this to figure out how many events I may expect each month given the mean of that month.

enter image description here

I would be looking to say:

For January, what is the probability of 0

For January, what is the probability of 1

For January, what is the probability of 2

etc...

For February, what is the probability of 0

For February, what is the probability of 1

For February, what is the probability of 2

etc.

To give something like (numbers here are just an example):

enter image description here

I thought about trying one loop to select the correct month and then remove the month column so I am just left with the single "Monthly Average" value and then performing the count loop, but that doesn't seem to work. I still get "Non-numeric argument to mathematical function". I feel like I'm close, but can anyone please point me in the right direction for the formatting?

CodePudding user response:

How about something like this:

cnt <- 0:10
MonthlyAverage <- c(1.8, 1.56, 2.44, 1.86, 2.1, 2.3, 2, 2.78, 1.89, 1.86, 1.4, 1.71)

grid <- expand.grid(cnt =cnt, m_num = 1:12)
grid$MonthlyAverage <- MonthlyAverage[grid$m_num]
mnames <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
grid$month <- mnames[grid$m_num]
grid$prob <- ppois(grid$cnt, grid$MonthlyAverage, lower.tail=FALSE)
grid[,c("month", "cnt", "prob")]
#>     month cnt         prob
#> 1     Jan   0 8.347011e-01
#> 2     Jan   1 5.371631e-01
#> 3     Jan   2 2.693789e-01
#> 4     Jan   3 1.087084e-01
#> 5     Jan   4 3.640666e-02
#> 6     Jan   5 1.037804e-02
#> 7     Jan   6 2.569450e-03
#> 8     Jan   7 5.615272e-04
#> 9     Jan   8 1.097446e-04
#> 10    Jan   9 1.938814e-05
#> 11    Jan  10 3.123964e-06
#> 12    Feb   0 7.898639e-01
#> 13    Feb   1 4.620517e-01
#> 14    Feb   2 2.063581e-01
#> 15    Feb   3 7.339743e-02
#> 16    Feb   4 2.154277e-02
#> 17    Feb   5 5.364120e-03
#> 18    Feb   6 1.157670e-03
#> 19    Feb   7 2.202330e-04
#> 20    Feb   8 3.743272e-05
#> 21    Feb   9 5.747339e-06
#> 22    Feb  10 8.044197e-07
#> 23    Mar   0 9.128391e-01
#> 24    Mar   1 7.001667e-01
#> 25    Mar   2 4.407062e-01
#> 26    Mar   3 2.296784e-01
#> 27    Mar   4 1.009515e-01
#> 28    Mar   5 3.813271e-02
#> 29    Mar   6 1.258642e-02
#> 30    Mar   7 3.681711e-03
#> 31    Mar   8 9.657751e-04
#> 32    Mar   9 2.294546e-04
#> 33    Mar  10 4.979244e-05
#> 34    Apr   0 8.443274e-01
#> 35    Apr   1 5.547763e-01
#> 36    Apr   2 2.854938e-01
#> 37    Apr   3 1.185386e-01
#> 38    Apr   4 4.090445e-02
#> 39    Apr   5 1.202455e-02
#> 40    Apr   6 3.071778e-03
#> 41    Apr   7 6.928993e-04
#> 42    Apr   8 1.398099e-04
#> 43    Apr   9 2.550478e-05
#> 44    Apr  10 4.244028e-06
#> 45    May   0 8.775436e-01
#> 46    May   1 6.203851e-01
#> 47    May   2 3.503686e-01
#> 48    May   3 1.613572e-01
#> 49    May   4 6.212612e-02
#> 50    May   5 2.044908e-02
#> 51    May   6 5.862118e-03
#> 52    May   7 1.486029e-03
#> 53    May   8 3.373058e-04
#> 54    May   9 6.927041e-05
#> 55    May  10 1.298297e-05
#> 56    Jun   0 8.997412e-01
#> 57    Jun   1 6.691458e-01
#> 58    Jun   2 4.039612e-01
#> 59    Jun   3 2.006529e-01
#> 60    Jun   4 8.375072e-02
#> 61    Jun   5 2.997569e-02
#> 62    Jun   6 9.361934e-03
#> 63    Jun   7 2.588841e-03
#> 64    Jun   8 6.415773e-04
#> 65    Jun   9 1.439431e-04
#> 66    Jun  10 2.948727e-05
#> 67    Jul   0 8.646647e-01
#> 68    Jul   1 5.939942e-01
#> 69    Jul   2 3.233236e-01
#> 70    Jul   3 1.428765e-01
#> 71    Jul   4 5.265302e-02
#> 72    Jul   5 1.656361e-02
#> 73    Jul   6 4.533806e-03
#> 74    Jul   7 1.096719e-03
#> 75    Jul   8 2.374473e-04
#> 76    Jul   9 4.649808e-05
#> 77    Jul  10 8.308224e-06
#> 78    Aug   0 9.379615e-01
#> 79    Aug   1 7.654944e-01
#> 80    Aug   2 5.257652e-01
#> 81    Aug   3 3.036162e-01
#> 82    Aug   4 1.492226e-01
#> 83    Aug   5 6.337975e-02
#> 84    Aug   6 2.360590e-02
#> 85    Aug   7 7.809999e-03
#> 86    Aug   8 2.320924e-03
#> 87    Aug   9 6.254093e-04
#> 88    Aug  10 1.540564e-04
#> 89    Sep   0 8.489282e-01
#> 90    Sep   1 5.634025e-01
#> 91    Sep   2 2.935807e-01
#> 92    Sep   3 1.235929e-01
#> 93    Sep   4 4.327373e-02
#> 94    Sep   5 1.291307e-02
#> 95    Sep   6 3.349459e-03
#> 96    Sep   7 7.672845e-04
#> 97    Sep   8 1.572459e-04
#> 98    Sep   9 2.913775e-05
#> 99    Sep  10 4.925312e-06
#> 100   Oct   0 8.443274e-01
#> 101   Oct   1 5.547763e-01
#> 102   Oct   2 2.854938e-01
#> 103   Oct   3 1.185386e-01
#> 104   Oct   4 4.090445e-02
#> 105   Oct   5 1.202455e-02
#> 106   Oct   6 3.071778e-03
#> 107   Oct   7 6.928993e-04
#> 108   Oct   8 1.398099e-04
#> 109   Oct   9 2.550478e-05
#> 110   Oct  10 4.244028e-06
#> 111   Nov   0 7.534030e-01
#> 112   Nov   1 4.081673e-01
#> 113   Nov   2 1.665023e-01
#> 114   Nov   3 5.372525e-02
#> 115   Nov   4 1.425330e-02
#> 116   Nov   5 3.201149e-03
#> 117   Nov   6 6.223149e-04
#> 118   Nov   7 1.065480e-04
#> 119   Nov   8 1.628881e-05
#> 120   Nov   9 2.248494e-06
#> 121   Nov  10 2.828495e-07
#> 122   Dec   0 8.191342e-01
#> 123   Dec   1 5.098537e-01
#> 124   Dec   2 2.454189e-01
#> 125   Dec   3 9.469102e-02
#> 126   Dec   4 3.025486e-02
#> 127   Dec   5 8.217692e-03
#> 128   Dec   6 1.937100e-03
#> 129   Dec   7 4.028407e-04
#> 130   Dec   8 7.489285e-05
#> 131   Dec   9 1.258275e-05
#> 132   Dec  10 1.927729e-06

Created on 2023-01-09 by the reprex package (v2.0.1)

CodePudding user response:

a "tidy-style" solution:

library(tidyr)
library(dplyr)

## example data:
df <- data.frame(Month = c('Jan', 'Feb'),
                 MonthlyAverage = c(5, 2)
                 )
> df
  Month MonthlyAverage
1   Jan              5
2   Feb              2
df |>
  mutate(n = list(1:10)) |>
  unnest_longer(n) |>
  mutate(CountProb = ppois(n, MonthlyAverage,
                           lower.tail=FALSE
                           )
         )
# A tibble: 20 x 4
   Month MonthlyAverage     n  CountProb
   <chr>          <dbl> <int>      <dbl>
 1 Jan                5     1 0.960     
 2 Jan                5     2 0.875     
 3 Jan                5     3 0.735     
 4 Jan                5     4 0.560     
 5 Jan                5     5 0.384     
 6 Jan                5     6 0.238     
## ...

CodePudding user response:

If you have each month's mean, in base R you could easily use sapply to estimate the probability of obtaining values 0 to 10 using each month's mean value. Then you can simply combine it in a data frame:

# Data
df <- data.frame(month = month.name,
                 mean = c(1.8, 2.8, 1.7, 1.6, 1.8, 2, 
                          2.3, 2.4, 2.1, 1.4, 1.9, 1.9))

probs <- sapply(1:12, function(x) ppois(0:10, df$mean[x], lower.tail = FALSE))

finaldata <- data.frame(month = rep(month.name, each = 11), 
           events = rep(0:10, times = 12),
           prob = prob = as.vector(probs))

Output:

#         month events         prob
# 1     January      0 8.347011e-01
# 2     January      1 5.371631e-01
# 3     January      2 2.693789e-01
# 4     January      3 1.087084e-01
# 5     January      4 3.640666e-02
# 6     January      5 1.037804e-02
# 7     January      6 2.569450e-03
# 8     January      7 5.615272e-04
# 9     January      8 1.097446e-04
# 10    January      9 1.938814e-05
# 11    January     10 3.123964e-06
# 12   February      0 9.391899e-01
# 13   February      1 7.689218e-01
# 14   February      2 5.305463e-01
# 15   February      3 3.080626e-01
# ...
# 131  December      9 3.044317e-05
# 132  December     10 5.172695e-06
  • Related