Resampling and replacement while sequentially adding levels of a factor

Time:04-06

I am trying to run a function while sequentially adding sites (x_i) to a data frame, returning the statistic together with its confidence interval at each step. For example, I want to fit a linear model, adding one site at every iteration, to better understand how the additional data from each site influences the fit. However, for each iteration I want to include every possible combination of sites of that size, so that I can obtain a confidence interval for that iteration. In its current form, my code randomly samples one set of sites per iteration, but not all possible sets for a given x_i.

I know this particular issue could be addressed with the 'dredge' function. However, ideally I would set this up in a way so that I could easily [with some adjustment] replace the current linear model function with any other function (e.g., metaMDS, diversity).

I am sure there is a better way to perform this, but I am a relative newbie to these types of analyses. Any suggestions would be greatly appreciated!

Edit: I have been considering passing the below function through 'boot' although I haven't quite been able to get this loop to function in boot.

# data
set.seed(45)
dat <- data.frame(site=rep(LETTERS[1:6],3),mean=sample(1:20,18),rich=sample(5:32,18))

model<-lm(mean~rich,dat) # the full model
summary(model) 


my_vec <- numeric()   # R^2 from each iteration (numeric, so values are not coerced to text)
my_site <- integer()  # number of sites used in each iteration

for (i in seq_len(6)) { # increase number of sites at each iteration

  # randomly pick i sites and subset the data to them (base R, no pipe needed)
  dat_seq <- subset(dat, site %in% sample(levels(as.factor(site)), i))

  model <- lm(mean ~ rich, dat_seq)
  my_vec <- c(my_vec, summary(model)$r.squared)
  my_site <- c(my_site, i)

}

# build the results table once, after the loop has finished
lm_results <- data.frame(sync = my_vec, site_no = my_site)

CodePudding user response:

Something like this might help. Here I generate every combination of sites in the dataset (the combs list), then use lapply to fit the model to the subset of the data corresponding to each combination. For each one, the lower and upper confidence limits for the slope of rich and the R^2 are returned.

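x <- unique(dat$site)
# all combinations of 1, 2, ..., length(x) sites, flattened into one list
combs <- do.call(c, lapply(seq_along(x), combn, x = x, simplify = FALSE))

# fit the model to each combination and return one row per fit
do.call(rbind, lapply(combs, function(s) {
  dat2 <- dat[dat$site %in% s, ]
  mod <- lm(mean ~ rich, dat2)
  ci <- confint(mod)["rich", ]  # default 95% interval for the slope of rich
  data.frame(sites = paste(s, collapse = ""),
             lci = ci[[1]],
             uci = ci[[2]],
             r2 = summary(mod)$r.squared)
}))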


    sites        lci          uci           r2
1       A -8.3174474  7.221600752 0.4453499992
2       B -5.5723683  5.818599482 0.0701472479
3       C -1.8397082  1.928749330 0.0826810176
4       D -3.5504781  2.253774792 0.8895987733
5       E -1.9782218  0.783889792 0.9679338880
6       F -0.3642690  0.202676480 0.9291569087
7      AB -1.0726850  0.631838143 0.1141900799
8      AC -1.0156746  0.486238667 0.1932050717
9      AD -1.3744991  0.089962986 0.5972134174
10     AE -1.3425429  0.359346030 0.3914262598
11     AF -1.2542336  1.094735972 0.0088070439
12     BC -0.3148719  0.536493520 0.1155061842
13     BD -0.8115027  0.263460008 0.3337377806
14     BE -1.0264258  0.376744253 0.2923566879
15     BF -1.1047222  0.961865064 0.0091250127
16     CD -0.9745928  0.341039802 0.3088694252
17     CE -0.9413738  0.549038074 0.1178103209
18     CF -0.8967742  1.165648399 0.0317149663
19     DE -0.8081655 -0.063530819 0.7253472880
20     DF -0.4928491  0.673804531 0.0443092831
21     EF -0.9565739  0.524655918 0.1407909531
22    ABC -0.5962015  0.353999681 0.0493374108
23    ABD -0.8365224  0.110852413 0.3191087122
24    ABE -0.8760695  0.210841908 0.2303024575
25    ABF -0.8266745  0.633602031 0.0137712837
26    ACD -0.9065180  0.066518021 0.3731538462
27    ACE -0.8472338  0.235549937 0.2031338155
28    ACF -0.7522162  0.720252734 0.0003762516
29    ADE -0.9661169 -0.041025998 0.4863258317
30    ADF -0.7657306  0.559208857 0.0190378530
31    AEF -0.8971295  0.489083497 0.0647322193
32    BCD -0.5771897  0.206912590 0.1511964736
33    BCE -0.5802808  0.341276672 0.0509875519
34    BCF -0.5806002  0.737926299 0.0112444750
35    BDE -0.6864459  0.004527069 0.4375645756
36    BDF -0.5930715  0.460544893 0.0124799554
37    BEF -0.8077064  0.411788016 0.0776553121
38    CDE -0.7399438  0.108174895 0.3071099077
39    CDF -0.5535068  0.623295610 0.0028013813
40    CEF -0.6905084  0.598692027 0.0040352416
41    DEF -0.5691343  0.342877359 0.0468583354
42   ABCD -0.6438371  0.095450002 0.2145588181
43   ABCE -0.6248798  0.195737009 0.1195408994
44   ABCF -0.5714679  0.519529413 0.0011238991
45   ABDE -0.7459710 -0.015192501 0.3500598278
46   ABDF -0.6397934  0.354865639 0.0391438801
47   ABEF -0.7297368  0.343203399 0.0605325928
48   ACDE -0.7739688  0.003126375 0.3281841191
49   ACDF -0.6236834  0.433241141 0.0158627591
50   ACEF -0.6696598  0.429949692 0.0230490498
51   ADEF -0.6839477  0.287476657 0.0763805047
52   BCDE -0.5735044  0.083072486 0.2169111169
53   BCDF -0.4853537  0.426339044 0.0020758928
54   BCEF -0.5621108  0.444630022 0.0067151679
55   BDEF -0.5715836  0.240391871 0.0762941714
56   CDEF -0.5364817  0.363030081 0.0181252387
57  ABCDE -0.6208064  0.020647714 0.2391257190
58  ABCDF -0.5292293  0.315066335 0.0225784375
59  ABCEF -0.5621816  0.333684980 0.0228222717
60  ABDEF -0.6093804  0.195345360 0.0867885013
61  ACDEF -0.5890752  0.262323665 0.0502230537
62  BCDEF -0.4898635  0.265972273 0.0305394982
63 ABCDEF -0.5239122  0.198342387 0.0539903463
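
The question also asks for one summary per number of sites and for a setup where the model can be swapped for another function. A minimal sketch of one way to do that is below. The helper name stat_by_combination, its arguments, and the choice of the 2.5% and 97.5% empirical quantiles are all assumptions made for illustration, not part of the answer above. Note that the interval describes the spread of the statistic across site combinations of a given size, which is not the same thing as a model-based confidence interval.

```r
# Rebuild the example data from the question.
set.seed(45)
dat <- data.frame(site = rep(LETTERS[1:6], 3),
                  mean = sample(1:20, 18),
                  rich = sample(5:32, 18))

# Hypothetical helper: apply stat_fun to every combination of sites.
# stat_fun must accept a data frame and return a single number.
stat_by_combination <- function(data, stat_fun, site_col = "site") {
  sites <- unique(data[[site_col]])
  # every combination of 1, 2, ..., length(sites) sites, as one flat list
  combs <- do.call(c, lapply(seq_along(sites), combn, x = sites, simplify = FALSE))
  data.frame(
    n_sites = lengths(combs),
    sites   = vapply(combs, paste, character(1), collapse = ""),
    stat    = vapply(combs, function(s) {
      stat_fun(data[data[[site_col]] %in% s, , drop = FALSE])
    }, numeric(1))
  )
}

# Example statistic: R^2 of the linear model used in the question.
r2_fun <- function(d) summary(lm(mean ~ rich, data = d))$r.squared

res <- stat_by_combination(dat, r2_fun)

# Summarise across combinations for each number of sites.
# The 2.5% / 97.5% quantiles are an assumed choice of interval.
by_n <- split(res$stat, res$n_sites)
summ <- data.frame(
  n_sites = as.integer(names(by_n)),
  n_combs = lengths(by_n),
  mean    = vapply(by_n, mean, numeric(1)),
  lower   = vapply(by_n, quantile, numeric(1), probs = 0.025, names = FALSE),
  upper   = vapply(by_n, quantile, numeric(1), probs = 0.975, names = FALSE)
)
summ
```

To use a different statistic, pass any other function that takes the subsetted data frame and returns one number in place of r2_fun. With six sites there are only 63 combinations; the count grows as 2^n - 1, so for many sites a random sample of combinations per size may be more practical than enumerating them all.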