So I have this code that does a for loop but doesn't actually update my dataframe after the first run. If I do two runs without the for loop it works just fine. I'm probably overlooking something obvious but really don't see it.
Here is the code:
#preparation
data <- c("MARRIED", "DIVORCED", "MARRIED", "SEPARATED", "DIVORCED", "NEVER MARRIED", "DIVORCED", "DIVORCED", "NEVER MARRIED", "MARRIED", "MARRIED", "MARRIED", "SEPARATED", "DIVORCED", "NEVER MARRIED", "NEVER MARRIED", "DIVORCED", "DIVORCED", "MARRIED")
observed <- c(table(data))
n <- sum(observed)
k <- length(observed)
expProb =rep(1/k, k)
pObs = dmultinom(sort(observed, decreasing=TRUE), size=n, expProb)
counts <- seq(0, n, by = 1)
kCounts <- matrix(, nrow=n+1, ncol=k)
for (i in 1:k){
kCounts[,i] <- counts
}
all_perm <- merge(kCounts[,1], as.data.frame(kCounts[,2]),all=TRUE)
all_perm <- all_perm[rowSums(all_perm) <= n,]
#THE FOR LOOP THAT DOESN'T WORK
for (i in 3:k){
print(i)
all_perm <- merge(all_perm, as.data.frame(kCounts[,i]),all=TRUE)
all_perm <- all_perm[rowSums(all_perm) <= n,]
print(dim(all_perm))
}
This nicely prints the correct i (3 and 4), but all_perm still has 3 columns instead of 4. The number of rows does change.
If I run the two (3 and 4) directly it does work, i.e. replacing the #THE FOR LOOP THAT DOESN'T WORK part to:
all_perm <- merge(all_perm, as.data.frame(kCounts[,3]),all=TRUE)
all_perm <- all_perm[rowSums(all_perm) <= n,]
all_perm <- merge(all_perm, as.data.frame(kCounts[,4]),all=TRUE)
all_perm <- all_perm[rowSums(all_perm) <= n,]
dim(all_perm)
It correctly shows that all_perm now has 4 columns.
I really don't get why the for loop doesn't work. I tried also a while loop but also that doesn't work. Any help would be appreciated.
Purpose: This code is part of a function I'm writing for myself to perform a multinomial test. It's just a theoretical exercise to understand how the test works. The easy way to perform a multinomial test is to use either the 'multinomial.test' function from the EMT library or the 'xmulti' function from the xnomial library.
Answer:
Assuming you wanted a dataframe of k columns, where each row is a unique k-element sample (with replacement) of the numbers 0:n and the rowSum does not exceed n, here's one "loopless" approach:
- get the combinations and transpose the resulting matrix (k rows, one column per combination) so that each combination becomes a row of k values:
all_perms <- t(combn(rep(0:n, times = k), k))
> all_perms |> head(3)
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    0    1    2    4
[3,]    0    1    2    5
> nrow(all_perms)
[1] 1581580
- keep only rows with row sums <= n
all_perms <-
subset(all_perms,
rowSums(all_perms) <= n
)
> nrow(all_perms)
[1] 81769
- if needed, convert to dataframe and sort (V1 changing fastest):
library(dplyr) ## for convenient multi-column sort
all_perms <-
all_perms |>
as.data.frame() |>
arrange(V4, V3, V2, V1)
> all_perms |> head(3)
  V1 V2 V3 V4
1  0  0  0  0
2  1  0  0  0
3  2  0  0  0
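As for why the original loop stays at 3 columns, here is my reading of the symptom (an assumption, not something confirmed in the question). `as.data.frame(kCounts[,i])` names its single column after the literal expression, so every pass of the loop produces a column with the same name. From the second pass on, `merge()` finds that name already present in `all_perm` and joins on it instead of building a cross join, so no column is added. When the calls are written out with `3` and `4`, the names differ and the cross join happens. A minimal sketch of a loop that avoids this by naming each column explicitly; the names `V1`, `V2`, ... and the helper variable `new_col` are my own choices, and `n` and `k` are hard-coded to the values from the question:

```r
n <- 19  # total number of observations, as in the question
k <- 4   # number of categories, as in the question
counts <- 0:n

# Show the naming problem: the column name comes from the expression text,
# not from the value of i, so both names below are the same.
kCounts <- matrix(counts, nrow = n + 1, ncol = k)
i <- 3
name_a <- names(as.data.frame(kCounts[, i]))
i <- 4
name_b <- names(as.data.frame(kCounts[, i]))
print(c(name_a, name_b))

# Sketch of a fix: give every new column its own name (V1, V2, ... are
# illustrative) and request a Cartesian product with by = NULL.
all_perm <- data.frame(V1 = counts)
for (i in 2:k) {
  new_col <- setNames(data.frame(counts), paste0("V", i))
  all_perm <- merge(all_perm, new_col, by = NULL)
  # Prune after every step so the intermediate tables stay small.
  all_perm <- all_perm[rowSums(all_perm) <= n, , drop = FALSE]
}
print(dim(all_perm))
```

With this loop every row is a distinct set of counts, so the number of rows should equal `choose(n + k, k)`, the number of non-negative integer k-tuples whose sum is at most n. The `combn()` approach above can return the same row more than once, so calling `unique()` on its result before counting may be worthwhile if distinct rows are needed.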