I have a matrix that has consecutive pairs of values from a sequence.
For example, for a sequence like [1,1,3,3,3,4,4,2,4,2,2], I would have the following pairs stored in a matrix:
1, 1
1, 3
3, 3
3, 3
3, 4
4, 4
4, 2
2, 4
4, 2
2, 2
I want to get the probability of occurrence for each unique pair.
For example, for a pair (a,b), the probability of b following a is cond_prob(b|a) = joint_prob(a,b)/prob(a):
(1,1) 0.5
(1,3) 0.5
(3,3) 0.6
and so on..
Is there any way I can do this in R without having to use many loops, for example with built-in functions or libraries? Could someone help me do this efficiently?
CodePudding user response:
How about this?
d <- c(1,1,3,3,3,4,4,2,4,2,2)
tr <- NULL
for (i in 1:(length(d)-1)) { # all bigrams
tr <- rbind(tr, data.frame(x=d[i], y=d[i + 1]))
}
tbl <- table(tr)
joint_prob <- tbl / rowSums(tbl) # each row divided by its total, i.e. P(y | x), which matches the values in the question
joint_prob[1,1]
# 0.5
joint_prob[1,3]
# 0.5
joint_prob[3,3]
# 0.6666667
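Since the question asks for a way to avoid loops, here is a minimal vectorized sketch of the same idea using only base R. The names `cond_prob`, `pair_prob` and `res` are made up for illustration. It assumes the sequence is available as a plain vector `d`; if you only have the two-column matrix of pairs, passing its two columns to `table()` should work the same way.

```r
d <- c(1, 1, 3, 3, 3, 4, 4, 2, 4, 2, 2)

# Pair each element with its successor:
# head(d, -1) drops the last value, tail(d, -1) drops the first.
tbl <- table(x = head(d, -1), y = tail(d, -1))

# P(y | x): every row is divided by its row total, so each row sums to 1.
cond_prob <- prop.table(tbl, margin = 1)

# P(x, y): every cell is divided by the total number of pairs.
pair_prob <- prop.table(tbl)

# Index by label (as a character string) rather than by position,
# so this keeps working when the values are not 1, 2, 3, ...
cond_prob["3", "3"]
# 0.6666667
pair_prob["3", "3"]
# 0.2

# Long format: one row per pair that actually occurs.
res <- as.data.frame(cond_prob, responseName = "prob")
res <- res[res$prob > 0, ]
```

`cond_prob` reproduces the 0.5, 0.5 and 0.67 values from the question, while `pair_prob` is the joint probability in the strict sense (share of all 10 pairs), so pick whichever definition you actually need.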