I'm working on populating a binary matrix based on values from a different table. I can create the matrix but am struggling with the looping needed to populate it. I think this is a pretty simple issue so I hope I can get some easy help.
Here's an example of my data:
start <- c(291, 291, 291, 702, 630, 768)
sequence <- c("chr9:103869456:103870456", "chr5:30823103:30824103", "chr11:49801703:49802703", "chr4:133865601:133866601", "chr12:55738034:55739034", "chr8:96569493:96570493")
motif <- c("ARI5B", "ARI5B", "ARI5B", "ATOH1", "EGR1", "EGR1")
df <- data.frame(start, sequence, motif)
I have created a character vector of the unique motif/start combinations like so:
x <- sprintf("%s_%d", df$motif, df$start)
x <- unique(x)
Next I create a binary matrix with the sequences as rows and the values from x as columns:
binmat <- matrix(0, nrow = length(df$sequence), ncol = length(x))
rownames(binmat) <- df$sequence
colnames(binmat) <- x
And now I'm stuck. I want to iterate through columns and rows and put a 1 in each position that has a match. For example, the first sequence is "chr9:103869456:103870456" and it has motif "ARI5B" at starting position 291, so it should get a 1 while the rest of the values in that row remain at 0. The output of this example should look like this:
                         ARI5B_291 ATOH1_702 EGR1_630 EGR1_768
chr9:103869456:103870456         1         0        0        0
chr5:30823103:30824103           1         0        0        0
chr11:49801703:49802703          1         0        0        0
chr4:133865601:133866601         0         1        0        0
chr12:55738034:55739034          0         0        1        0
chr8:96569493:96570493           0         0        0        1
But so far I am unsuccessful. I think I need a double for loop somewhere along these lines:
for (row in binmat){
  for (col in binmat){
    if (row && col %in% x){
      1
    } else {
      0
    }
  }
}
But all I get are 0s.
Thanks in advance!
CodePudding user response:
Aren't you just looking for table() here? You can get the result as a vectorized one-liner, without loops, by doing:
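# unique() keeps the original row order and avoids duplicated factor levels if a sequence repeats
table(factor(df$sequence, levels = unique(df$sequence)), sprintf("%s_%d", df$motif, df$start))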
                         ARI5B_291 ATOH1_702 EGR1_630 EGR1_768
chr9:103869456:103870456         1         0        0        0
chr5:30823103:30824103           1         0        0        0
chr11:49801703:49802703          1         0        0        0
chr4:133865601:133866601         0         1        0        0
chr12:55738034:55739034          0         0        1        0
chr8:96569493:96570493           0         0        0        1
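As for why the loop in the question only gives 0s: for (row in binmat) walks over the matrix's values (all 0), not over row names or indices, and the bare 1 / 0 inside the if are evaluated and then thrown away, so nothing is ever assigned into binmat. If you would rather keep filling your own matrix than use table(), here is a minimal sketch of two ways to do it. It assumes the example data from the question; the names key, seqs, cols and binmat2 are just illustrative, not anything from your code.

```r
# Example data from the question (stringsAsFactors = FALSE so the columns are
# plain character vectors on any R version)
df <- data.frame(
  start    = c(291, 291, 291, 702, 630, 768),
  sequence = c("chr9:103869456:103870456", "chr5:30823103:30824103",
               "chr11:49801703:49802703", "chr4:133865601:133866601",
               "chr12:55738034:55739034", "chr8:96569493:96570493"),
  motif    = c("ARI5B", "ARI5B", "ARI5B", "ATOH1", "EGR1", "EGR1"),
  stringsAsFactors = FALSE
)

# One motif_start key per row of df (illustrative name)
key  <- sprintf("%s_%d", df$motif, df$start)
seqs <- unique(df$sequence)   # row labels
cols <- unique(key)           # column labels

# Option 1: a single explicit loop over the rows of df, assigning by name
binmat <- matrix(0, nrow = length(seqs), ncol = length(cols),
                 dimnames = list(seqs, cols))
for (i in seq_len(nrow(df))) {
  binmat[df$sequence[i], key[i]] <- 1
}

# Option 2: no loop, index with a two-column (row index, column index) matrix
binmat2 <- matrix(0, nrow = length(seqs), ncol = length(cols),
                  dimnames = list(seqs, cols))
binmat2[cbind(match(df$sequence, seqs), match(key, cols))] <- 1
```

Both versions set a cell to 1 once per matching row of df, so a sequence/motif pair that occurs several times still ends up as 1 rather than a count, which is one difference from table().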