Suppose I have some (graph edge) data that can be generated like this:
data.gen = function(g,n){
m = (length(g)-1)*n
gm = rep(g,each=m)
x = data.frame(g=gm,v1=paste0(gm,1,seq(m)),v2=paste0(gm,2,seq(m)))
}
e.g. data.gen(c('a','b','c'),1)
produces:
g v1 v2
a a11 a21
a a12 a22
b b11 b21
b b12 b22
c c11 c21
c c12 c22
I want to redistribute (reorder) the data in column v2
so that all values from each group g
end up in a row corresponding to a different group. For context, I'm rewiring a random number of edges between disjoint subgraphs (separate "islands" of mutually connected nodes). In the example above, one solution would be:
g v1 v2
a a11 b21
a a12 c21
b b11 a21
b b12 c22
c c11 a22
c c12 b22
The interpretation of n
is the number of edges connecting each pair of subgraphs. How can I code this up for arbitrary g
,n
?
CodePudding user response:
Please confirm: Two v2 values from one original group should never be moved to the same new group? This will only be possible if no group contain more rows than one less than the number of groups
LOOP over groups
IF Group contain more rows than one less than the number of groups
STOP no solution
ig = -1 // index of group being rewired
LOOP over groups
ig
LOOP over rows in group
ng = -1 // index of destination group
LOOP over groups
ng
IF ng == ig
CONTINUE
Move v2 in row to group ng
BREAK
CodePudding user response:
Thanks for the pseudocode @ravenspoint. I am including the R
code solution here:
data.gen = function(g,n){ # data generating
m = (length(g)-1)*n
gm = rep(g,each=m)
x = data.frame(g=gm,v1=paste0(gm,1,seq(m)),v2=paste0(gm,2,seq(m)))
}
data.rewire = function(x,n){ # solution
# could technically compute n from nrow(x)
v2.old = lapply(split(x,x$g),function(xg){ as.character(xg$v2) })
v2.new = lapply(v2.old,function(.){ NULL })
for (g.new in names(v2.new)){
for (g.old in names(v2.old)){
if (g.new != g.old){
v2.new[[g.new]] = c(v2.new[[g.new]],v2.old[[g.old]][seq(n)])
v2.old[[g.old]] = v2.old[[g.old]][-seq(n)]
}
}
}
x$v2 = do.call(c,v2.new)
return(x)
}
# testing
g = c('a','b','c')
n = 1
x.old = data.gen(g,n)
x.new = data.rewire(x.old,n)
# print(x.old)
print(x.new)
yields:
g v1 v2
1 a a11 b21
2 a a12 c21
3 b b11 a21
4 b b12 c22
5 c c11 a22
6 c c12 b22
Note: this will use the edges in order they appear, e.g. first n
element(s) of group b
are moved to group a
, etc. To get a random order, you can simply add sample()
around the as.character
.