I have an NxR random matrix in which each row is a random permutation of 1:R. I need another matrix whose rows are shuffled differently but whose columns have the same distribution of 1 to R. In other words, is there any way to shuffle the rows of a matrix while keeping the frequency of values in each column the same?
I attached a screenshot of matrix B, which I generated manually based on matrix A.
Both have 1 to 4 in each row and the distribution of columns of A matches with columns of B. For example, in the first column of both matrices, there are two 1s, four 2s, one 3, and three 4s.
Is there any way to write an algorithm to generate the matrix B for larger dimensions?
The following is my code in which I tried my luck hoping I could get the solution from randomness, but I was unable to.
clear
n = 20;
r = 8;
A = cell2mat(arrayfun(@randperm, repmat(r, 1, n), 'UniformOutput', false)'); % a random mtx
edges = 0.5:1:r+0.5; % one bin per integer 1..r, even if a column lacks some value
% Frequency of each column
for i = 1:r
    F_A(:,i) = histcounts(A(:,i), edges);
end
k = 0;
while true
    B = A;
    for i = 1:n
        idx = randperm(r);
        B(i, idx) = A(i,:);
    end
    for i = 1:r
        F_B(:,i) = histcounts(B(:,i), edges); % Frequency of each column
    end
    k = k + 1;
    if sum(F_A == F_B, 'all') == r^2 || k > 100000 % stop if frequencies of A and B match
        break
    end
end
sum(F_A == F_B, 'all')
sum(A == B, 'all')
CodePudding user response:
Here is an O(N^2*R/2) time algorithm to solve this problem. It might sacrifice some randomness due to the nature of the constraints, but you can apply multiple iterations of the algorithm if you want a higher degree of randomness. Its run time is much better than that of the purely random approach, which can take exponential time and has no guarantee of convergence.
The steps are simple:
1. Initialize the B matrix as a row-shuffled version of A.
2. Walk through all elements of B, scanning row by row:
   - pick a random element (i,k) in the same row as the current (i,j) element
   - if the same (i,j) element is picked, skip to the next (i,j+1) element
   - search down in the jth and kth columns for the reversed pair (l,k) and (l,j)
   - now, we can interchange (i,j), (i,k) and (l,j), (l,k) without unbalancing the column frequency
3. Repeat step 2 if more randomness is required.
So, the idea is to randomly interchange two elements in the same row if there exists an exact reversed pair in another row below. The first swap introduces an imbalance in two columns, and the reversed swap in the row below cancels it, so the net change in column frequency is zero.
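To see why a reversed pair keeps the columns balanced, here is a minimal hand-made illustration. The 2x4 matrix and the choice j = 1, k = 3 are made up for this example and are not taken from the sample output. It relies on the fact that two matrices have the same column frequencies exactly when their column-sorted versions are equal.

```matlab
% Hypothetical 2x4 example: rows 1 and 2 hold a reversed pair in columns j and k
B = [1 2 3 4;
     3 4 1 2];
j = 1; k = 3;                 % row 1 has (1,3) there, row 2 has (3,1)
before = sort(B, 1);          % sorted columns encode the column frequencies

B(1,[j k]) = B(1,[k j]);      % swap in the current row
B(2,[j k]) = B(2,[k j]);      % undo the imbalance in the row below

disp(B)
% Column frequencies are unchanged by the double swap
assert(isequal(sort(B, 1), before))
```

Each row is still a permutation of 1:4, because the swap only exchanges two entries within a row.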
clc, clear
rng(123) % for reproducible outputs
N = 10; R = 4;
A = zeros(N,R);
for i = 1:N
    A(i,:) = randperm(R);
end
% Initialize B as shuffled rows of A
B = A(randperm(N),:);
for i = 1:N
    for j = 1:R
        k = randi(R); % pick random element in the row
        if j == k, continue, end % if same element, skip
        for l = i+1:N % search down for the 2 reverse elements
            if isequal(B(l,[j k]), B(i,[k j])) % if found
                B(l,[j k]) = B(l,[k j]); % replace the reverse elements
                B(i,[j k]) = B(i,[k j]); % replace the k, j elements
                break
            end
        end
    end
end
A sample output of 1 iteration:
A =
3 2 4 1
2 4 1 3
3 2 1 4
2 3 1 4
2 1 3 4
4 1 3 2
4 2 3 1
3 1 4 2
4 3 1 2
1 4 2 3
B =
2 3 4 1
1 4 2 3
3 2 1 4
4 1 3 2
4 2 3 1
2 3 1 4
4 1 3 2
3 2 1 4
2 1 4 3
3 4 1 2
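If more randomness is wanted, the pass can be repeated and the result checked afterwards. The following is only a sketch under some assumptions: swapPass and nPass are names introduced here for illustration, the number of passes is an arbitrary choice, and local functions at the end of a script need MATLAB R2016b or later.

```matlab
rng(123)                       % arbitrary seed, for reproducibility only
N = 10; R = 4;
nPass = 5;                     % assumed number of passes, tune as needed

A = zeros(N,R);
for i = 1:N
    A(i,:) = randperm(R);
end

B = A(randperm(N),:);          % start from row-shuffled A
for p = 1:nPass
    B = swapPass(B);           % repeat the swap pass
end

% Column frequencies match exactly when the column-sorted matrices are equal
assert(isequal(sort(A,1), sort(B,1)))
% Every row must still be a permutation of 1:R
assert(isequal(sort(B,2), repmat(1:R, N, 1)))

function B = swapPass(B)
% One pass of the reversed-pair swap over every element of B (hypothetical helper)
[N, R] = size(B);
for i = 1:N
    for j = 1:R
        k = randi(R);                              % random partner column
        if j == k, continue, end
        for l = i+1:N                              % look for a reversed pair below
            if isequal(B(l,[j k]), B(i,[k j]))
                B([i l],[j k]) = B([i l],[k j]);   % swap in both rows at once
                break
            end
        end
    end
end
end
```

The two assertions do not depend on the random draws: every swap preserves both the column frequencies and the row permutations, so they should hold for any seed and any number of passes.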
CodePudding user response:
There is a MATLAB function called shuffle, or you can take a look at this question.