Extract non-repeated numbers from a list based on a Gaussian distribution in MATLAB

Time:12-09

I have an array of numbers. Let's say

nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

From this, I want to randomly pick six numbers according to a Gaussian distribution, so that the probability of picking 5 or 6 is higher than the probability of picking 1 or 10. In addition, all six numbers I pick must be unique. For example,

1, 4, 5, 7, 8, 10 is an acceptable output.

I want to do this in MATLAB, and I am a total newbie in MATLAB. I was hoping someone could help me with this.

CodePudding user response:

One possible solution is to generate a sufficiently large number of Gaussian samples (for example, 100), round them to the nearest integer, and use unique with the 'stable' option to extract the first 6 non-repeated values:

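data = 1:10;
mn = min(data);
mx = max(data);
m = mean(data);
s = std(data);

% draw 100 Gaussian samples centred on the mean of data
random_data = m + randn(1, 100) * s;
% keep only samples that round to a value between mn and mx, then round them
random_data = round(random_data(random_data > mn-0.5 & random_data < mx+0.5));
% keep the first occurrence of each value, in the order it was drawn
u = unique(random_data, 'stable');
result = u(1:6);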
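
The snippet above assumes that 100 draws contain at least six distinct values. That is very likely here, but not guaranteed, and u(1:6) would error otherwise. A minimal sketch of a variant that keeps drawing batches until enough distinct values have been collected might look like this (the names nSamples and candidates are illustrative and not part of the original answer):

```matlab
data = 1:10;
nSamples = 6;          % assumed number of unique values wanted
m = mean(data);
s = std(data);

result = [];
while numel(result) < nSamples
    % draw a batch of Gaussian candidates and round to the nearest integer
    candidates = round(m + randn(1, 100) * s);
    % discard candidates that are not elements of data
    candidates = candidates(ismember(candidates, data));
    % append and keep first occurrences only, preserving draw order
    result = unique([result, candidates], 'stable');
end
result = result(1:nSamples);
disp(result);
```

Because the loop only stops once at least nSamples distinct values have been seen, the final indexing cannot run past the end of the vector.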

CodePudding user response:

The essence of your question is: how do I make a weighted random permutation, i.e., sample without replacement? The difficult part is only the "without replacement" part, in my opinion.

In the approach below, I convert the nums values into probabilities with the normpdf() function, and then into integer counts: each probability is divided by the smallest probability, multiplied by 100, and rounded up. The scale factor of 100 keeps the ratios between the counts close to the ratios between the probabilities despite the rounding. Note: if sampling with replacement were the goal, we could simply pass these probabilities as the weight argument of the randsample() function.
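
For reference, the with-replacement case mentioned in the note could be sketched as follows. This assumes the Statistics and Machine Learning Toolbox is available for normpdf() and randsample(); the variable name sampled is illustrative. The result may contain repeated values, so it does not satisfy the uniqueness requirement of the question:

```matlab
nums = 1:10;

% Gaussian weights for each value in nums
probs = normpdf(nums, mean(nums), std(nums));

% draw 6 values WITH replacement, weighted by probs (repeats are possible)
sampled = randsample(nums, 6, true, probs);
disp(sampled);
```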

To sample without replacement, I then expand the nums vector with the repelem() function, repeating each value as many times as its count. In this expanded array, values with a higher probability appear more often, so a uniform draw from it is a weighted draw from nums. I then sample from the array one value at a time, removing every copy of the sampled value after each draw, until the required number of samples is reached.

I hope this helps!

% create data array
nums = 1:10;

% create normal probability density from nums
probs = normpdf(nums,mean(nums),std(nums));

% convert probabilities into frequencies/counts (smallest count is 100)
counts = ceil(probs./min(probs).*100);
% expand nums by counts
numsExpanded = repelem(nums,counts);

% shuffle for extra-randomness
numsExpanded(randperm(sum(counts))) = numsExpanded;

% initialize sampling parameters
nSamples = 6;
sampleValues = [];

while numel(sampleValues) < nSamples
  % draw one value uniformly from the expanded array
  sampleValues(end + 1) = randsample(numsExpanded,1);
  % remove all copies of the sampled value to prevent replacement
  numsExpanded(numsExpanded == sampleValues(end)) = [];
end

disp(sampleValues);
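
The same idea (draw one value in proportion to its weight, remove it, repeat) can also be sketched without expanding the array, by working on the weights directly. The version below is only an illustration under stated assumptions: it uses base MATLAB functions, computes unnormalised Gaussian weights by hand instead of calling normpdf(), and the names w, pool and idx are not from the original answer:

```matlab
nums = 1:10;
nSamples = 6;

% unnormalised Gaussian weights centred on the mean of nums
mu = mean(nums);
sigma = std(nums);
w = exp(-(nums - mu).^2 ./ (2 * sigma^2));

pool = nums;                          % values still available
sampleValues = zeros(1, nSamples);
for k = 1:nSamples
    % running total of the weights of the remaining values
    c = cumsum(w);
    % inverse-CDF draw: first index whose cumulative weight reaches the threshold
    idx = find(rand() * c(end) <= c, 1, 'first');
    sampleValues(k) = pool(idx);
    % drop the chosen value and its weight so it cannot be drawn again
    pool(idx) = [];
    w(idx) = [];
end

disp(sampleValues);
```

If the Statistics and Machine Learning Toolbox is available, datasample() with the 'Replace' and 'Weights' options may offer a one-line alternative; its documentation describes how weights are handled when sampling without replacement.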