I have a question about multivariate kernel density estimation in MATLAB, which I am using for the first time.
I have 3-dimensional sample data (x, y, z coordinates) and want to find the probability of falling inside a certain volume using kernel density estimation. I used the mvksdensity function in MATLAB and got the probability density (the estimated function values) at the points I specified.
What I originally wanted to do was to take the triple integral of the multivariate density function over a given volume, if I could find that function. But mvksdensity only returns the density estimates at the requested points and does not return the function itself. I thought there would be an easy way to compute the probability from the density values, but I’m stuck. Does anyone have any useful information on this? Thanks in advance.
I thought about using the fitdist function to fit the distribution, but it only supports univariate kernel distributions.
I also tried mvncdf, which returns the cdf of the multivariate normal distribution evaluated at each row of the input, given a mean and a covariance. But then I would have to calculate the probability of the given volume under the normal distribution centered at each data point and add the results up, which would be inefficient for a large amount of data, and I don't know whether this is a correct approach.
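For reference, the per-point sum described above can be sketched as follows. This is only a sketch under stated assumptions: the kernel is Gaussian with one bandwidth per dimension (a product kernel), and the volume is an axis-aligned box. Under those assumptions the contribution of each data point factors into one-dimensional normal cdf differences, so the cost is linear in the number of data points. The sketch is written in plain Python rather than MATLAB, and the names `norm_cdf` and `kde_box_probability` are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_box_probability(data, bandwidth, lower, upper):
    """Probability mass of a Gaussian product-kernel KDE inside a box.

    data      : list of d-dimensional points (the sample)
    bandwidth : one bandwidth per dimension (assumed diagonal)
    lower     : lower corner of the axis-aligned box
    upper     : upper corner of the axis-aligned box
    """
    total = 0.0
    for point in data:
        # Mass that the kernel centered at this point places inside the box:
        # a product of 1-D normal CDF differences, one factor per dimension.
        mass = 1.0
        for x, h, lo, hi in zip(point, bandwidth, lower, upper):
            mass *= norm_cdf((hi - x) / h) - norm_cdf((lo - x) / h)
        total += mass
    # The KDE weights every kernel equally, so average the per-point masses.
    return total / len(data)


# Illustrative data only; bandwidths and box corners are made up.
sample = [(0.1, 0.2, -0.3), (1.0, 0.5, 0.4), (-0.7, 0.0, 0.9)]
p = kde_box_probability(sample, (0.5, 0.5, 0.5), (-1, -1, -1), (1, 1, 1))
print(p)
```

For a volume that is not a box, the per-point mass no longer factors this way and a numerical method is needed instead.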
Answer:
I can suggest the following Monte Carlo approach. Find a master volume that contains essentially the entire mass of the estimated probability density; for the sake of efficiency it should be as small as possible. Then generate a large number of test points in the master volume, either on a regular grid or randomly from a uniform distribution, and evaluate the estimated density at each of them. The probability content of a specific volume V can then be estimated as the sum of the density values at the test points inside V divided by the sum of the density values at all test points. I am afraid, however, that in 3D you would need at least 1E6 test points, probably more. If you give me access to your sample, I would be pleased to try out my suggestion. It should also be fairly easy to work out an estimate of the standard error of the estimated probability content of V.
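A minimal sketch of this ratio estimate, written in plain Python rather than MATLAB, might look like the following. It assumes a Gaussian product kernel with one bandwidth per dimension and a box-shaped master volume. The function names and parameters (`kde_density`, `mc_probability`, `in_volume`, `n_test`) are hypothetical. In MATLAB, the density values would instead come from a call such as `mvksdensity(x, pts, 'Bandwidth', bw)`.

```python
import math
import random


def kde_density(q, data, bandwidth):
    """Gaussian product-kernel density estimate at the query point q."""
    norm = 1.0
    for h in bandwidth:
        norm *= h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for point in data:
        expo = 0.0
        for qi, xi, h in zip(q, point, bandwidth):
            u = (qi - xi) / h
            expo += u * u
        total += math.exp(-0.5 * expo)
    return total / (len(data) * norm)


def mc_probability(data, bandwidth, master_lo, master_hi, in_volume,
                   n_test=20000, seed=0):
    """Estimate P(V) as (sum of densities in V) / (sum of all densities).

    master_lo, master_hi : corners of the box-shaped master volume
    in_volume            : function returning True if a point lies in V
    """
    rng = random.Random(seed)
    sum_all = 0.0
    sum_in = 0.0
    for _ in range(n_test):
        # Uniform random test point inside the master volume.
        q = [rng.uniform(lo, hi) for lo, hi in zip(master_lo, master_hi)]
        f = kde_density(q, data, bandwidth)
        sum_all += f
        if in_volume(q):
            sum_in += f
    return sum_in / sum_all


# Illustrative data only: V is the unit ball around the origin.
sample = [(0.1, 0.2, -0.3), (1.0, 0.5, 0.4), (-0.7, 0.0, 0.9)]
inside_ball = lambda q: sum(c * c for c in q) <= 1.0
p = mc_probability(sample, (0.5, 0.5, 0.5), (-4, -4, -4), (4, 4, 4), inside_ball)
print(p)
```

Because V is passed as a membership test, the same sketch works for volumes of any shape. The cost grows with the number of test points times the number of data points, so a vectorized density evaluation is preferable for large samples.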