If the latent representation of my variational autoencoder (VAE) is r-dimensional and my dataset is x, does the VAE's latent representation follow a normal distribution based on r or on x?
- If r = 10, does that mean there are 10 means and variances (a multivariate Gaussian), and the distribution comes from the whole dataset x?
- Or does r = 10 construct one distribution based on r, and every sample tries to follow this distribution?
I'm confused about which one is correct.
CodePudding user response:
A VAE constructs a mapping e(x) -> Z (encoder) and d(z) -> X (decoder). This means that every element of your input space x is mapped through the encoder e(x) onto a single r-dimensional Gaussian. It is not a "mixture"; it is just a single Gaussian with a diagonal covariance matrix.
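To make that concrete, here is a minimal sketch (not from the answer above) of an encoder that maps each input to the parameters of a single r-dimensional diagonal Gaussian; the use of PyTorch, the layer sizes, and the input dimension are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps each input x to the parameters (mu, log_var) of a single
    r-dimensional Gaussian with diagonal covariance."""
    def __init__(self, input_dim=784, hidden_dim=256, r=10):
        super().__init__()
        self.hidden = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, r)        # r means, one per latent dimension
        self.log_var = nn.Linear(hidden_dim, r)   # r log-variances (diagonal covariance)

    def forward(self, x):
        h = torch.relu(self.hidden(x))
        return self.mu(h), self.log_var(h)

# Each sample in the batch gets its own mu and log_var of size r = 10:
encoder = Encoder()
x = torch.randn(32, 784)          # a batch of 32 inputs (shapes are illustrative)
mu, log_var = encoder(x)          # both have shape (32, 10)
```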
CodePudding user response:
I'll add my 2 cents to @lejlot's answer.
The encoder in a VAE maps each sample to a distribution, which in your case has 10 dimensions. That distribution is a way of saying "my best estimate of this property of this sample is mu, but I'm not entirely sure, so consider that it might vary with variance sigma".
Therefore, you have a distribution for each sample.
However, in order to make sampling easier in a VAE, we ask it to keep these distributions as close as possible to a known one, namely the standard normal distribution (this way we know "where the distributions are located"; if you inspect the latent space of a plain autoencoder, you will see groups that are far away from each other).
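As a rough illustration of that regularization, continuing the encoder sketch above (again an assumed implementation, not part of the original answer): the reparameterization trick draws a sample from each per-sample Gaussian, and the closed-form KL divergence term pulls every one of those Gaussians toward N(0, I).

```python
import torch

def reparameterize(mu, log_var):
    """Draw z ~ N(mu, sigma^2) in a differentiable way (reparameterization trick)."""
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + eps * std

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence between N(mu, diag(sigma^2)) and N(0, I),
    which keeps every per-sample distribution close to the standard normal."""
    return -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp(), dim=1)

# mu, log_var come from the encoder sketch above, shape (32, 10)
z = reparameterize(mu, log_var)                 # one latent sample per input
kl = kl_to_standard_normal(mu, log_var).mean()  # added to the reconstruction loss
```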