Deep Bayes: Discrete Latent Variables
Introduction:
These notes cover some discrete latent variable models. Please credit the source when reposting.
Reference: Deep Bayes
Motivation
- Discrete categories are easier to interpret than a continuous spectrum
example: discrete variational autoencoder
- Allow the model to make a discrete choice
example: hard attention
An attention module generates a binary mask of where to look.
The network classifies the masked images.
We want the attention module to attend only to the important areas of the image.
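REINFORCE Estimator
The REINFORCE (score-function) estimator approximates the gradient of an expectation over discrete samples z with an average over M Monte Carlo samples.
However, this estimator typically has large variance.
It requires sophisticated variance reduction methods.
Just taking a bigger M gives only a modest improvement, since the standard deviation of the estimate shrinks only as 1/sqrt(M).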
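To make the variance issue concrete, here is a minimal sketch of the score-function estimator on a toy problem. The setup is an assumption for illustration and is not taken from the original notes: q(z) is a categorical distribution with probabilities softmax(logits), f is a fixed table of values f_values, and the gradient is taken with respect to the logits. For this toy case the exact gradient is available in closed form, so the estimator can be checked against it.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def exact_gradient(logits, f_values):
    # d/d logits of sum_k p_k f_k, where p = softmax(logits).
    p = softmax(logits)
    return p * (f_values - p @ f_values)

def reinforce_gradient(logits, f_values, num_samples, rng):
    # Average of f(z_m) * grad log q(z_m) over M = num_samples draws.
    p = softmax(logits)
    z = rng.choice(len(p), size=num_samples, p=p)
    score = np.eye(len(p))[z] - p          # grad of log q(z_m) w.r.t. logits
    return (f_values[z][:, None] * score).mean(axis=0)

rng = np.random.default_rng(0)
logits = np.array([0.5, -0.2, 0.1])        # toy parameters (assumed)
f_values = np.array([1.0, 3.0, -2.0])      # toy objective values (assumed)
exact = exact_gradient(logits, f_values)

# Total variance of the estimate for several values of M.
variances = {}
for M in (1, 10, 100):
    estimates = np.stack(
        [reinforce_gradient(logits, f_values, M, rng) for _ in range(2000)]
    )
    variances[M] = estimates.var(axis=0).sum()
    print(M, variances[M])
```

The printed variance falls roughly in proportion to 1/M, so cutting the error by a factor of 10 costs about 100 times more samples.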
Idea: Relax the objective over discrete random samples z into an objective over continuous random samples during training, and use the reparametrization trick.
Gumbel-Max trick
Some background on the Gumbel distribution:
https://qinqianshan.com/math/probability_distribution/gumbel-distribution/
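As a rough illustration, the Gumbel-Max trick draws a categorical sample by adding independent Gumbel(0, 1) noise to the logits and taking the argmax. Replacing the argmax with a softmax at temperature tau gives a continuous relaxation, commonly called Gumbel-Softmax or Concrete. Using that relaxation here is an assumption, since the notes do not name it. The function names and toy logits below are hypothetical.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def sample_gumbel(shape, rng):
    # Gumbel(0, 1) noise via inverse CDF; the lower bound avoids log(0).
    u = rng.uniform(low=1e-12, high=1.0, size=shape)
    return -np.log(-np.log(u))

def gumbel_max_sample(logits, num_samples, rng):
    # Discrete samples: argmax_k (logits_k + g_k) ~ Categorical(softmax(logits)).
    g = sample_gumbel((num_samples, len(logits)), rng)
    return np.argmax(logits + g, axis=1)

def gumbel_softmax_sample(logits, temperature, num_samples, rng):
    # Relaxed samples: softmax((logits + g) / temperature), rows lie on the simplex.
    g = sample_gumbel((num_samples, len(logits)), rng)
    y = (logits + g) / temperature
    y = y - y.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
logits = np.array([1.0, 0.0, -1.0])        # toy logits (assumed)

# Empirical frequencies of Gumbel-Max samples vs. the target probabilities.
z = gumbel_max_sample(logits, 100_000, rng)
freq = np.bincount(z, minlength=len(logits)) / len(z)
print(freq, softmax(logits))

# Lower temperature pushes relaxed samples towards one-hot vectors.
cold = gumbel_softmax_sample(logits, 0.1, 10_000, rng)
warm = gumbel_softmax_sample(logits, 5.0, 10_000, rng)
print(cold.max(axis=1).mean(), warm.max(axis=1).mean())
```

Because the noise does not depend on the logits, the relaxed sample is a deterministic, differentiable function of the logits. That is what allows the reparametrization trick to be applied.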
Variance Reduction
Control Variates
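Consider some function b(z) with tractable expectation E b(z). Then E f(z) = E[f(z) - b(z)] + E b(z), so the first term can be estimated by sampling and the second computed analytically.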
Simple Baselines:
Constant baseline
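A minimal sketch of a constant baseline, under the same toy assumptions as before (categorical q(z) = softmax(logits), tabulated f, gradient with respect to the logits). Subtracting a constant b from f(z) leaves the score-function estimator unbiased, because the expected score E[grad log q(z)] is zero. It can still reduce variance substantially when f has a large offset. Here b is set to the exact mean of f, which is only possible because the example is tiny. In practice a running average of observed f(z) is a common stand-in.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def reinforce_terms(logits, f_values, num_samples, rng, baseline=0.0):
    # Per-sample terms (f(z_m) - baseline) * grad log q(z_m).
    p = softmax(logits)
    z = rng.choice(len(p), size=num_samples, p=p)
    score = np.eye(len(p))[z] - p
    return (f_values[z] - baseline)[:, None] * score

rng = np.random.default_rng(0)
logits = np.array([0.5, -0.2, 0.1])        # toy parameters (assumed)
f_values = np.array([11.0, 13.0, 8.0])     # large offset makes the plain estimator noisy
p = softmax(logits)
exact = p * (f_values - p @ f_values)      # closed-form gradient for this toy case

plain = reinforce_terms(logits, f_values, 50_000, rng)
b = float(p @ f_values)                    # constant baseline: mean of f under q
with_baseline = reinforce_terms(logits, f_values, 50_000, rng, baseline=b)

var_plain = plain.var(axis=0).sum()
var_baseline = with_baseline.var(axis=0).sum()
print(var_plain, var_baseline)
print(plain.mean(axis=0), with_baseline.mean(axis=0), exact)
```

Both sample means agree with the exact gradient up to Monte Carlo error, while the per-sample variance with the baseline is much smaller.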
Variance Minimization:
Gumbel-Relaxed Baselines: