Abstract: Density estimation is one of the fundamental problems in both statistics and machine learning. In this study, we propose Roundtrip, a computational framework for general-purpose density estimation based on deep generative neural networks. Roundtrip retains the generative power of deep generative models, such as generative adversarial networks (GANs), while also providing estimates of density values, thus supporting both data generation and density estimation. Unlike previous neural density estimators that put stringent conditions on the transformation from the latent space to the data space, Roundtrip enables the use of much more general mappings, where the target density is modeled by learning a manifold induced from a base density (e.g., a Gaussian distribution). Roundtrip provides a statistical framework for GAN models in which an explicit evaluation of density values is feasible. In numerical experiments, Roundtrip exceeds state-of-the-art performance in a diverse range of density estimation tasks.

Let $p_x(x)$ be a density on a $d$-dimensional Euclidean space $\mathbb{R}^d$. The task of density estimation is to estimate $p_x(x)$ based on a set of independently and identically distributed data points $\{x_i\}_{i=1}^{n}$ drawn from this density.

Traditional density estimators such as histograms (1, 2) and kernel density estimators (KDEs) (3, 4) typically perform well only in low dimensions. Recently, neural network-based approaches have been proposed for density estimation and have yielded promising results in problems with high-dimensional data points such as images. There are mainly two families of such neural density estimators: autoregressive models (5–7) and normalizing flows (8–11). Autoregression-based neural density estimators decompose the density into a product of conditional densities based on the probability chain rule, $p(x) = \prod_{i=1}^{d} p(x_i \mid x_{1:i-1})$. Each conditional probability is modeled by a parametric density (e.g., a Gaussian or a mixture of Gaussians), whose parameters are learned by neural networks. Density estimators based on normalizing flows represent $x$ as an invertible transformation of a latent variable $z$ with known density, where the invertible transformation is a composition of a series of simple functions whose Jacobians are easy to compute. The parameters of these component functions are then learned by neural networks.

As suggested in ref. 12, both of these are special cases of the following general framework. Given a differentiable and invertible mapping $G: \mathbb{R}^d \to \mathbb{R}^d$ and a base density $p_z(z)$, the density of $x = G(z)$ can be represented using the change of variable rule as follows:
$$p_x(x) = p_z(z)\left|\det\left(\frac{\partial G(z)}{\partial z}\right)\right|^{-1}, \tag{1}$$
where $\partial G(z)/\partial z$ is the Jacobian matrix of the function $G$ at the point $z$. Density estimation at $x$ can be solved if the base density $p_z(z)$ is known and the determinant of the Jacobian matrix is feasible to calculate. To achieve this, previous neural density estimators have to impose heavy constraints on the model architecture. For example, refs. 7, 10, and 12 require the Jacobian to be triangular; ref. 13 constructed low-rank perturbations of a diagonal matrix as the Jacobian; and ref. 14 proposed a circular convolution where the Jacobian is a circulant matrix. These strong constraints diminish the expressiveness of neural networks, which may lead to poor performance. For example, autoregressive neural density estimators, which are based on learning the conditionals $p(x_i \mid x_{1:i-1})$, are inherently sensitive to the order of the features. Moreover, the change of variable rule is not applicable when the dimension of the base density differs from that of the target density.
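As a concrete illustration of Eq. 1 (not part of the paper's model), the following minimal sketch evaluates a density under an assumed invertible affine map $G(z) = Az + b$ with a standard Gaussian base density; the names `G`, `A`, and `b` are hypothetical placeholders chosen so that the inverse and the Jacobian determinant are available in closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2
A = np.array([[2.0, 0.5],
              [0.0, 1.5]])        # invertible d x d matrix: the Jacobian of G
b = np.array([1.0, -1.0])

def log_density_x(x):
    """Eq. 1: log p_x(x) = log p_z(G^{-1}(x)) - log|det(dG/dz)|, p_z standard normal."""
    z = np.linalg.solve(A, x - b)                      # invert G: z = A^{-1}(x - b)
    log_pz = -0.5 * (z @ z + d * np.log(2 * np.pi))    # standard Gaussian log density
    _, log_det = np.linalg.slogdet(A)                  # Jacobian of the affine map is A
    return log_pz - log_det

x = A @ rng.standard_normal(d) + b                     # a draw from p_x
print(log_density_x(x))
```

Note that this computation requires $G$ to be square and invertible: it breaks down as soon as the latent dimension differs from the data dimension.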
However, experience with deep generative models [e.g., GAN (15) and VAE (16)] suggests that it is often desirable to use a latent space of smaller dimension than the data space.

To overcome the limitations above, we propose a neural density estimator called Roundtrip. Our approach is motivated by recent advances in deep generative neural networks (15, 17, 18). Roundtrip differs from previous neural density estimators in two ways. 1) It allows the direct use of a deep generative network to model the transformation from the latent space to the data space, whereas previous neural density estimators use neural networks only to learn the parameters of the component functions that build up an invertible transformation. 2) It can efficiently model data densities that are concentrated near learned manifolds, which is difficult for previous approaches because they require the latent space to have the same dimension as the data space. Importantly, we also provide methods, based on either importance sampling or Laplace approximation, for the pointwise evaluation of the density estimate. We summarize our major contributions in this study as follows: 1) We propose a general-purpose neural density estimator based on deep generative models, which requires less restrictive model assumptions than previous neural density estimators. 2) We show that the principle underlying previous neural density estimators can be regarded as a special case of our Roundtrip framework. 3) We demonstrate state-of-the-art performance of the Roundtrip model through a series of experiments, including density estimation tasks in simulations as well as in real data applications ranging from image generation to outlier detection.
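To give a rough sense of the Monte Carlo evaluation alluded to above (the paper's actual importance sampling and Laplace approximation estimators are developed later), the sketch below assumes a Gaussian observation model around the generator output, $x = G(z) + \epsilon$ with $\epsilon \sim N(0, \sigma^2 I)$, so that $p_x(x) = \mathbb{E}_{z \sim p_z}[N(x; G(z), \sigma^2 I)]$. Here `G`, `sigma`, and the latent dimension `m` are illustrative stand-ins, and the base density itself serves as a naive proposal.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, sigma = 2, 3, 0.1          # latent dim m < data dim d; values are illustrative

def G(z):
    """Stand-in generator R^m -> R^d (a trained network in the actual model)."""
    W = np.array([[1.0, 0.0],
                  [0.5, 1.0],
                  [-0.3, 0.8]])
    return np.tanh(W @ z)

def density_estimate(x, n_samples=10_000):
    """Monte Carlo estimate of p_x(x) = E_{z~p_z}[ N(x; G(z), sigma^2 I) ]."""
    zs = rng.standard_normal((n_samples, m))     # proposal = base density p_z
    gx = np.array([G(z) for z in zs])            # G(z_i), shape (n_samples, d)
    log_lik = (-0.5 * ((gx - x) ** 2).sum(axis=1) / sigma**2
               - 0.5 * d * np.log(2 * np.pi * sigma**2))
    return np.exp(log_lik).mean()                # average of Gaussian likelihoods

x = G(rng.standard_normal(m)) + sigma * rng.standard_normal(d)
print(density_estimate(x))
```

Sampling from the base density is only workable in low dimension; a proposal concentrated near the preimage of $x$, or a Laplace approximation around it, reduces the variance of this estimate dramatically, which is precisely what the methods referenced above provide.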