IEEE Transactions on Neural Networks and Learning Systems

Publication date: 2020-08-01
Volume: 31, Pages: 2965–2979
Publisher: IEEE (Institute of Electrical and Electronics Engineers)

Authors:

Liu, Fanghui; Huang, Xiaolin; Shi, Lei; Yang, Jie; Suykens, Johan A. K.

Keywords:

Indefinite kernel, kernel approximation, random Fourier features (RFFs), variational inference, SVM, STADIUS-19-171

Abstract:

Random Fourier features (RFFs) have been successfully employed for kernel approximation in large-scale settings. The rationale behind RFFs relies on Bochner's theorem, but its conditions are too strict and exclude many widely used kernels, e.g., dot-product kernels (which violate the shift-invariance condition) and indefinite kernels (which violate the positive definite (PD) condition). In this article, we present a unified RFF framework for indefinite kernel approximation in reproducing kernel Kreĭn spaces (RKKSs). Moreover, our model is also suited to approximating a dot-product kernel on the unit sphere, as such a kernel can be transformed into a shift-invariant but indefinite kernel. By the Kolmogorov decomposition scheme, an indefinite kernel in an RKKS can be decomposed into the difference of two unknown PD kernels. The spectral distribution of each underlying PD kernel can be formulated as a nonparametric Bayesian Gaussian mixture model. Based on this, we propose a double-infinite Gaussian mixture model for RFFs by placing a Dirichlet process prior. It takes full advantage of the flexibility in the number of mixture components and can approximate a wide range of indefinite kernels. For model inference, we develop a non-conjugate variational algorithm with a sub-sampling scheme for posterior inference; it handles the non-conjugacy of our model and remains efficient due to the sub-sampling strategy. Experimental results on several large classification data sets demonstrate the effectiveness of our nonparametric Bayesian model for indefinite kernel approximation compared to other representative random-feature-based methods.
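The core idea, that an indefinite kernel can be approximated by the difference of two PD random-feature Gram matrices, can be illustrated with a minimal sketch. This is not the paper's nonparametric Bayesian model: it assumes the two PD components are known Gaussian kernels with fixed bandwidths `s1`, `s2` and weight `w2` (all hypothetical choices), so the spectral density of each component is a single Gaussian rather than an inferred Dirichlet-process mixture. It only demonstrates the Kolmogorov-decomposition-style approximation k(x, y) ≈ z1(x)·z1(y) − w2·z2(x)·z2(y) with standard RFFs.

```python
import numpy as np

def rff_features(X, W, b):
    """Standard random Fourier feature map z(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

rng = np.random.default_rng(0)
d, D = 5, 2000  # input dimension, number of random features per PD component

# Hypothetical indefinite kernel: difference of two Gaussian (PD) kernels,
# k(x, y) = exp(-||x - y||^2 / (2 s1^2)) - w2 * exp(-||x - y||^2 / (2 s2^2)).
s1, s2, w2 = 1.0, 2.0, 0.5

# By Bochner's theorem, the spectral density of a Gaussian kernel with
# bandwidth s is N(0, I / s^2); sample frequencies for each PD component.
W1 = rng.normal(scale=1.0 / s1, size=(D, d))
W2 = rng.normal(scale=1.0 / s2, size=(D, d))
b1 = rng.uniform(0.0, 2.0 * np.pi, size=D)
b2 = rng.uniform(0.0, 2.0 * np.pi, size=D)

X = rng.normal(size=(10, d))
Z1 = rff_features(X, W1, b1)
Z2 = rff_features(X, W2, b2)

# Indefinite kernel matrix as the difference of two PD Gram matrices.
K_approx = Z1 @ Z1.T - w2 * (Z2 @ Z2.T)

# Exact kernel matrix for comparison.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-sq / (2.0 * s1**2)) - w2 * np.exp(-sq / (2.0 * s2**2))

err = np.abs(K_approx - K_exact).max()
```

In the paper's setting the two spectral densities are unknown and are modeled as infinite Gaussian mixtures learned by variational inference; here they are fixed by construction, so the approximation error shrinks at the usual O(1/sqrt(D)) Monte Carlo rate as D grows.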