Towards Deep Adaptive Hinging Hyperplanes

Tao, Q; Xu, J; Li, Z; Xie, N; Wang, S; Li, X; Suykens, Johan

doi:10.1109/TNNLS.2021.3079113

Author:

Tao, Q ; Xu, J ; Li, Z ; Xie, N ; Wang, S ; Li, X ; Suykens, Johan

Keywords:

Science & Technology; Technology; Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods; Engineering, Electrical & Electronic; Computer Science; Engineering; Neurons; Artificial neural networks; Network topology; Training; Topology; Optimization; Adaptive systems; Adaptive hinging hyperplanes (AHHs); analysis of variance (ANOVA) decomposition; domain partition; piecewise linear (PWL); skip-layer connection; REGRESSION; SELECTION; Neural Networks, Computer; Algorithms; Brain; STADIUS-21-86

Abstract:

The adaptive hinging hyperplane (AHH) model is a popular piecewise linear (PWL) representation with a generalized tree structure and has been successfully applied in dynamic system identification. In this article, we construct the deep AHH (DAHH) model, which extends and generalizes the network form of the AHH model to high-dimensional problems. The network structure of DAHH is determined through forward growth, in which an activity ratio is introduced to select effective neurons and no connecting weights are involved between the layers. All neurons in the DAHH network can then be flexibly connected to the output in a skip-layer format, and only the corresponding output weights are parameters to optimize. With this framework, the backpropagation algorithm can be implemented in DAHH to tackle large-scale problems efficiently, and the vanishing gradient problem is not encountered during training. In fact, the optimization problem of DAHH remains convex when a convex loss is used in the output layer, which is a natural advantage in optimization. Unlike existing neural networks, DAHH is easier to interpret: neurons are sparsely connected and analysis of variance (ANOVA) decomposition can be applied, which helps reveal the interactions between variables. A theoretical analysis of the universal approximation ability and explicit domain partitions are also derived. Numerical experiments verify the effectiveness of the proposed DAHH.
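
The ideas summarized in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: first-layer neurons are taken to be hinges max(0, ±(x_v − β)) on fixed knots, deeper neurons are taken to be the minimum of a previous-layer neuron and a first-layer hinge on a new variable, the activity ratio is taken to be the fraction of samples on which a neuron is nonzero, and the output weights are fitted by plain gradient descent on a squared loss. All function names and parameters (`grow_network`, `min_ratio`, `knots`, and so on) are hypothetical.

```python
def hinge(x, var, knot, sign):
    """Assumed first-layer neuron: max(0, sign * (x[var] - knot))."""
    return max(0.0, sign * (x[var] - knot))


def activity_ratio(column):
    """Assumed definition: fraction of samples where the neuron is nonzero."""
    return sum(1 for v in column if v > 0.0) / len(column)


def grow_network(X, knots, depth, min_ratio):
    """Forward growth with no weights between layers.

    Returns a list of (variables, outputs) pairs, one per kept neuron.
    """
    layer1 = []
    for var in range(len(X[0])):
        for knot in knots:
            for sign in (1.0, -1.0):
                col = [hinge(x, var, knot, sign) for x in X]
                if activity_ratio(col) >= min_ratio:
                    layer1.append(((var,), col))
    neurons, prev = list(layer1), layer1
    for _ in range(depth - 1):
        nxt = []
        for vars_p, col_p in prev:
            for (var_h,), col_h in layer1:
                if var_h <= max(vars_p):
                    continue  # only add a higher-indexed, unused variable
                col = [min(a, b) for a, b in zip(col_p, col_h)]
                if activity_ratio(col) >= min_ratio:
                    nxt.append((vars_p + (var_h,), col))
        neurons += nxt
        prev = nxt
    return neurons


def mse(neurons, w, b, y):
    """Mean squared error of the skip-layer output b + sum_j w_j * neuron_j."""
    n = len(y)
    return sum(
        (b + sum(wj * col[i] for wj, (_, col) in zip(w, neurons)) - y[i]) ** 2
        for i in range(n)
    ) / n


def fit_output_weights(neurons, y, steps=300, lr=0.05):
    """Gradient descent on the output weights only (a convex problem here)."""
    n, m = len(y), len(neurons)
    w, b = [0.0] * m, 0.0
    for _ in range(steps):
        err = [
            b + sum(w[j] * neurons[j][1][i] for j in range(m)) - y[i]
            for i in range(n)
        ]
        b -= lr * sum(err) / n
        for j in range(m):
            w[j] -= lr * sum(err[i] * neurons[j][1][i] for i in range(n)) / n
    return w, b


# Toy usage: a 6x6 grid on [0, 1]^2 with target x0 * x1.
X = [[i / 5, j / 5] for i in range(6) for j in range(6)]
y = [a * b for a, b in X]
neurons = grow_network(X, knots=[0.5], depth=2, min_ratio=0.2)
w, b = fit_output_weights(neurons, y)
print(len(neurons), mse(neurons, w, b, y))
```

In this sketch every neuron feeds the output directly, so the only trainable parameters are `w` and `b`, and the squared loss is convex in them. The tuple stored with each neuron records which input variables it depends on; grouping the fitted terms by that tuple gives an ANOVA-style split into single-variable terms such as `(0,)` and interaction terms such as `(0, 1)`.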