# CCMA Deep Learning Seminar

## Exploring Redundancy in Neural Networks

• Chenglong Bao, Tsinghua University
• Time: 8:00am-9:00am, Feb. 11th, 2020.

## Deep Learning Theory: F-Principle

• Zhiqin Xu, Shanghai Jiao Tong University
• Time: 8:00am-9:00am, Feb. 18th, 2020.

## Theoretical Understanding of Stochastic Gradient Descent in Deep Learning

• Zhanxing Zhu, Peking University
• Time: 8:00am-9:00am, Mar. 11th, 2020.

## Low Dimensional Manifold Model for Image Processing

• Zuoqiang Shi, Tsinghua University
• Time: 8:00am-9:00am, Mar. 25th, 2020.

## Scale-Equivariant Neural Networks with Decomposed Convolutional Filters

• Wei Zhu, Duke University
• Time: 8:00am-9:00am, Apr. 7th, 2020.
• Abstract: Encoding the input scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many vision tasks especially when dealing with multiscale input signals. We propose a scale-equivariant CNN architecture with joint convolutions across the space and the scaling group, which is shown to be both sufficient and necessary to achieve scale-equivariant representations. To reduce the model complexity and computational burden, we decompose the convolutional filters under two pre-fixed separable bases and truncate the expansion to low-frequency components. A further benefit of the truncated filter expansion is the improved deformation robustness of the equivariant representation. Numerical experiments demonstrate that the proposed scale-equivariant neural network with decomposed convolutional filters (ScDCFNet) achieves significantly improved performance in multiscale image classification and better interpretability than regular CNNs at a reduced model size.

## Weighted Nonlocal Laplacian and Its Application in Semi-supervised Learning and Image Inpainting

• Shixiao Jiang, Penn State University
• Time: 8:00am-9:00am, Apr. 14th, 2020.
• Abstract: In this talk, we introduce the weighted nonlocal Laplacian method of [Shi, Osher, Zhu 2017, Weighted Nonlocal Laplacian], which computes a continuous interpolation function on a point cloud in a high-dimensional space. We will first focus on the weighted nonlocal Laplacian algorithm and compare it with the classical graph Laplacian. Then we will look at the underlying theory of the weighted nonlocal Laplacian. Last, numerical demonstrations in semi-supervised learning and image inpainting show that the weighted nonlocal Laplacian is a reliable and efficient interpolation method that performs better than the classical graph Laplacian.

## Sparse Reconstruction by $l_p$ Minimization

• Zhe Zheng, Nankai University
• Time: 8:00am-9:00am, Apr. 21st, 2020.
• Abstract: In this talk, we will focus on the model with $l_p$ regularization, which is a good alternative to the $l_0$ norm. We first introduce the lower bound theory for the nonzero entries of solutions, and then we turn to algorithms for solving this nonconvex problem. In particular, we will focus on the forward-backward splitting method. Using the Kurdyka-Łojasiewicz property (KŁ property), we will see the convergence result of the algorithm.
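
As a rough illustration of the forward-backward splitting idea mentioned in this abstract, the sketch below alternates a gradient step on a smooth term with a proximal step on the $l_p$ penalty. The least-squares data term $\frac{1}{2}\|Ax-b\|^2$, the function names `prox_lp` and `forward_backward_lp`, the step size, and the bisection-based proximal map are all assumptions made for this example; the talk's actual model and algorithmic details may differ.

```python
import numpy as np

def prox_lp(z, lam, p, iters=60):
    """Elementwise argmin_x 0.5*(x - z)^2 + lam*|x|^p for 0 < p < 1 (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    a = np.abs(z)
    # On x > 0 the stationarity condition is g(x) = a with g(x) = x + lam*p*x^(p-1).
    # g attains its minimum at x_star; if a < g(x_star) the only candidate is x = 0.
    x_star = (lam * p * (1 - p)) ** (1.0 / (2 - p))
    g_star = x_star + lam * p * x_star ** (p - 1)
    lo = np.full_like(a, x_star)
    hi = np.maximum(a, x_star)
    for _ in range(iters):  # bisection on [x_star, a], where g is increasing
        mid = 0.5 * (lo + hi)
        too_small = mid + lam * p * mid ** (p - 1) < a
        lo = np.where(too_small, mid, lo)
        hi = np.where(too_small, hi, mid)
    x = hi
    # Keep the nonzero candidate only if it beats x = 0 in objective value.
    better = 0.5 * (x - a) ** 2 + lam * x ** p < 0.5 * a ** 2
    return np.where((a >= g_star) & better, np.sign(z) * x, 0.0)

def forward_backward_lp(A, b, lam, p=0.5, iters=500):
    """Forward-backward iterations for 0.5*||Ax - b||^2 + lam*sum(|x_i|^p) (assumed model)."""
    tau = 0.9 / np.linalg.norm(A, 2) ** 2  # step size below 1/L, L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                   # forward (gradient) step
        x = prox_lp(x - tau * grad, tau * lam, p)  # backward (proximal) step
    return x
```

Because the proximal step either returns zero or a value bounded away from zero, the iterates illustrate the kind of lower bound on nonzero entries that the abstract refers to.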

## Machine Learning from a Continuous Viewpoint

• Lei Wu, Princeton University
• Time: 8:30pm-9:30pm, May 5th, 2020.
• Abstract: We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, very much in the spirit of classical numerical analysis and statistical physics. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the shallow neural network model and the residual neural network model, can all be recovered as particular discretizations of different continuous formulations. We also present examples of new models, such as the flow-based random feature model, and new algorithms, such as the smoothed particle method and spectral method, that arise naturally from this continuous formulation. We discuss how the issues of generalization error and implicit regularization can be studied under this framework.
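
To make the "discretization" remark in this abstract concrete, here is a minimal sketch of a random feature model in one dimension. It assumes the continuous model is written as an integral $f(x) = \int a(w, c)\,\sigma(wx + c)\,\pi(dw, dc)$ and approximates it by an average over $m$ random samples, with only the coefficients fitted. The ReLU activation, the Gaussian sampling distribution, and the name `fit_random_features` are choices made for this example and are not taken from the talk.

```python
import numpy as np

def fit_random_features(x, y, m=200, seed=0):
    """Fit f(x) = (1/m) * sum_j a_j * relu(w_j * x + c_j) with fixed random (w_j, c_j)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(m)  # random inner weights, sampled once and frozen
    c = rng.standard_normal(m)  # random biases, sampled once and frozen

    def features(t):
        # Monte Carlo discretization of the integral: one column per sampled (w_j, c_j)
        return np.maximum(np.outer(t, w) + c, 0.0) / m

    # Only the outer coefficients a_j are trained, by linear least squares.
    a, *_ = np.linalg.lstsq(features(x), y, rcond=None)
    return lambda t: features(np.asarray(t, dtype=float)) @ a
```

For example, `f = fit_random_features(x, y)` with `x = np.linspace(-1, 1, 200)` and `y = np.sin(3 * x)` returns a callable that can be evaluated on new points with `f(x_new)`.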

## Normalization Methods: Auto-tuning Stepsize and Implicit Regularization

• Xiaoxia (Shirley) Wu, UT Austin
• Time: 8:30pm-9:30pm, May 12th, 2020.
• Abstract: Neural network optimization with stochastic gradient descent (SGD) requires many interesting techniques, including normalization methods such as batch normalization (Ioffe and Szegedy, 2015) and adaptive gradient methods such as ADAM (Kingma and Ba, 2014), to attain optimal performance. While these methods are successful, their theoretical understanding has only recently started to emerge. A significant challenge in understanding these methods is the highly non-convex and non-linear nature of neural networks.
In this talk, I will present an interesting connection between normalization methods and adaptive gradient methods, and provide rigorous justification for why these methods require less hyper-parameter tuning. I will also talk about convergence results for adaptive gradient methods in general non-convex landscapes and in two-layer over-parameterized neural networks. Beyond convergence, I will also show a new perspective on the implicit regularization in these normalization algorithms.
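
The abstract does not say which adaptive method is analyzed, so the following is only an illustrative sketch of what "auto-tuning stepsize" can look like. It uses a scalar-stepsize variant of AdaGrad (often called AdaGrad-Norm), in which a single accumulator of squared gradient norms sets the effective learning rate. The function name `adagrad_norm` and the default values of `eta` and `b0` are assumptions for this example.

```python
import numpy as np

def adagrad_norm(grad_fn, x0, eta=1.0, b0=0.1, iters=200):
    """Gradient descent with one adaptive scalar stepsize eta / b_k (illustrative sketch)."""
    x = np.asarray(x0, dtype=float).copy()
    b_sq = b0 ** 2
    for _ in range(iters):
        g = grad_fn(x)
        b_sq += np.dot(g, g)             # accumulate squared gradient norms
        x -= (eta / np.sqrt(b_sq)) * g   # effective stepsize shrinks as b grows
    return x
```

On a simple quadratic such as `grad_fn = lambda x: x`, the iteration makes progress for initial values `b0` that differ by several orders of magnitude, which is the sense in which the stepsize needs less manual tuning.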

Maintained by Juncai He.