CCMA Seminar on Mathematics of Data and Computation

This PSU-PKU joint CCMA seminar aims to introduce cutting-edge research on numerical methods and scientific computing related to data science and machine learning. It is held weekly on Thursdays, 8:30-9:30 pm EST (Friday 8:30-9:30 am Beijing time). If you would like to give a talk, please contact Lian Zhang (luz244@psu.edu).

(Due to COVID-19, all seminars have been moved online. Zoom: https://psu.zoom.us/j/560094163)

 


Upcoming Seminar


Statistical Method for Selecting Best Treatment with High-dimensional Covariates

  • Xiaohua Zhou, Peking University
  • Time: 8:30pm – 9:30pm, Thursday, June 11, 2020 (EST).
  • Online Zoom Link: https://psu.zoom.us/j/560094163
  • Abstract: In this talk, I will introduce a new semi-parametric modeling method for heterogeneous treatment effect estimation and individualized treatment selection using a covariate-specific treatment effect (CSTE) curve with high-dimensional covariates. The proposed method is flexible enough to depict both local and global associations between the treatment and baseline covariates, and is thus robust against model mis-specification in the presence of high-dimensional covariates. We also establish the theoretical properties of the proposed procedure. I will further illustrate the performance of the proposed method through simulation studies and the analysis of a real data example. This is joint work with Drs. Guo and Ma at the University of California, Riverside.
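
For orientation, a generic way to write a covariate-specific treatment effect curve is sketched below; the link function g and the exact definition used in this work are assumptions here and may differ from the speaker's.

```latex
% Generic form of a CSTE curve for binary treatment T, outcome Y, covariates X;
% the link g and the semi-parametric estimator used in the talk may differ.
\mathrm{CSTE}(x) \;=\; g\bigl(\mathbb{E}[\,Y \mid X = x,\ T = 1\,]\bigr)
                 \;-\; g\bigl(\mathbb{E}[\,Y \mid X = x,\ T = 0\,]\bigr)
```

Treatment selection then favors treating patients whose covariate values x give a positive estimated curve (or one exceeding a chosen threshold).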

Past Seminars


    Homotopy training algorithm for neural networks and applications in solving nonlinear PDE

    • Wenrui Hao, Pennsylvania State University
    • Time: 8:30pm – 9:30pm, Thursday, May 21, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: In this talk, I will introduce two different topics related to neural networks. The first is a homotopy training algorithm designed to solve the nonlinear optimization problems of machine learning by building the neural network adaptively. The second is a randomized Newton's method used to solve nonlinear systems arising from the neural network discretization of differential equations. Several examples are used to demonstrate the feasibility and efficiency of the two proposed methods.
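
A minimal sketch of the generic homotopy-continuation idea behind such training schemes; the interpolated objective, toy data, and plain gradient descent below are illustrative assumptions, not the adaptive network-construction algorithm presented in the talk.

```python
# Generic homotopy continuation for a training problem: deform an easy convex
# objective into the target nonlinear objective while warm-starting the solver.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.tanh(X @ np.array([1.5, -2.0]))              # toy regression target

def grad(w, t):
    """Gradient of H(w, t) = (1 - t) * convex surrogate + t * target loss."""
    r_lin = X @ w - y                                # simple (convex) model
    pred = np.tanh(X @ w)                            # target (nonlinear) model
    g_lin = 2 * X.T @ r_lin / len(y)
    g_tgt = 2 * X.T @ ((pred - y) * (1 - pred**2)) / len(y)
    return (1 - t) * g_lin + t * g_tgt

w = np.zeros(2)
for t in np.linspace(0.0, 1.0, 11):                  # slowly deform the objective
    for _ in range(500):                             # warm-started gradient descent
        w -= 0.1 * grad(w, t)
print("recovered weights:", w)                       # close to [1.5, -2.0]
```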

    From ODE solvers to accelerated first-order methods for convex optimization

    • Long Chen, University of California, Irvine
    • Time: 8:30pm – 9:30pm, Thursday, May 14, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: Convergence analyses of accelerated first-order methods for convex optimization problems are presented from the point of view of ordinary differential equation (ODE) solvers. We first take another look at the acceleration phenomenon via A-stability theory for ODE solvers and present a revealing spectral analysis for quadratic programming. After that, we present the Lyapunov framework for dynamical systems and introduce the strong Lyapunov condition. Many existing continuous convex optimization models, such as the gradient flow, the heavy-ball system, Nesterov's accelerated gradient flow, and the dynamical inertial Newton system, are addressed and analyzed in this framework. Then we present convergence analyses of optimization algorithms obtained from implicit or explicit discretizations of the underlying dynamical systems. This is joint work with Hao Luo from Sichuan University.
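
For orientation, two of the continuous models named in the abstract take the standard forms below, and a single explicit Euler step applied to the gradient flow recovers gradient descent.

```latex
\dot{x}(t) = -\nabla f\bigl(x(t)\bigr)                            % gradient flow
\ddot{x}(t) + \gamma\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0   % heavy-ball system
x_{k+1} = x_k - \alpha\,\nabla f(x_k)                             % explicit Euler on the gradient flow
```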

    The Geometry of Functional Spaces of Neural Networks

    • Matthew Trager, Courant Institute at NYU
    • Time: 8:30pm – 9:30pm, Thursday, May 7, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: The reasons behind the empirical success of neural networks are not well understood. One important characteristic of modern deep learning architectures, compared to other large-scale parametric learning models, is that they identify a class of functions that is not linear but rather has a complex hierarchical structure. Furthermore, neural networks are non-identifiable models, in the sense that different parameters may yield the same function. Both of these aspects come into play significantly when optimizing an empirical risk in classification or regression tasks.
      In this talk, I will present some of my recent work that studies the functional space associated with neural networks with linear, polynomial, and ReLU activations, using ideas from algebraic and differential geometry. In particular, I will emphasize the distinction between the intrinsic function space and its parameterization, in order to shed light on the impact of the architecture on the expressivity of a model and on the corresponding optimization landscapes.
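
A quick numerical illustration of the non-identifiability mentioned in the abstract (a minimal sketch; the network size and weights below are arbitrary): for ReLU networks, multiplying one layer by c > 0 and dividing the next by c changes the parameters but not the function.

```python
# Rescaling invariance of ReLU networks: different parameters, same function.
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(8, 3)), rng.normal(size=8)
W2, b2 = rng.normal(size=(1, 8)), rng.normal(size=1)

def net(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2    # one-hidden-layer ReLU net

c = 3.7                                              # any positive rescaling
x = rng.normal(size=3)
print(np.allclose(net(x, W1, b1, W2, b2),
                  net(x, c * W1, c * b1, W2 / c, b2)))   # True
```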

    Interpreting Deep Learning Models: Flip Points and Homotopy Methods

    • Roozbeh Yousefzadeh, Yale University
    • Time: 8:30pm – 9:30pm, Thursday, April 30, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: This talk concerns methods for studying deep learning models and interpreting their outputs and functional behavior. A trained model (e.g., a neural network) is a function that maps inputs to outputs. Deep learning has shown great success in performing various machine learning tasks; however, these models are complicated mathematical functions, and their interpretation remains a challenging research question. We formulate and solve optimization problems to answer questions about the model and its outputs. Specifically, we study the decision boundaries of a model using flip points. A flip point is any point that lies on the boundary between two output classes: e.g., for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for “yes” and “no”. The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. To compute the closest flip point, we develop a homotopy algorithm for neural networks that transforms the deep learning function in order to overcome the issues of vanishing and exploding gradients. We show that computing closest flip points allows us to systematically investigate the model, identify decision boundaries, interpret and audit the model with respect to individual inputs and entire datasets, and identify vulnerabilities to adversarial attacks. We demonstrate that flip points can help identify mistakes made by a model, improve the model’s accuracy, and reveal the most influential features for classification.
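
Schematically, the closest flip point described in the abstract can be written as the constrained problem below; the norm and any additional constraints used in the talk are not specified here.

```latex
\hat{x}(x_0) \;=\; \arg\min_{x} \;\|x - x_0\|
\quad \text{subject to} \quad f_{\mathrm{yes}}(x) = f_{\mathrm{no}}(x)
```

The distance from x_0 to its closest flip point then measures how far that input sits from the decision boundary.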


    Partial Differential Equation Principled Trustworthy Deep Learning

    • Bao Wang, University of California, Los Angeles
    • Time: 9:00am – 10:30am, Friday, April 24, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: This talk contains two parts. In the first part, I will present some recent work on developing partial differential equation principled robust neural architectures and optimization algorithms for robust, accurate, private, and efficient deep learning. In the second part, I will discuss some recent progress on leveraging Nesterov accelerated gradient style momentum for accelerating deep learning, which again involves designing stochastic optimization algorithms and mathematically principled neural architectures.
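
For context, "Nesterov accelerated gradient style momentum" refers to updates of roughly the standard form below; the specific accelerated schemes and architectures discussed in the talk may differ.

```latex
v_{k+1} = \mu\, v_k - \alpha\, \nabla f\bigl(x_k + \mu\, v_k\bigr), \qquad
x_{k+1} = x_k + v_{k+1}
% \mu: momentum coefficient, \alpha: learning rate
```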

     


    Topological Data Analysis (TDA) Based Machine Learning Models for Drug Design

    • Kelin Xia, Nanyang Technological University
    • Time: 9:00am – 10:30am, Friday, April 17, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: In this talk, I will discuss topological data analysis (TDA) and its application to biomolecular data analysis, in particular drug design. Persistent homology, one of the most important tools in TDA, is used in the identification, classification, and analysis of biomolecular structure, flexibility, dynamics, and function. Properties from persistent homology analysis are used as features for learning models. Unlike previous biomolecular descriptors, topological features from TDA provide a balance between structural complexity and data simplification. TDA-based learning models have consistently delivered the best results in various aspects of drug design, including protein-ligand binding affinity prediction, solubility prediction, prediction of protein stability changes upon mutation, toxicity prediction, solvation free energy prediction, partition coefficient and aqueous solubility prediction, binding pocket detection, and drug discovery. Further, I will discuss our recently proposed persistent spectral based machine learning (PerSpect ML) models. Different from all previous spectral models, a filtration process is introduced to generate multiscale spectral models. Persistent spectral variables are defined as functions of spectral variables over the filtration value. We test our models on the most commonly used databases, including PDBbind-2007, PDBbind-2013, and PDBbind-2016. To the best of our knowledge, our results are better than those of all existing models on these databases. This demonstrates the great power of our PerSpect ML models in molecular data analysis and drug design.
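
A minimal, self-contained illustration of persistent homology in dimension 0 only (connected-component barcodes of a Vietoris-Rips filtration on a point cloud). Practical TDA pipelines of the kind described above rely on dedicated libraries and also track higher-dimensional features such as loops and voids; the toy data below are an assumption for demonstration.

```python
import numpy as np

def h0_barcode(points):
    """Each point is born at scale 0; a component dies at the distance where it
    first merges with another, i.e. at the edge lengths of a minimum spanning
    tree (Kruskal / union-find pass over the Rips filtration)."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):                                  # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(w)                      # one component dies at this scale
    return [(0.0, w) for w in deaths] + [(0.0, np.inf)]   # last component never dies

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(size=(10, 2)), rng.normal(loc=6.0, size=(10, 2))])
bars = h0_barcode(pts)
print(sorted(bars, key=lambda b: -b[1])[:3])      # the two longest bars reveal 2 clusters
```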

     


    Learning Dynamical Systems from Data

    • Tong Qin, Ohio State University
    • Time: 9:00am – 10:30am, Friday, April 10, 2020 (EST).
    • Online Zoom Link: https://psu.zoom.us/j/560094163
    • Abstract: Recent advances in computing resources and the accumulation of massive amounts of data have given rise to explosive growth in the development of data-driven modeling, which is becoming as important as traditional first-principles-based modeling. In this talk, I will introduce our recent work on designing artificial neural networks for approximating the governing equations in an equation-free, input-output mapping form. The dynamical system can be an unknown physical model or a semi-discretized PDE. Based on the one-step integral form of the ODEs, we propose to use the residual network (ResNet) as the basic building block for dynamics recovery. The ResNet block can be considered an exact one-step integration for autonomous ODE systems. This framework is further generalized to recover systems with stochastic and time-dependent inputs. In the special situation where the data come from a Hamiltonian system, recent progress on learning the Hamiltonian from data will be discussed. This approach not only recovers the hidden Hamiltonian system, but also preserves the Hamiltonian, which is the total energy in many physical applications, along the solution trajectory.
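
An illustrative sketch of the residual-network viewpoint described above: learn an update of the form x_{n+1} = x_n + N(x_n) from snapshot pairs of an "unknown" system. To stay self-contained, N is fit by linear least squares on polynomial features rather than by training a deep ResNet, and the hidden dynamics is a simple damped rotation; both are assumptions for illustration only.

```python
import numpy as np

dt = 0.1
A = np.array([[-0.1, -1.0], [1.0, -0.1]])           # "unknown" damped rotation

def true_step(x):
    return x + dt * (A @ x)                          # reference one-step flow map

def feats(Z):
    return np.hstack([Z, Z**2, np.ones((len(Z), 1))])   # simple feature dictionary

# snapshot pairs (x_n, x_{n+1}) sampled from the unknown dynamics
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
Y = np.array([true_step(x) for x in X])

# fit the residual N(x) = x_{n+1} - x_n by linear least squares
W, *_ = np.linalg.lstsq(feats(X), Y - X, rcond=None)

def learned_step(x):
    return x + feats(x[None, :])[0] @ W              # ResNet-style update: identity + residual

x_true = x_pred = np.array([1.0, 0.0])
for _ in range(50):
    x_true, x_pred = true_step(x_true), learned_step(x_pred)
print(x_true, x_pred)                                # the two trajectories stay close
```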

 
