CCMA Seminar on Mathematics of Data and Computation
This CCMA seminar series aims to introduce cutting-edge research on numerical methods and scientific computing related to data science and machine learning. Seminars are held weekly on Thursdays, 8:00-9:00 pm ET (Friday 9:00-10:00 am Beijing time). If you would like to give a talk, please contact Xiaofeng Xu (xkx5060@psu.edu).
(Due to COVID-19, all seminars have moved online. Zoom: https://psu.zoom.us/j/99704506940)
Past Seminars:
Maximum bound principle preserving integrating factor Runge–Kutta methods for semilinear parabolic equations
- Zhonghua Qiao, The Hong Kong Polytechnic University
- Time: 8:00 pm – 9:00 pm, Thursday, January 27, 2022 (ET).
- Online Zoom Link: https://psu.zoom.us/j/99704506940
- Abstract: A large class of semilinear parabolic equations satisfy the maximum bound principle (MBP) in the sense that the time-dependent solution preserves, for all time, a uniform pointwise bound imposed by its initial and boundary conditions. The MBP plays a crucial role in understanding the physical meaning and the well-posedness of the mathematical model. The investigation of numerical algorithms that preserve the MBP has attracted increasing attention in recent years, especially for temporal discretizations, since violation of the MBP may lead to nonphysical solutions or even blow-up of the algorithms. In this work, we study high-order MBP-preserving time integration schemes by means of the integrating factor Runge–Kutta (IFRK) method. Beginning with the space-discrete system of semilinear parabolic equations, we present the IFRK method in general form and derive sufficient conditions for the method to preserve the MBP. In particular, we show that the classic four-stage, fourth-order IFRK scheme is MBP-preserving for some typical semilinear systems although not strong stability preserving, and it can be applied directly to Allen–Cahn-type equations.
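As a concrete starting point, here is a minimal sketch (not the speaker's fourth-order scheme) of the simplest member of the IFRK family: exponential Euler for a 1D Allen–Cahn equation with periodic boundary conditions and a central-difference Laplacian. Grid size, time step, and initial data are illustrative; the four-stage scheme of the talk follows the same integrating-factor pattern with more stages.

```python
import numpy as np

# Model problem: Allen-Cahn u_t = eps^2 u_xx + u - u^3 on [0, 2*pi),
# periodic BCs, second-order central differences in space.  The solution
# satisfies the maximum bound principle (MBP): |u| <= 1 for all time.
N, eps, dt, T = 256, 0.1, 0.1, 10.0
h = 2 * np.pi / N
x = h * np.arange(N)
# Eigenvalues of the periodic FD Laplacian (a circulant matrix, so the
# DFT diagonalizes it); exp(dt*lam) applies the integrating factor exactly.
lam = eps**2 * (2 * np.cos(2 * np.pi * np.arange(N) / N) - 2) / h**2
E = np.exp(dt * lam)

f = lambda u: u - u**3                     # nonlinear reaction term
u = 0.9 * np.sin(3 * x) * np.cos(x)        # initial data with |u| <= 1
for _ in range(int(round(T / dt))):
    # First-order IFRK (exponential Euler):
    #   u^{n+1} = exp(dt*L) (u^n + dt f(u^n)).
    # For dt <= 1/2, u + dt*f(u) maps [-1,1] into itself, and exp(dt*L)
    # is a stochastic matrix here, so the MBP is preserved at every step.
    u = np.real(np.fft.ifft(E * np.fft.fft(u + dt * f(u))))

print("MBP check, max |u| at T =", T, ":", np.max(np.abs(u)))  # stays <= 1
```

The circulant structure of the periodic finite-difference Laplacian is what makes exp(dt*L) entrywise nonnegative with unit row sums, which is the discrete mechanism behind MBP preservation in this sketch.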
Additive Schwarz methods for convex optimization—convergence theory and acceleration
- Jongho Park, Natural Science Research Institute, KAIST, Korea
- Time: 8:00 pm – 9:00 pm, Thursday, January 20, 2022 (ET).
- Online Zoom Link: https://psu.zoom.us/j/99704506940
- Abstract: The purpose of this talk is to present a summary of notable recent results on additive Schwarz methods for general convex optimization. Based on a generalized additive Schwarz lemma, which says that additive Schwarz methods for convex optimization can be interpreted as nonlinear gradient methods, I present two interesting results: an abstract convergence theory and an acceleration scheme for additive Schwarz methods. Both results apply to a very broad range of convex optimization problems, including nonlinear elliptic problems, variational inequalities, and mathematical imaging problems.
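As a toy illustration of the additive Schwarz idea (not the speaker's abstract framework), the sketch below minimizes a strongly convex quadratic by combining damped corrections from two overlapping blocks of variables; the block sizes, relaxation parameter, and quadratic itself are illustrative assumptions. For a genuinely nonlinear convex objective the local solves become small convex subproblems, and the combined update can be read as a nonlinear gradient step.

```python
import numpy as np

# Convex model problem: F(x) = 0.5 x^T A x - b^T x with A SPD, so the
# exact local solves below reduce to small linear solves.
rng = np.random.default_rng(0)
n = 40
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)
F = lambda x: 0.5 * x @ A @ x - b @ x

# Two overlapping "subdomains" (index blocks).
doms = [np.arange(0, 25), np.arange(15, 40)]
tau = 0.5                       # relaxation; tau <= 1/(max overlap) is safe

x = np.zeros(n)
for it in range(200):
    # Each subdomain minimizes F over its own variables with the rest
    # frozen, producing a local correction; corrections are then added
    # simultaneously -- the "nonlinear gradient" step of the talk.
    e = np.zeros(n)
    r = b - A @ x                             # residual = -grad F(x)
    for idx in doms:
        e[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r[idx])
    x = x + tau * e

print("F(x) - F(x*):", F(x) - F(np.linalg.solve(A, b)))
```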
Phase-Field Modeling of Vesicle Motion, Deformation and Mass Transfer
- Shixin Xu, Duke Kunshan University
- Time: 9:00 pm – 10:00 pm, Thursday, September 30, 2021 (ET).
- Online Zoom Link: https://psu.zoom.us/j/97489448314
- Abstract: In this talk, a thermodynamically consistent phase-field model is first introduced for simulating the motion and shape transformation of vesicles under flow conditions. In particular, a general slip boundary condition is used to describe the interaction between vesicles and the wall of the fluid domain. A $C^0$ finite element method that is second-order accurate in both space and time is proposed to solve the model's governing equations. Various numerical tests confirm the convergence, energy stability, and conservation of mass and cell surface area of the proposed scheme. Vesicles with different mechanical properties are also used to explain the pathological risk for patients with sickle cell disease. Then a thermodynamically consistent model for mass transfer across permeable moving interfaces is proposed using the energy variation method. We consider a restricted diffusion problem where the flux across the interface depends on its conductance and the difference in concentration on each side. This is joint work with Lingyue Shen, Yuzhe Qin, Zhiliang Xu, Ping Lin and Huaxiong Huang.
Predicting High-Stress Regions Inside Microstructures Using Deep Learning
- Ankit Shrivastava, Carnegie Mellon University
- Time: 9:00 pm – 10:00 pm, Thursday, September 16, 2021 (ET).
- Online Zoom Link: https://psu.zoom.us/j/97489448314
- Abstract: In this talk, I will present a deep learning-based convolutional encoder-decoder model to predict the characteristics of high-stress clusters (stress hotspots) inside microstructures. Further, we analyze the trained model to understand the microstructure features causing these stress hotspots. A microstructure is a small-scale structure observed at the microscale inside a polycrystalline material. It is composed of multiple grains, where each grain represents an arrangement of atomic crystals in a particular orientation. Microstructures amplify the applied load at some points inside polycrystalline materials, causing hotspots. Hotspots initiate nonlinear phenomena such as fracture and yielding; hence it is essential to know their characteristics, such as location, size, and orientation. However, there is no known analytical or statistical relationship between microstructure features and hotspot characteristics.
Numerical schemes such as finite element and Fourier-based methods can obtain highly accurate stress fields in microstructures. However, simulating many microstructures and then manually observing the relationship between the hotspots and the microstructure features is complex and can suffer from user bias. Statistical models can help to infer such relationships using the data generated from these numerical schemes.
The hotspots inside the stress fields obtained from numerical simulations occupy a small spatial volume relative to the entire domain. Making any statistical model learn such sparse hotspots requires a large amount of data, which makes training computationally expensive. Moreover, there exist local spatial relations between the microstructures and the stress fields. Models such as regression and fully connected deep neural networks cannot capture such spatial relationships. Based on convolutional filters, the proposed encoder-decoder model can capture local spatial relations using spatially weighted averaging operations.
The model was trained against linear elastic calculations of stress under uniaxial strain in synthetically generated microstructures. The model prediction accuracy was analyzed using the cosine similarity and by comparing peak stress clusters’ geometric characteristics. The average cosine similarity on the test set is around 0.95, which is very close to 1.0 (perfect prediction). Comparison of geometric characteristics showed that the model could predict the location and size of the stress hotspots. Further, we visualized the feature maps and saliency maps to understand the relationship between the microstructure features and the hotspot characteristics. Through feature map visualization, we observed that different convolutional filters in the first layer capture different but distinguishable local features, such as grains and grain boundaries in a microstructure. The behavior of these filters around the hotspots can be used to understand the relation between the hotspots and the microstructure features. Using the saliency map visualization, we identified some long-range effects inside the microstructures.
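A minimal convolutional encoder-decoder of the general kind described above can be sketched in a few lines of PyTorch; the channel counts, image size, and single-channel input are illustrative assumptions, not the architecture of the talk.

```python
import torch
import torch.nn as nn

# Input: a single-channel microstructure image (e.g., a grain orientation
# field); output: a same-sized map scoring high-stress regions.
class EncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),    # 16 -> 32
            nn.ConvTranspose2d(16, 1, 2, stride=2),                # 32 -> 64
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = EncoderDecoder()
x = torch.randn(8, 1, 64, 64)             # batch of synthetic microstructures
stress_map = model(x)                     # predicted hotspot map, same shape
print(stress_map.shape)                   # torch.Size([8, 1, 64, 64])
```

The convolutional filters are what capture the local spatial relations mentioned above; the encoder-decoder shape forces the network to summarize the microstructure before reconstructing the hotspot map.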
AI-empowered Performance-based Wind Engineering
- Teng Wu, University at Buffalo
- Time: 9:00 pm – 10:00 pm, Thursday, September 9, 2021 (ET).
- Online Zoom Link: https://psu.zoom.us/j/97489448314
- Abstract: Recent advancements in performance-based wind engineering have placed new demands on wind characterization (e.g., duration considerations), aerodynamics modeling (e.g., transient features) and structural analysis (e.g., nonlinear response). While conventional approaches in computational and experimental wind engineering provide valuable tools for overcoming many of these emerging challenges, a noticeable increase in the use of artificial intelligence (AI) suggests its great promise in facilitating the implementation of the performance-based wind design methodology. This talk will discuss state-of-the-art machine learning tools (e.g., knowledge-enhanced deep learning and deep reinforcement learning) that have been successfully applied to wind climate analysis, transient aerodynamics, nonlinear structural dynamics, shape optimization and vibration control.
Nonlinear simulation of vascular tumor growth with chemotaxis and the control of necrosis
- Min-Jhe Lu, Illinois Institute of Technology
- Time: 9:00 pm – 10:00 pm, Thursday, September 2, 2021 (ET).
- Online Zoom Link: https://psu.zoom.us/j/97489448314
- Abstract: In this work, we develop a sharp interface tumor growth model to study the effects of both the intratumoral structure, using a controlled necrotic core, and the extratumoral nutrient supply from vasculature on tumor morphology. We first show that our model extends the benchmark results in the literature using linear stability analysis. Then we solve this generalized model numerically using a spectrally accurate boundary integral method in an evolving annular domain, not only with a Robin boundary condition on the outer boundary for the nutrient field, which models tumor vasculature, but also with a static boundary condition on the inner boundary for the pressure field, which models the control of tumor necrosis. The discretized linear systems for both the pressure and nutrient fields are shown to be well-conditioned by tracing GMRES iteration numbers. Our nonlinear simulations reveal the stabilizing effects of angiogenesis and the destabilizing ones of chemotaxis and necrosis in the development of tumor morphological instabilities if the necrotic core is fixed in a circular shape. When the necrotic core is controlled in a non-circular shape, the stabilizing effects of proliferation and the destabilizing ones of apoptosis are observed. Finally, the values of the nutrient concentration with its fluxes and the pressure level with its normal derivatives, which are solved accurately at the boundaries, help us characterize the corresponding tumor morphology and the level of the biophysical quantities on the interfaces required to maintain various shapes of the necrotic region of the tumor. Interestingly, we notice that when the necrotic region is fixed in a 3-fold non-circular shape, even if the initial shape of the tumor is circular, the tumor will evolve into a shape corresponding to the 3-fold symmetry of the fixed necrotic region. This is joint work with Professors Wenrui Hao, Chun Liu, John Lowengrub and Shuwang Li.
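The well-conditioning diagnostic can be illustrated generically: boundary integral formulations typically lead to second-kind systems (identity plus a smoothing operator), whose GMRES iteration counts stay essentially flat under mesh refinement. The sketch below (SciPy 1.12+ for the rtol argument) uses a smooth stand-in kernel, not the actual layer potentials of the tumor model.

```python
import numpy as np
from scipy.sparse.linalg import gmres

# Nystrom-style second-kind system  (I + h*K) x = b  with a smooth
# periodic kernel; iteration counts should not grow with n.
for n in [64, 128, 256, 512]:
    t = 2 * np.pi * np.arange(n) / n
    h = 2 * np.pi / n
    K = np.cos(t[:, None] - t[None, :])   # smooth stand-in kernel
    A = np.eye(n) + h * K
    b = np.sin(t)

    res = []                              # trace the residual history
    x, info = gmres(A, b, rtol=1e-12, callback=lambda rk: res.append(rk))
    print(f"n={n:4d}: GMRES iterations = {len(res)}")
```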
Monte Carlo methods for the Hermitian eigenvalue problem
- Robert J. Webber, Courant Institute of Mathematical Sciences
- Time: 10:30 am – 11:30 am, Wednesday, Dec. 30, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/92211837073
- Abstract: In quantum mechanics and the analysis of Markov processes, Monte Carlo methods are needed to identify low-lying eigenfunctions of dynamical generators. The standard Monte Carlo approaches for identifying eigenfunctions, however, can be inaccurate or slow to converge. What limits the efficiency of the currently available spectral estimation methods and what is needed to build more efficient methods for the future? Through numerical analysis and computational examples, we begin to answer these questions. We present the first-ever convergence proof and error bounds for the variational approach to conformational dynamics (VAC), the dominant method for estimating eigenfunctions used in biochemistry. Additionally, we analyze and optimize variational Monte Carlo (VMC), which combines Monte Carlo with neural networks to accurately identify low-lying eigenstates of quantum systems.
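A toy variational Monte Carlo loop makes the method concrete: Metropolis sampling of |psi|^2 and averaging of the local energy, here for a 1D harmonic oscillator with a Gaussian trial state. All choices (trial family, step size, sample counts) are illustrative and unrelated to the talk's analysis or benchmarks.

```python
import numpy as np

# VMC for H = -0.5 d^2/dx^2 + 0.5 x^2 (atomic units) with trial state
# psi_a(x) = exp(-a x^2).  The energy estimate is the Monte Carlo average
# of the local energy E_loc = (H psi)/psi over samples of |psi|^2.
rng = np.random.default_rng(1)

def local_energy(x, a):
    # (H psi)/psi = a + (0.5 - 2 a^2) x^2 for psi = exp(-a x^2)
    return a + (0.5 - 2 * a**2) * x**2

def vmc_energy(a, n_steps=100_000, step=1.0):
    x, samples = 0.0, []
    for i in range(n_steps):
        y = x + step * rng.uniform(-1, 1)
        # Metropolis: accept with prob |psi(y)|^2 / |psi(x)|^2
        if rng.uniform() < np.exp(-2 * a * (y**2 - x**2)):
            x = y
        if i > 1000:                       # discard burn-in
            samples.append(local_energy(x, a))
    return np.mean(samples)

for a in [0.25, 0.5, 0.75]:
    print(f"a={a}: E ~ {vmc_energy(a):.4f}")  # exact minimum 0.5 at a=0.5
```

At a = 0.5 the local energy is constant (zero-variance principle), which is the basic mechanism that makes VMC accurate near a good trial state.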
A New Finite Element Approach for Nonlinear Eigenvalue Problems
- Prof. Jiguang Sun, Michigan Technological University
- Time: 8:30 pm – 9:30 pm, Thursday, Dec. 3, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/92211837073
- Abstract: We propose a new finite element approach, different from the classic Babuška–Osborn theory, for some nonlinear eigenvalue problems. The eigenvalue problem is formulated as the eigenvalue problem of a holomorphic Fredholm operator function of index zero. Finite element methods are used for discretization. The convergence of eigenvalues/eigenvectors is proved using the abstract approximation theory for holomorphic operator functions. The spectral indicator method is then extended to compute the eigenvalues/eigenvectors. Two nonlinear eigenvalue problems are treated using the proposed approach.
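The spectral indicator idea admits a compact sketch: the contour integral of T(z)^{-1} applied to a random vector is (generically) nonzero exactly when the holomorphic matrix function T has an eigenvalue inside the contour. The quadratic matrix polynomial below is an illustrative stand-in for a FEM-discretized nonlinear eigenproblem, and all sizes are assumptions.

```python
import numpy as np

# T(z) = A0 + z A1 + z^2 I: a small holomorphic (polynomial) matrix
# function.  Indicator(region) = || (1/2*pi*i) contour_int T(z)^{-1} f dz ||.
rng = np.random.default_rng(0)
n = 20
A0 = rng.standard_normal((n, n))
A1 = rng.standard_normal((n, n))
T = lambda z: A0 + z * A1 + z**2 * np.eye(n)

def indicator(center, radius, n_quad=64):
    f = rng.standard_normal(n)
    acc = np.zeros(n, dtype=complex)
    for k in range(n_quad):                      # trapezoid rule on a circle
        z = center + radius * np.exp(2j * np.pi * k / n_quad)
        dz = 2j * np.pi * (z - center) / n_quad  # dz = i (z-c) dtheta
        acc += np.linalg.solve(T(z), f) * dz
    return np.linalg.norm(acc / (2j * np.pi))

# Large indicator -> an eigenvalue lies inside; near zero -> none.
# Subdividing regions with large indicators localizes eigenvalues.
for c in [0.0, 2.0, -2.0, 4j]:
    print(f"center {c}: indicator = {indicator(c, 1.0):.3e}")
```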
Convergence Analysis of Neural Networks for Solving a Free Boundary Problem
- Xinyue Zhao, University of Notre Dame
- Time: 8:30pm – 9:30pm, Thursday, Nov 19, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/92211837073
- Abstract: Free boundary problems deal with systems of partial differential equations where the domain boundary is a priori unknown. Due to this special characteristic, it is challenging to solve free boundary problems either theoretically or numerically. In this talk, I will present a novel approach for solving a modified Hele-Shaw problem based on a neural network discretization and the boundary integral method. The existence of the numerical solution with this method is established theoretically. In the simulations, we first verify this approach by computing the symmetry-breaking solutions that are guided by the bifurcation analysis near the radially symmetric branch. Moreover, we further verify the capability of this approach by computing some non-radially symmetric solutions which are not characterized by any theorems.
Extreme Learning Machines (ELM) – When ELM and Deep Learning Synergize
- Guangbin Huang, Nanyang Technological University, Singapore
- Time: 8:30pm – 9:30pm, Thursday, Nov 12, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/92211837073
- Abstract: One of the most curious questions in the world is how brains produce intelligence. The objectives of this talk are threefold: 1) There exists some convergence between machine learning and biological learning. Although there exist many different types of techniques for machine learning and many different types of learning mechanisms in brains, Extreme Learning Machines (ELM), as a common learning mechanism, may fill the gap between machine learning and biological learning. In fact, ELM theories have been validated by more and more direct biological evidence recently. ELM theories point out that the secret learning capability of brains may be due to a globally ordered and structured mechanism with locally random individual neurons, and such a learning system happens to have regression, classification, sparse coding, clustering, compression and feature learning capabilities, which are fundamental to cognition and reasoning; 2) The single hidden layer of ELM unifies Support Vector Machines (SVM), Principal Component Analysis (PCA), and Non-negative Matrix Factorization (NMF); 3) ELM provides some theoretical support for the universal approximation and classification capabilities of Convolutional Neural Networks (CNN). In addition to its good performance on small to medium datasets, hierarchical ELM is catching up with Deep Learning on some big benchmark datasets on which Deep Learning used to perform well.
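The basic ELM recipe is short enough to state in code: random, untrained hidden weights plus a ridge-regularized least-squares fit of the output layer. The sketch below is a generic textbook version with illustrative sizes and data, not the hierarchical ELM of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

class ELM:
    def __init__(self, n_in, n_hidden, reg=1e-6):
        # Hidden layer is random and fixed: never trained.
        self.W = rng.standard_normal((n_in, n_hidden))
        self.b = rng.standard_normal(n_hidden)
        self.reg = reg

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        H = self._hidden(X)
        # beta = (H^T H + reg I)^{-1} H^T y  -- the only "training" step
        self.beta = np.linalg.solve(H.T @ H + self.reg * np.eye(H.shape[1]),
                                    H.T @ y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

# Regression toy: learn y = sin(3x) from noisy samples.
X = rng.uniform(-1, 1, (500, 1))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(500)
model = ELM(1, 200).fit(X, y)
Xt = np.linspace(-1, 1, 9)[:, None]
print(np.c_[np.sin(3 * Xt[:, 0]), model.predict(Xt)].round(3))
```

Because only the linear output layer is fit, training collapses to a single linear solve, which is the source of ELM's speed on small to medium datasets.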
On solving elliptic interface problems with fourth order accuracy and FFT efficiency
- Shan Zhao, University of Alabama
- Time: 8:30pm – 9:30pm, Thursday, Nov 05, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/92211837073
- Abstract: In this talk, we will introduce an augmented matched interface and boundary (AMIB) method for solving elliptic interface and boundary value problems. The AMIB method provides a uniform fictitious domain approach to correct the fourth-order central difference near interfaces and boundaries. In treating a smoothly curved interface, zeroth and first order jump conditions are enforced repeatedly to generate fictitious values surrounding the interface. Different types of boundary conditions, including Dirichlet, Neumann, Robin and their mixed combinations, can be imposed to generate fictitious values outside boundaries. Based on fictitious values at interfaces and boundaries, the AMIB method reconstructs Cartesian derivative jumps as additional unknowns and forms an enlarged linear system. In the Schur complement solution of such a system, the FFT algorithm does not sense the solution discontinuities, so the discrete Laplacian can be inverted efficiently. In our 2D tests, the FFT-based AMIB not only achieves fourth-order convergence in dealing with interfaces and boundaries, but also has an overall complexity of O(n^2 log n) for an n-by-n uniform grid. Moreover, the AMIB method can provide fourth-order accurate approximations to solution gradients and fluxes.
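The FFT kernel at the heart of such methods is the classical fast Poisson solver. The sketch below solves the 5-point discrete Poisson equation with homogeneous Dirichlet conditions via the type-I discrete sine transform (SciPy's scipy.fft). This is only the O(n^2 log n) inner solver, second-order accurate on its own; the AMIB corrections that lift it to fourth order and handle interfaces are not shown.

```python
import numpy as np
from scipy.fft import dstn, idstn

# Solve  -Lap_h u = f  on the unit square, homogeneous Dirichlet BCs,
# by diagonalizing the FD Laplacian with the DST-I in each direction.
n = 127                                   # interior grid points per direction
h = 1.0 / (n + 1)
x = h * np.arange(1, n + 1)
X, Y = np.meshgrid(x, x, indexing="ij")

u_exact = np.sin(np.pi * X) * np.sin(2 * np.pi * Y)
f = (np.pi**2 + 4 * np.pi**2) * u_exact   # f = -Lap(u_exact)

# Eigenvalues of the 1D Dirichlet FD Laplacian: (2 - 2 cos(k*pi*h)) / h^2
k = np.arange(1, n + 1)
lam = (2 - 2 * np.cos(k * np.pi * h)) / h**2
denom = lam[:, None] + lam[None, :]

f_hat = dstn(f, type=1)                   # forward transform
u = idstn(f_hat / denom, type=1)          # divide by eigenvalues, invert

print("max error:", np.max(np.abs(u - u_exact)))  # O(h^2) for this kernel
```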
Flow dynamic approach and Lagrange multiplier approach for gradient flow
- Qing Cheng, Illinois Institute of Technology
- Time: 8:30pm – 9:30pm, Thursday, July 30, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this talk, I will introduce a new Lagrangian approach, the flow dynamic approach, to effectively capture the interface in phase field models. Its main advantage, compared with numerical methods in Eulerian coordinates, is that thin interfaces can be captured effectively with few points in Lagrangian coordinates. I will also introduce the SAV and Lagrange multiplier approaches, which preserve energy dissipation and physical constraints for gradient systems at the discrete level. These methods only require solving linear equations with constant coefficients at each time step, plus an additional nonlinear algebraic system that can be solved at negligible cost. Ample numerical results for phase field models are presented to validate the effectiveness and accuracy of the proposed numerical schemes.
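To make the "constant-coefficient linear solve plus cheap scalar update" structure concrete, here is a minimal first-order SAV sketch for a 1D Allen–Cahn gradient flow with periodic boundary conditions; the splitting constant, step sizes, and initial data are illustrative, and the Lagrange multiplier variant of the talk is not shown.

```python
import numpy as np

# Allen-Cahn gradient flow u_t = eps^2 u_xx - f(u), f = F', F = (u^2-1)^2/4.
# SAV introduces the scalar r = sqrt(E1(u) + C), E1(u) = int F(u) dx, so the
# implicit part is a fixed constant-coefficient operator I - dt*eps^2*d_xx.
N, eps, dt, T, C = 256, 0.1, 0.01, 2.0, 1.0
x = 2 * np.pi * np.arange(N) / N
h = 2 * np.pi / N
k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers on [0, 2*pi)
Asym = 1 + dt * eps**2 * k**2             # symbol of I - dt*eps^2*d_xx

solveA = lambda v: np.real(np.fft.ifft(np.fft.fft(v) / Asym))
ip = lambda f, g: h * np.sum(f * g)       # discrete L2 inner product
E1 = lambda u: h * np.sum(0.25 * (u**2 - 1) ** 2)

u = 0.1 * np.cos(x) + 0.05 * np.sin(4 * x)
r = np.sqrt(E1(u) + C)                    # scalar auxiliary variable

for _ in range(int(T / dt)):
    b = (u**3 - u) / np.sqrt(E1(u) + C)
    # Eliminating r^{n+1} gives u^{n+1} = p - (dt/2)(b, u^{n+1}) q, with
    # two constant-coefficient solves and one scalar equation:
    q = solveA(b)
    p = solveA(u - dt * b * (r - 0.5 * ip(b, u)))
    bu_new = ip(b, p) / (1 + 0.5 * dt * ip(b, q))
    u_new = p - 0.5 * dt * bu_new * q
    r = r + 0.5 * ip(b, u_new - u)        # discrete auxiliary-variable update
    u = u_new

print("final E1(u):", E1(u))              # modified energy decays in n
```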
Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications
- Guang Lin, Purdue University
- Time: 8:30pm – 9:30pm, Thursday, July 23, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: Deep neural networks have been successfully employed in an extensive variety of research areas, including solving partial differential equations. Despite this significant success, there are some challenges in effectively training DNNs, such as avoiding over-fitting in over-parameterized DNNs and accelerating optimization in DNNs with pathological curvature. In this work, we propose a Bayesian-type sparse deep learning algorithm. The algorithm utilizes a set of spike-and-slab priors for the parameters of the deep neural network. The hierarchical Bayesian mixture is trained using an adaptive empirical method: one alternately samples from the posterior using an appropriate stochastic gradient Markov chain Monte Carlo method (SG-MCMC) and optimizes the latent variables using stochastic approximation. Sparsity of the network is achieved while optimizing the hyperparameters with adaptive searching and penalizing. A popular SG-MCMC approach is stochastic gradient Langevin dynamics (SGLD). However, considering the complex geometry of the model parameter space in non-convex learning, updating parameters with a universal step size in each component, as in SGLD, may cause slow mixing. To address this issue, we apply a computationally manageable preconditioner in the updating rule, which provides step sizes adapted to local geometric properties. Moreover, by smoothly optimizing the hyperparameter in the preconditioning matrix, our proposed algorithm ensures a decreasing bias, which is introduced by ignoring the correction term in preconditioned SGLD. Within the existing theoretical framework, we show that the proposed method can asymptotically converge to the correct distribution with a controllable bias under mild conditions. Numerical tests are performed on both synthetic regression problems and learning the solutions of elliptic PDEs, demonstrating the accuracy and efficiency of the present work.
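The preconditioned update itself is short. The sketch below implements an RMSprop-style diagonal preconditioner in SGLD (dropping the correction term, which is exactly the source of the bias discussed above) on a toy Gaussian target rather than a Bayesian neural network; all hyperparameters are illustrative.

```python
import numpy as np

# pSGLD: theta <- theta + (eps/2) G grad log p(theta) + N(0, eps G),
# with diagonal G = 1/(sqrt(V) + lam) from an RMSprop second-moment
# estimate.  The correction (Gamma) term is omitted, introducing a
# small, controllable bias, as discussed in the abstract.
rng = np.random.default_rng(0)
d = 5
prec_true = np.linspace(1.0, 50.0, d)     # badly scaled Gaussian target

def grad_log_post(theta):
    # log p(theta) = -0.5 * sum(prec_true * theta^2) + const
    return -prec_true * theta             # full gradient; stochastic in practice

theta = np.ones(d)
V = np.zeros(d)
eps, alpha, lam = 1e-3, 0.99, 1e-5
samples = []
for t in range(20_000):
    g = grad_log_post(theta)
    V = alpha * V + (1 - alpha) * g * g   # second-moment estimate
    G = 1.0 / (np.sqrt(V) + lam)          # diagonal preconditioner
    noise = rng.standard_normal(d) * np.sqrt(eps * G)
    theta = theta + 0.5 * eps * G * g + noise
    if t > 5000:                          # discard burn-in
        samples.append(theta.copy())

print("target var:", (1 / prec_true).round(3))
print("pSGLD  var:", np.var(np.array(samples), axis=0).round(3))
```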
Sparse Machine Learning in a Banach Space
- Yuesheng Xu, Old Dominion University
- Time: 8:30 pm – 9:30 pm, Tuesday, July 21, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this talk, we report recent developments in kernel-based machine learning. We first review a basic classical problem in machine learning, classification, through which we introduce kernel-based machine learning methods. We discuss two fundamental problems in kernel-based machine learning: representer theorems and kernel universality. We then elaborate on recent exciting advances in sparse learning. In particular, we discuss the notion of reproducing kernel Banach spaces and learning in a Banach space.
Reduced-order Deep Learning for Flow Dynamics: The Interplay between Deep Learning and Model Reduction
- Min Wang, Duke University
- Time: 8:30pm – 9:30pm, Thursday, July 16, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this work, we investigate neural networks applied to multiscale simulations of porous media flows, taking into account observed fine-scale data and physical modeling concepts. In addition, the design of a novel deep neural network model reduction approach for multiscale problems is discussed. Our approaches use deep learning techniques combined with local multiscale model reduction methodologies to predict flow dynamics. Constructing deep learning architectures using reduced-order models can improve robustness, since such a model has fewer degrees of freedom. Moreover, numerical results show that using deep learning with data generated from multiscale models, as well as available observed fine-scale data, we can obtain an improved forward map which better approximates the fine-scale model. More precisely, the solution (e.g., pressures and saturation) at time instant n+1 is approximated by a neural network taking as inputs the solution at time instant n and parameters such as permeability fields, forcing terms, and initial conditions. We further study the features of the coarse-grid solutions that neural networks capture by relating the input-output optimization to $l_1$ minimization of flow solutions. In the proposed multi-layer networks, we can learn the forward operators in a reduced way without computing them as in POD-like approaches. We use soft thresholding operators as activation functions, which promote sparsity and can be further utilized to find underlying low-rank structures of the data. With these activation functions, the neural network identifies and selects important multiscale features which are crucial in modeling the underlying flow. Using the trained neural network approximation of the input-output map, we construct a reduced-order model.
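The soft-thresholding activation mentioned above has a one-line definition as the proximal operator of the l1 norm; each such layer performs one shrinkage step and drives small coefficients to exactly zero. The learnable threshold and layer sizes in this sketch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SoftThreshold(nn.Module):
    def __init__(self, init_lam=0.1):
        super().__init__()
        self.lam = nn.Parameter(torch.tensor(init_lam))  # learnable threshold

    def forward(self, x):
        # sign(x) * max(|x| - lam, 0): the prox operator of lam*||.||_1
        return torch.sign(x) * torch.relu(torch.abs(x) - self.lam)

net = nn.Sequential(nn.Linear(64, 128), SoftThreshold(), nn.Linear(128, 64))
x = torch.randn(32, 64)
hidden = net[1](net[0](x))                 # features after shrinkage
print("fraction of exact zeros in hidden features:",
      (hidden == 0).float().mean().item())
```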
Computational Redundancy in Deep Neural Networks
- Gao Huang, Tsinghua University
- Time: 8:30pm – 9:30pm, Thursday, June 25, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: Deep learning has gained great popularity in computer vision, natural language processing, robotics, etc. However, deep models are often criticized as being cumbersome and energy inefficient. This talk will first demonstrate that deep networks are overparameterized models – although they have millions (or billions) of parameters, they may not use them effectively. High redundancy seems to be helpful in terms of generalization, but it introduces a high computational burden in real systems. This talk will introduce algorithms and architecture innovations that help us understand the redundancy in deep models and eventually reduce unnecessary computation for efficient deep learning.
Understanding Deep Learning via Analyzing Trajectories of Gradient Descent
- Wei Hu, Princeton University
- Time: 8:30pm – 9:30pm, Thursday, June 18, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: Deep learning builds upon the mysterious abilities of gradient-based optimization algorithms. Not only can these algorithms often achieve low loss on complicated non-convex training objectives, but the solutions found can also generalize remarkably well on unseen test data. Towards explaining these mysteries, I will present some recent results that take into account the trajectories taken by the gradient descent algorithm — the trajectories turn out to exhibit special properties that enable the successes of optimization and generalization.
Statistical Method for Selecting Best Treatment with High-dimensional Covariates
- Xiaohua Zhou, Peking University
- Time: 8:30pm – 9:30pm, Thursday, June 11, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this talk, I will introduce a new semi-parametric modeling method for heterogeneous treatment effect estimation and individualized treatment selection using a covariate-specific treatment effect (CSTE) curve with high-dimensional covariates. The proposed method is quite flexible in depicting both local and global associations between the treatment and baseline covariates, and is thus robust against model mis-specification in the presence of high-dimensional covariates. We also establish the theoretical properties of the proposed procedure. I will further illustrate the performance of the proposed method through simulation studies and the analysis of a real data example. This is joint work with Drs. Guo and Ma at the University of California, Riverside.
Homotopy training algorithm for neural networks and applications in solving nonlinear PDE
- Wenrui Hao, Pennsylvania State University
- Time: 8:30pm – 9:30pm, Thursday, May 21, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this talk, I will introduce two different topics related to neural networks. The first is a homotopy training algorithm designed to solve the nonlinear optimization problem of machine learning by building the neural network adaptively. The second is a randomized Newton's method used to solve the nonlinear systems arising from the neural network discretization of differential equations. Several examples are used to demonstrate the feasibility and efficiency of the two proposed methods.
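The homotopy idea behind the first topic can be sketched generically: minimize a blend (1-t)*L_easy + t*L_hard while continuing t from 0 to 1, warm-starting each stage from the previous minimizer. The surrogate loss, model, and schedule below are illustrative stand-ins, not the adaptive network construction of the talk.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y_hard = torch.sin(6 * x)                 # oscillatory target (hard)
y_easy = x                                # smooth surrogate target (easy)

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
mse = nn.MSELoss()

for t in torch.linspace(0, 1, 6):         # continuation parameter t: 0 -> 1
    for _ in range(500):                  # warm-started inner optimization
        opt.zero_grad()
        pred = net(x)
        loss = (1 - t) * mse(pred, y_easy) + t * mse(pred, y_hard)
        loss.backward()
        opt.step()
    print(f"t={t.item():.1f}: blended loss = {loss.item():.4f}")
```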
From ODE solvers to accelerated first-order methods for convex optimization
- Long Chen, University of California, Irvine
- Time: 8:30pm – 9:30pm, Thursday, May 14, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: Convergence analyses of accelerated first-order methods for convex optimization problems are presented from the point of view of ordinary differential equation (ODE) solvers. We first take another look at the acceleration phenomenon via A-stability theory for ODE solvers and present a revealing spectrum analysis for quadratic programming. After that, we present the Lyapunov framework for dynamical systems and introduce the strong Lyapunov condition. Many existing continuous convex optimization models, such as gradient flow, the heavy ball system, Nesterov accelerated gradient flow, and the dynamical inertial Newton system, are addressed and analyzed in this framework. We then present convergence analyses of optimization algorithms obtained from implicit or explicit methods for the underlying dynamical systems. This is joint work with Hao Luo from Sichuan University.
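A small experiment shows the ODE viewpoint in action: explicit Euler on the gradient flow x' = -grad f(x) is plain gradient descent, while the extrapolation step that discretizes the accelerated flow x'' + (3/t) x' + grad f(x) = 0 recovers Nesterov's method. The quadratic test problem and step size below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
Q = rng.standard_normal((n, n))
A = Q.T @ Q / n + 1e-3 * np.eye(n)        # ill-conditioned SPD Hessian
grad = lambda x: A @ x
f = lambda x: 0.5 * x @ A @ x
s = 1.0 / np.linalg.eigvalsh(A)[-1]       # step = 1/L (Lipschitz constant)

x_gd = np.ones(n)
x_nag = np.ones(n)
x_prev = x_nag.copy()
for k in range(1, 2001):
    x_gd = x_gd - s * grad(x_gd)          # explicit Euler on gradient flow
    # Momentum/extrapolation from discretizing the accelerated flow:
    y = x_nag + (k - 1) / (k + 2) * (x_nag - x_prev)
    x_prev = x_nag
    x_nag = y - s * grad(y)               # Nesterov accelerated gradient

print(f"f(GD)  = {f(x_gd):.3e}")          # O(1/k) decay
print(f"f(NAG) = {f(x_nag):.3e}")         # O(1/k^2) decay
```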
The Geometry of Functional Spaces of Neural Networks
- Matthew Trager, Courant Institute at NYU
- Time: 8:30pm – 9:30pm, Thursday, May 7, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: The reasons behind the empirical success of neural networks are not well understood. One important characteristic of modern deep learning architectures, compared to other large-scale parametric learning models, is that they identify a class of functions that is not simply linear, but rather has a complex hierarchical structure. Furthermore, neural networks are non-identifiable models, in the sense that different parameters may yield the same function. Both of these aspects come into play significantly when optimizing an empirical risk in classification or regression tasks.
In this talk, I will present some of my recent work that studies the functional space associated with neural networks with linear, polynomial, and ReLU activations, using ideas from algebraic and differential geometry. In particular, I will emphasize the distinction between the intrinsic function space and its parameterization, in order to shed light on the impact of the architecture on the expressivity of a model and on the corresponding optimization landscapes.
Interpreting Deep Learning Models: Flip Points and Homotopy Methods
- Roozbeh Yousefzadeh, Yale University
- Time: 8:30 pm – 9:30 pm, Thursday, April 30, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: This talk concerns methods for studying deep learning models and interpreting their outputs and their functional behavior. A trained model (e.g., a neural network) is a function that maps inputs to outputs. Deep learning has shown great success in performing different machine learning tasks; however, these models are complicated mathematical functions, and their interpretation remains a challenging research question. We formulate and solve optimization problems to answer questions about the model and its outputs. Specifically, we study the decision boundaries of a model using flip points. A flip point is any point that lies on the boundary between two output classes: e.g., for a neural network with a binary yes/no output, a flip point is any input that generates equal scores for “yes” and “no”. The flip point closest to a given input is of particular importance, and this point is the solution to a well-posed optimization problem. To compute the closest flip point, we develop a homotopy algorithm for neural networks that transforms the deep learning function in order to overcome the issues of vanishing and exploding gradients. We show that computing closest flip points allows us to systematically investigate the model, identify decision boundaries, interpret and audit the model with respect to individual inputs and entire datasets, and find vulnerabilities to adversarial attacks. We demonstrate that flip points can help identify mistakes made by a model, improve the model's accuracy, and reveal the most influential features for classifications.
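A penalized relaxation of the closest-flip-point problem fits in a short script: minimize the distance to the input plus a penalty on the class-score gap, which is zero exactly on the decision boundary. The sketch below uses a plain quadratic penalty and Adam on an untrained toy network, not the constrained homotopy solver of the talk; model, penalty weight, and data are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 2))

x0 = torch.randn(10)                       # the input to be explained
x = x0.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=1e-2)
rho = 100.0                                # penalty weight on the score gap

for _ in range(2000):
    opt.zero_grad()
    z = net(x)
    gap = z[0] - z[1]                      # zero exactly on the boundary
    loss = torch.sum((x - x0) ** 2) + rho * gap**2
    loss.backward()
    opt.step()

print("score gap at flip point:", (net(x)[0] - net(x)[1]).item())
print("distance to boundary:   ", torch.norm(x - x0).item())
```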
Partial Differential Equation Principled Trustworthy Deep Learning
- Bao Wang, University of California, Los Angeles
- Time: 9:00 am – 10:30 am, Friday, April 24, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: This talk contains two parts: In the first part, I will present some recent work on developing partial differential equation principled robust neural architecture and optimization algorithms for robust, accurate, private, and efficient deep learning. In the second part, I will discuss some recent progress on leveraging Nesterov accelerated gradient style momentum for accelerating deep learning, which again involves designing stochastic optimization algorithms and mathematically principled neural architecture.
Topological Data Analysis (TDA) Based Machine Learning Models for Drug Design
- Kelin Xia, Nanyang Technological University
- Time: 9:00 am – 10:30 am, Friday, April 17, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: In this talk, I will discuss topological data analysis (TDA) and its application to biomolecular data analysis, in particular drug design. Persistent homology, one of the most important tools in TDA, is used in the identification, classification and analysis of biomolecular structure, flexibility, dynamics, and function. Properties from persistent homology analysis are used as features for learning models. Unlike previous biomolecular descriptors, topological features from TDA provide a balance between structural complexity and data simplification. TDA-based learning models have consistently delivered the best results in various aspects of drug design, including protein-ligand binding affinity prediction, solubility prediction, prediction of protein stability change upon mutation, toxicity prediction, solvation free energy prediction, partition coefficient and aqueous solubility prediction, binding pocket detection, and drug discovery. Further, I will discuss our recently proposed persistent spectral based machine learning (PerSpect ML) models. Different from all previous spectral models, a filtration process is introduced to generate multiscale spectral models. Persistent spectral variables are defined as functions of spectral variables over the filtration value. We test our models on the most commonly used databases, including PDBbind-2007, PDBbind-2013, and PDBbind-2016. Our results are better than those of all existing models, for all these databases, as far as we know. This demonstrates the great power of our PerSpect ML in molecular data analysis and drug design.
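For concreteness, here is a small persistent-homology feature pipeline, assuming the ripser Python package is installed: compute diagrams for a noisy circle and reduce each diagram to a few scalar features. The point cloud and the features are illustrative stand-ins for the biomolecular descriptors of the talk.

```python
import numpy as np
from ripser import ripser   # assumes the `ripser` package is installed

# A noisy circle: its H1 diagram should show one long-lived loop.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 100)
cloud = (np.c_[np.cos(theta), np.sin(theta)]
         + 0.05 * rng.standard_normal((100, 2)))

dgms = ripser(cloud, maxdim=1)["dgms"]    # persistence diagrams for H0, H1

# Crude vectorized features per homology dimension: bar count, total and
# max persistence -- stand-ins for the richer descriptors in the talk.
for dim, dgm in enumerate(dgms):
    finite = np.isfinite(dgm[:, 1])
    pers = dgm[finite, 1] - dgm[finite, 0]
    print(f"H{dim}: bars={len(dgm)}, total persistence={pers.sum():.3f}, "
          f"max persistence={pers.max():.3f}")
```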
Learning Dynamical Systems from Data
- Tong Qin, Ohio State University
- Time: 9:00 am – 10:30 am, Friday, April 10, 2020 (ET).
- Online Zoom Link: https://psu.zoom.us/j/560094163
- Abstract: The recent advancement of computing resources and the accumulation of gigantic amounts of data have given rise to explosive growth in the development of data-driven modeling, which is becoming as important as traditional first-principles-based modeling. In this talk, I will introduce our recent work on designing artificial neural networks to approximate the governing equations in equation-free, input-output mapping form. The dynamical system can be an unknown physical model or a semi-discretized PDE. Based on the one-step integral form of the ODEs, we propose to use the residual network (ResNet) as the basic building block for dynamics recovery. The ResNet block can be considered an exact one-step integration for autonomous ODE systems. This framework is further generalized to recover systems with stochastic and time-dependent inputs. In the special situation where the data come from a Hamiltonian system, recent progress on learning the Hamiltonian from data will be discussed. This approach not only recovers the hidden Hamiltonian system, but also preserves the Hamiltonian, which in many physical applications is the total energy, along the solution trajectory.
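The ResNet-as-integrator idea has a compact PyTorch sketch: learn the one-step flow map u_{n+1} = u_n + N(u_n) from observed state pairs of a hidden ODE, then roll the learned map forward. The damped pendulum used to generate data, the network sizes, and the training details are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def true_step(u, dt=0.1):                 # hidden data-generating ODE step
    th, w = u[:, 0], u[:, 1]              # damped pendulum, forward Euler
    return torch.stack([th + dt * w,
                        w + dt * (-torch.sin(th) - 0.1 * w)], dim=1)

class ResNetStep(nn.Module):
    def __init__(self, dim=2, width=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, width), nn.Tanh(),
                                 nn.Linear(width, dim))

    def forward(self, u):
        return u + self.net(u)            # identity + residual = integrator

model = ResNetStep()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(3000):
    u = 2 * torch.rand(256, 2) - 1        # sampled states and their successors
    loss = nn.functional.mse_loss(model(u), true_step(u))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Predict a trajectory by composing the learned one-step map.
u = torch.tensor([[1.0, 0.0]])
for _ in range(50):
    u = model(u)
print("state after 50 learned steps:", u.detach().numpy())
```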