The Mathematics of Data Science Seminar usually meets on Tuesdays, 11:30-12:30, in MW 154. The schedule is below. If you would like to speak in the seminar, please contact Maria Han Veiga (hanveiga.1@osu.edu) or Vladimir Kobzar (kobzar.1@osu.edu).
| Date | Speaker | Title |
| --- | --- | --- |
| Fall 2024 | | |
| September 10, 11:30 | Bernardo Modenesi (University of Michigan) | Unveiling Hidden Patterns in Agent Behavior with Discrete-Choice and Network Theory |
| October 17, 16:00 (Thursday) | Tim Kunisky (Johns Hopkins University) | Spectral pseudorandomness, free probability, and the clique number of the Paley graph |
| November 19, 11:30 | Thomas O'Leary-Roseberry (UT Austin) | Efficient Infinite-Dimensional Bayesian Inversion using Derivative-Informed Neural Operators |
| Spring 2025 | | |
| March 25, 11:30 | Nguyen-Truc-Dao Nguyen (San Diego State University) | Optimization of Model Predictive Control using iLQR and Neural Network |
| April 17, 11:30 | Paul Laiu (Oak Ridge National Lab) | Accelerated and communication-efficient federated learning algorithms |
| April 22, 11:30 | Vladimir Kobzar (Ohio State University) | Fourier-Based Bounds for Wasserstein Distances and Their Applications in Data-Driven Problems |
Title: Unveiling Hidden Patterns in Agent Behavior with Discrete-Choice and Network Theory
Abstract: Many datasets in data science stem from agents making repeated choices over time, with each choice leading to an observable outcome. In this setup, we introduce a novel approach to uncover latent agent heterogeneity, enhancing our understanding of agent behavior and improving causal inference estimation. By combining discrete choice models with network theory, we develop a method to measure agent similarity based on their patterns of choice. This results in a network-based unsupervised clustering technique that groups agents with similar behaviors, offering an interpretable alternative to black-box machine learning clustering models, with explicit estimation assumptions. In this seminar, I will illustrate our approach using labor market data, where workers (agents) and jobs (choices) are represented as nodes in a bipartite network, with edges denoting worker-job matches. By clustering workers based on their job choices, we can infer workers' unobserved skills, an important factor in economic analysis. Through Bayesian estimation, we reveal latent groups of similar workers, which are used to make more accurate predictions of labor market outcomes and measurements of labor market discrimination, compared to models relying only on observable characteristics. This seminar will detail our methodological framework, estimation strategy, and practical applications for understanding and predicting agent-choice dynamics.
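As a toy illustration of the underlying idea (not the speaker's Bayesian network method), one can measure similarity between agents by how much their choice sets overlap and then group agents greedily. The sketch below, with made-up worker and job labels, does exactly that using Jaccard similarity on a small bipartite worker-job dataset.

```python
def jaccard(a, b):
    """Similarity of two choice sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Each worker's set of matched jobs (the edges of the bipartite network).
# All labels here are hypothetical.
choices = {
    "w1": {"nurse", "caregiver"},
    "w2": {"nurse", "caregiver", "clerk"},
    "w3": {"welder", "machinist"},
    "w4": {"welder", "machinist", "driver"},
}

def cluster(choices, threshold=0.4):
    """Greedy single-link clustering: join a worker to the first group
    containing someone whose choice pattern is similar enough."""
    groups = []
    for w, jobs in choices.items():
        for g in groups:
            if any(jaccard(jobs, choices[v]) >= threshold for v in g):
                g.add(w)
                break
        else:
            groups.append({w})
    return groups

print(cluster(choices))  # two groups: {w1, w2} and {w3, w4}
```

The threshold plays the role that explicit estimation assumptions play in the speaker's framework: it makes the grouping criterion fully inspectable.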
Title: Spectral pseudorandomness, free probability, and the clique number of the Paley graph
Abstract: The Paley graph is a classical number-theoretic construction of a graph that is believed to behave "pseudorandomly" in many regards. Accurately bounding the clique number of the Paley graph is a long-standing open problem in number theory, with applications to several other questions about the statistics of finite fields. I will present a new approach to this problem, which also opens up intriguing connections with random matrix theory and free probability. In particular, I will show that certain deterministic induced subgraphs of the Paley graph have the same limiting spectrum as induced subgraphs on random subsets of vertices of the same size. I will discuss how this phenomenon arises as a consequence of asymptotic freeness (in the sense of free probability) of certain matrices associated with the Paley graph. I will then present conjectures describing a stronger analogy between random and pseudorandom deterministic induced subgraphs that would lead to clique number bounds improving on the state of the art. On the way, I will describe new techniques for understanding the eigenvalue statistics of more general random or pseudorandom submatrices of certain structured matrices, such as those associated with incoherent tight frames, and will mention how these helped to resolve a recent conjecture of Haikin, Zamir, and Gavish in frame theory.
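For readers unfamiliar with the object of study: the Paley graph on the field of prime order q (with q congruent to 1 mod 4) joins x and y exactly when x - y is a nonzero quadratic residue, and for small q its clique number can be checked by brute force. The sketch below is a generic illustration of the construction, not of the talk's techniques.

```python
from itertools import combinations

def paley_graph(q):
    """Adjacency sets of the Paley graph on F_q (q prime, q = 1 mod 4):
    x ~ y iff x - y is a nonzero quadratic residue mod q."""
    residues = {(x * x) % q for x in range(1, q)}
    return {v: {w for w in range(q) if w != v and (v - w) % q in residues}
            for v in range(q)}

def clique_number(adj):
    """Brute-force search for the largest clique (fine for small q)."""
    n = len(adj)
    for k in range(n, 0, -1):
        for c in combinations(range(n), k):
            if all(b in adj[a] for a, b in combinations(c, 2)):
                return k
    return 0

print(clique_number(paley_graph(13)))
```

Already at q = 13 the clique number (3) is far below the trivial sqrt(q)-type upper bound, which is the kind of gap the talk's spectral approach aims to explain.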
Title: Efficient Infinite-Dimensional Bayesian Inversion using Derivative-Informed Neural Operators
Abstract: We address Bayesian inverse problems (BIPs) for functions that parametrize PDE models, such as heterogeneous coefficient fields. These problems are challenging due to high-dimensional parameter spaces upon discretization, the computational demands of PDE-based likelihood evaluations, and the complex geometry of the posterior (e.g., concentration and non-linear multi-modality). While efficient algorithms for infinite-dimensional BIPs leverage posterior geometry through likelihood derivatives, they become impractical when the computational model (and its adjoints) is expensive to compute. In this talk, we introduce derivative-informed neural operators (DINOs), a class of neural operator surrogates trained to accurately approximate both PDE mappings and their derivatives (Fréchet derivatives). DINOs offer superior generalization over traditional neural operators, enabling precise derivative use in high-dimensional uncertainty quantification (UQ) tasks like BIPs. Specifically, reduced-basis DINOs are a natural choice for algorithms that employ subspace decompositions. In the context of BIPs, these algorithms allow for efficient inversion in a likelihood-dominated subspace, with the prior completing the orthogonal component. We will explore these algorithms within the frameworks of geometric MCMC and transport map variational inference. Our numerical results show that DINO-accelerated algorithms achieve state-of-the-art cost-accuracy performance compared to various other methods.
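A minimal scalar analogue of derivative-informed training, assuming nothing about the actual DINO architecture: fit a surrogate by gradient descent on a loss that penalizes both the value mismatch and the derivative mismatch of the map being approximated.

```python
import math

# Scalar toy problem (generic illustration, not the DINO method): fit a
# linear surrogate s(x) = a*x + b to f(x) = exp(x) near x = 0, with a loss
# that penalizes value AND derivative mismatch at each sample.

f = math.exp           # stand-in for an expensive PDE map
df = math.exp          # its derivative

xs = [-0.1, 0.0, 0.1]  # hypothetical training samples
a, b = 0.0, 0.0        # surrogate parameters
lr = 0.1               # gradient-descent step size

for _ in range(2000):
    grad_a = grad_b = 0.0
    for x in xs:
        rv = (a * x + b) - f(x)   # value residual
        rd = a - df(x)            # derivative residual (s'(x) = a)
        grad_a += 2 * rv * x + 2 * rd
        grad_b += 2 * rv
    a -= lr * grad_a / len(xs)
    b -= lr * grad_b / len(xs)

# The derivative term pulls a toward f'(0) = 1; both parameters end near 1.
print(round(a, 3), round(b, 3))
```

In the infinite-dimensional setting the derivative residual involves the Fréchet derivative of the PDE map, but the structure of the training objective is the same: one term per output, one per derivative.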
Title: Optimization of Model Predictive Control using iLQR and Neural Network
Abstract: The talk is devoted to using the method of discrete approximation to derive necessary optimality conditions in model predictive control problems for fully controlled constrained sweeping processes, with applications to several practical dynamical models.
Title: Accelerated and communication-efficient federated learning algorithms
Abstract: Federated learning (FL) is a distributed machine learning paradigm that enables multiple local devices, i.e., clients, and a central server to collaboratively train a machine learning model from local data distributed on clients without moving the data. Common challenges in FL include data privacy preservation, communication efficiency, and convergence guarantees. In this talk, we present two new FL algorithms, FedOSAA and FeDLRT. FedOSAA leverages a widely used acceleration scheme, Anderson acceleration, to accelerate the convergence of standard FL schemes with minimal additional communication costs. FeDLRT adopts a dynamical low-rank training scheme to constrain the model training on low-rank factors of model parameters, resulting in savings in both the communication and memory costs of the training process. By incorporating a variance reduction scheme, we establish convergence guarantees for both FL algorithms, even when the data distribution is heterogeneous (non-IID) among the local clients. The advantages of these two FL algorithms are demonstrated in numerical tests on machine learning problems ranging from logistic regression to training transformer models in an FL setting.
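Anderson acceleration itself is a standard fixed-point acceleration scheme. The depth-one scalar sketch below (a generic illustration, not the FedOSAA algorithm) shows the basic update, which mixes the two most recent iterates using their residuals.

```python
import math

# Depth-one Anderson acceleration for a scalar fixed-point iteration
# x = g(x): choose the mixing weight that minimizes the combined residual.

def anderson_m1(g, x0, iters=20):
    x_prev = x0
    x = g(x0)
    f_prev = x - x_prev                 # residual g(x_prev) - x_prev
    for _ in range(iters):
        gx = g(x)
        f = gx - x                      # current residual
        denom = f - f_prev
        if abs(denom) < 1e-14:          # residuals indistinguishable: done
            return gx
        gamma = f / denom               # scalar least-squares mixing weight
        # Mix the two most recent g-values; note g(x_prev) = x_prev + f_prev.
        x_new = gx - gamma * (gx - (x_prev + f_prev))
        x_prev, f_prev, x = x, f, x_new
    return x

# Fixed point of cos: the Dottie number, about 0.739085.
print(anderson_m1(math.cos, 1.0))
```

In the scalar case this reduces to a secant method on the residual, which is why it converges far faster than the plain iteration x_{k+1} = g(x_k); in FL the same idea is applied to the vector of model parameters with a short history window.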
Title: Fourier-Based Bounds for Wasserstein Distances and Their Applications in Data-Driven Problems
Abstract: Machine learning, inverse problems and other data-driven problems entail fitting a mathematical model to data. These problems are often solved numerically, by minimizing the mismatch between the model and the data using an appropriate metric. We focus on the case when this metric is the Wasserstein-p (W_p) distance between probability measures, as well as its generalizations by Piccoli et al. to unbalanced measures, including the Kantorovich-Rubinstein norm. The recent work of Niles-Weed and Berthet established, among other results, that W_p is bounded from below and above by weighted \ell_p norms of the wavelet coefficients of the mismatch, relying on the fluid dynamics formulation of W_p. Building on this research, we establish lower and upper bounds on W_p on the hypercube and flat torus in terms of weighted \ell_q norms of the Fourier coefficients of the mismatch. In this setting, for measures uniformly bounded above, the lower bound increases as p increases. Based on that fact, in our setting, the lower bound resolves the open problem posed by Steinerberger to prove the existence of a Fourier-based lower bound on W_p that grows with p. When W_p is used as the mismatch metric to fit a mathematical model to data, these bounds allow us to analyze the effects of stopping the computational minimization of the mismatch early on the resolution of frequencies, and the dependence of that resolution on p. This is joint work with Wanli Hong at NYU and Kui Ren at Columbia.
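As a one-dimensional sanity check on Wasserstein distances (a standard fact, unrelated to the Fourier bounds of the talk): on the line, W_1 equals the L1 distance between the cumulative distribution functions, which makes it computable in one pass for discrete measures.

```python
# W_1 between two distributions supported on the points 0, 1, ..., n-1,
# given as probability vectors of equal length, via the CDF formula
# W_1(p, q) = sum_i |F(i) - G(i)| (unit spacing between support points).

def w1_discrete(p, q):
    assert abs(sum(p) - 1) < 1e-12 and abs(sum(q) - 1) < 1e-12
    dist, cdf_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi          # running F(x) - G(x)
        dist += abs(cdf_gap)        # integrate |F - G| over unit intervals
    return dist

# Moving all mass one step to the right costs exactly 1.
print(w1_discrete([1, 0, 0], [0, 1, 0]))  # 1.0
```

For p > 1, and in dimension greater than one, no such closed form exists, which is precisely why computable Fourier-based lower and upper bounds of the kind discussed in the talk are useful.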