The Mathematics of Data Science Seminar usually meets on **Tuesdays, 11:30-12:30**, in **MW 154**. The schedule is below. If you would like to speak in the seminar, please contact Maria Han Veiga (hanveiga.1@osu.edu) or Vladimir Kobzar (kobzar.1@osu.edu).

| Date | Speaker | Title |
|---|---|---|
| Fall 2024 | | |
| September 10, 11:30 | Bernardo Modenesi (University of Michigan) | Unveiling Hidden Patterns in Agent Behavior with Discrete-Choice and Network Theory |
| October 17, 16:00 (Thursday) | Tim Kunisky (Johns Hopkins University) | Spectral pseudorandomness, free probability, and the clique number of the Paley graph |
| November 17, 11:30 | Thomas O'Leary-Roseberry (UT Austin) | TBC |

- Bernardo Modenesi, September 10.
Title: Unveiling Hidden Patterns in Agent Behavior with Discrete-Choice and Network Theory

Abstract: Many datasets in data science stem from agents making repeated choices over time, with each choice leading to an observable outcome. In this setup, we introduce a novel approach to uncover latent agent heterogeneity, enhancing our understanding of agent behavior and improving causal inference estimation. By combining discrete choice models with network theory, we develop a method to measure agent similarity based on their patterns of choice. This results in a network-based unsupervised clustering technique that groups agents with similar behaviors, offering an interpretable alternative to black-box machine learning clustering models, with explicit estimation assumptions. In this seminar, I will illustrate our approach using labor market data, where workers (agents) and jobs (choices) are represented as nodes in a bipartite network, with edges in this network denoting worker-job matches. By clustering workers based on their job choices, we can infer workers' unobserved skills, an important factor in economic analysis. Through Bayesian estimation, we reveal latent groups of similar workers, which are used to predict labor market outcomes and measure labor market discrimination more accurately than models relying only on observable characteristics. This seminar will detail our methodological framework, estimation strategy, and practical applications for understanding and predicting agent-choice dynamics.

- Tim Kunisky, October 17.
Title: Spectral pseudorandomness, free probability, and the clique number of the Paley graph

Abstract: The Paley graph is a classical number-theoretic construction of a graph that is believed to behave "pseudorandomly" in many regards. Accurately bounding the clique number of the Paley graph is a long-standing open problem in number theory, with applications to several other questions about the statistics of finite fields. I will present a new approach to this problem, which also opens up intriguing connections with random matrix theory and free probability. In particular, I will show that certain deterministic induced subgraphs of the Paley graph have the same limiting spectrum as induced subgraphs on random subsets of vertices of the same size. I will discuss how this phenomenon arises as a consequence of asymptotic freeness (in the sense of free probability) of certain matrices associated with the Paley graph. I will then present conjectures describing a stronger analogy between random and pseudorandom deterministic induced subgraphs that would lead to clique number bounds improving on the state of the art. On the way, I will describe new techniques for understanding the eigenvalue statistics of more general random or pseudorandom submatrices of certain structured matrices like ones associated to incoherent tight frames, and will mention how these helped to resolve a recent conjecture of Haikin, Zamir, and Gavish in frame theory.

- Thomas O'Leary-Roseberry, November 17.
Title:

Abstract: