leiyo@dtu.dk
Technical University of Denmark (DTU)
I received my Ph.D. in Computer Science (specializing in Mathematical Optimization) from the Department of Information Technology at Uppsala University in 2019. During my Ph.D., I interned as a visiting data scientist at The Boston Consulting Group (BCG) Gamma. Afterwards, I worked as a data scientist at Bolt and Wolt (DoorDash) on on-demand logistics optimization. I am a co-founder of https://cspaper.org.
I develop mathematical foundations for trustworthy compound AI systems. My work asks how AI ecosystems remain reliable when components are blocked, replaced, routed, or verification-limited. I study these questions through counterfactual information design: information laws for model replacement, verification bottlenecks, and system-level audit. Methodologically, I combine information theory, counterfactual reasoning, optimization, and AI systems.
(See a full list of publications here)
Modern AI systems are compound systems of models, routers, retrievers, tools, verifiers, memories, and human review. Trustworthiness is therefore not only a property of one model. It is a property of information relations among components.
My research develops information laws for trustworthy compound AI systems, organized around three questions:
What can be replaced?
What must be verified?
What information must remain available for the system to stay trustworthy?
I study AI systems as ecosystems of interacting components. The central question is not only whether one model performs well, but whether its behavior can be reproduced, repaired, or verified by the remaining system under explicit information constraints. This line of work includes matched in-silico quasi-experimental design, DISCO/PIER audits, and Minimum Viable Replacement: the information cost of replacing a blocked model.
L. You ✉, "Quantifying Model Uniqueness in Heterogeneous AI Ecosystems", preprint. [OpenPrint] [code]
L. You ✉, L. Cao, M. Nilsson, B. Zhao, and L. Lei, "Distributional Counterfactual Explanation With Optimal Transport", International Conference on Artificial Intelligence and Statistics (AISTATS) 2025 (Oral, top 2%). [arXiv] [code]
L. You ✉ and H. V. Cheng ✉, "SWAP: Sparse Entropic Wasserstein Regression for Robust Network Pruning", International Conference on Learning Representations (ICLR) 2024. [arXiv] [code]
AI systems can now generate candidates, reviews, claims, code patches, and hypotheses at scale. The bottleneck is verification. This line of work studies how scarce human or automated checks turn untrusted candidate information into reliable knowledge, with applications to peer review, scientific assessment, RAG systems, and agentic workflows. I co-founded CSPaper (https://cspaper.org) as open research infrastructure that supports end-to-end experimentation and evaluation for this agenda.
L. You ✉, "Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference", preprint. [OpenPrint] [code]
L. You ✉, L. Cao, and I. Gurevych, "Preventing the Collapse of Peer Review Requires Verification-First AI", preprint. [OpenPrint]
C. Yu, T. Shi, V. Uotila, S. Deng, L. You, and B. Zhao, "Vista: Verifier-in-the-Loop Agentic RL for Semantic Program Synthesis in Quantum Computing", ACM Conference on AI and Agentic Systems (CAIS) 2026. [link]
This line develops counterfactual explanation methods that respect data geometry, distributional structure, and fairness constraints. It provides the methodological basis for my broader programme: counterfactual claims become scientifically useful only when the admissible transformation and its cost are made explicit.
L. You ✉, Y. Bian, and L. Cao, "Joint Distribution–Informed Shapley Values for Sparse Counterfactual Explanations", International Conference on Learning Representations (ICLR) 2026. [arXiv] [code] [software]
L. Zhu, Y. Bian, and L. You ✉, "FairSHAP: Preprocessing for Fairness Through Attribution-Based Data Augmentation", International Conference on Artificial Intelligence and Statistics (AISTATS) 2026. [OpenPrint] [code]
Y. Gu, L. Cao, B. Zhao, L. Lei, and L. You ✉, "DISCOVER: A Solver for Distributional Counterfactual Explanations", ECML-PKDD 2026. [arXiv] [code]