Our research focuses on machine reasoning and artificial general intelligence, with mathematics as our primary test case for abstract thinking capabilities.
AI for Mathematics
We treat mathematics as the next game at which machines would excel, serving as a test case for machine reasoning capabilities.
Key Focus: Building autonomous mathematician working 24/7.
▶ DeepAlgebra - an outline of a program, 2016▶ Mathematics in the Age of Large Language Models, 2025
▶ Open Mathematical Problems as an AI Reasoning Benchmark, 2026
▶ UlamAI Prover: An Open-Source Lean 4 Theorem Prover and Formalizer, 2026
▶ Lean for Science Formalization, 2026
AI-powered Mathematics
Research level mathematics we have done while testing reasoning capabilities of various LLMs.
Key Focus: Proving research level mathematics with AI. Benchmarking frontier models.
▶ Erdos Problem 1148, full solution (GPT-5.4 Pro), March 2026, formalization in Lean with UlamAI Prover.▶ FrontierMath Open Problem - Diophantine Equations, partial results (Opus 4.6), March 2026
▶ FrontierMath Open Problem - Hadamard 668, partial results (Opus 4.6), Github repository, March 2026
▶ Erdos Problem 1183, partial results (GPT-5.4 Pro), March 2026
▶ Erdos Problem 783, partial results (GPT-5.2), January 2026
Machine Reasoning Theory
Research on reasoning agents to find the next frontier of large language models.
Key Focus: Mathematical frameworks for human-like reasoning and intelligence.
▶ The Geometry of Benchmarks: A New Path Toward AGI, 2025▶ An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence, 2025
▶ Psychometric Tests for AI Agents and Their Moduli Space, 2025
▶ Self-Improving AI Agents through Self-Play, 2025
▶ Mathematics and Coding are Universal AI Benchmarks, 2025
▶ AI Langlands Program for Moduli of Neural Networks, in preparation (on demand)
