Research - Ulam | DeepAlgebra Project & AI Research

Our research focuses on machine reasoning and artificial general intelligence, with mathematics as our primary test case for abstract thinking capabilities.

AI for Mathematics

We treat mathematics as the next game at which machines would excel, serving as a test case for machine reasoning capabilities.

Key Focus: Building autonomous mathematician working 24/7.

▶ DeepAlgebra - an outline of a program, 2016
▶ Mathematics in the Age of Large Language Models, 2025
▶ Open Mathematical Problems as an AI Reasoning Benchmark, 2026
▶ UlamAI Prover: An Open-Source Lean 4 Theorem Prover and Formalizer, 2026
▶ Lean for Science Formalization, 2026

AI-powered Mathematics

Research level mathematics we have done while testing reasoning capabilities of various LLMs.

Key Focus: Proving research level mathematics with AI. Benchmarking frontier models.

▶ Erdos Problem 1148, full solution (GPT-5.4 Pro), March 2026, formalization in Lean with UlamAI Prover.
▶ Erdos Problem 258, full solution (GPT-5.4 Pro), April 2026, formalization in Lean
▶ FrontierMath Open Problem - del Pezzo surfaces in char 3, in revision (multi-agent harness powered by GPT-5.4 Pro), tex source, erratas, March 2026

▶ FrontierMath Open Problem - Diophantine Equations, partial results (Opus 4.6), March 2026
▶ FrontierMath Open Problem - Hadamard 668, partial results (Opus 4.6), Github repository, March 2026
▶ Erdos Problem 1183, partial results (GPT-5.4 Pro), March 2026
▶ Erdos Problem 783, partial results (GPT-5.2), January 2026

Machine Reasoning Theory

Research on reasoning agents to find the next frontier of large language models.

Key Focus: Mathematical frameworks for human-like reasoning and intelligence.

▶ The Geometry of Benchmarks: A New Path Toward AGI, 2025
▶ An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence, 2025
▶ Psychometric Tests for AI Agents and Their Moduli Space, 2025
▶ Self-Improving AI Agents through Self-Play, 2025
▶ Mathematics and Coding are Universal AI Benchmarks, 2025
▶ AI Langlands Program for Moduli of Neural Networks, in preparation (on demand)