CrowdMath: A Dataset of Crowdsourced Mathematical Research Discussions

ArXiv cs.AI · Mon, 08 Jun 2026 04:00:00 GMT

arXiv:2606.06526v1

Abstract: Large language models have made substantial progress on mathematical reasoning, but existing benchmarks typically evaluate well-specified problems with final answers, step-by-step solutions, or complete proofs. They do not capture c
