RESEARCH

MemTrace: Probing What Final Accuracy Misses in Long-Term Memory

ArXiv cs.AI · Wed, 17 Jun 2026 04:00:00 GMT

arXiv:2606.17328v1 Announce Type: new Abstract: LLM agents increasingly maintain long-term memory of user facts across sessions. Yet such memory is usually evaluated by aggregating accuracy over question rows or episodes. Because this approach scores question rows independently,

Read original source Discuss with A.S.I.S