Trendora

MAST

Hold

Tools

A diagnostic framework for analyzing where enterprise agents fail in workflow execution.

Why it's here

Placed in Hold: 1 article of evidence from 1 source, led by research-stage coverage, with none in the last 30 days. Confidence 24%. With so little accumulated evidence, the placement defaults conservatively pending more signal.

Evidence (1)

  • 6 · Hugging Face Blog · 2/18/2026 · research
    IBM and UC Berkeley Study Why Enterprise Agents Fail

    IBM and UC Berkeley present IT-Bench and MAST to evaluate enterprise AI agents and diagnose where they break down in realistic business workflows. The work focuses on identifying failure modes in agent performance rather than introducing a new consumer product.