Agent Evaluation

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Submitted to KDD 2026 (corresponding author).

shengda-fan