Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training
Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning.