transformers - ✅(Solved) Fix decode_spans in QA pipeline crashes with ValueError: kth out of bounds when len(scores_flat) == top_k [1 pull requests, 2 comments, 3 participants]

GitHub stats
huggingface/transformers#44327 (fetched 2026-04-08 00:29:08)
Comments: 2 · Participants: 3 · Timeline events: 9 · Reactions: 0
Timeline (top): referenced ×3, commented ×2, closed ×1, cross-referenced ×1

Error Message

import numpy as np

def decode_spans_bug(seq_len=10, topk=100, max_answer_len=15):
    """Reproduces the bug when seq_len² == topk"""
    start = np.random.rand(1, seq_len)
    end = np.random.rand(1, seq_len)
    outer = np.matmul(np.expand_dims(start, -1), np.expand_dims(end, 1))
    candidates = np.tril(np.triu(outer), max_answer_len - 1)
    scores_flat = candidates.flatten()

    print(f"len(scores_flat)={len(scores_flat)}, topk={topk}")
    print(f"len < topk: {len(scores_flat) < topk}")

    # This is what decode_spans does:
    idx = np.argpartition(-scores_flat, topk)[0:topk]  # CRASH

decode_spans_bug()  # ValueError: kth(=100) out of bounds (100)

Fix Action

Fixed

PR fix notes

PR #44584: Fix off-by-one in decode_spans boundary check

Description (problem / solution / changelog)

What does this PR do?

Fixes an off-by-one error in decode_spans() in the document question answering pipeline that causes a ValueError: kth(=N) out of bounds crash when len(scores_flat) == topk.

The boundary check on line 97 uses < but should use <=. When the flattened candidate array has exactly topk elements, np.argpartition(-scores_flat, topk) is called with kth=topk, which is out of bounds (0-indexed). Changing to <= ensures we fall through to argsort in this edge case.

Fixes #44327

Who can review?

@Rocketknight1 (pipelines)

This contribution was developed with AI assistance (Claude Code).

Changed files

  • src/transformers/pipelines/document_question_answering.py (modified, +1/-1)


System Info

  • transformers version: 4.39.0 (also verified present in 4.53.3 and main branch)
  • Python version: 3.10
  • NumPy version: 1.x
  • OS: Linux (AWS SageMaker)

Who can help?

@Narsil


Reproduction

The decode_spans function in src/transformers/pipelines/question_answering.py has an off-by-one boundary check on line 81:

scores_flat = candidates.flatten()
if topk == 1:
    idx_sort = [np.argmax(scores_flat)]
elif len(scores_flat) < topk:          # <-- should be <=
    idx_sort = np.argsort(-scores_flat)
else:
    idx = np.argpartition(-scores_flat, topk)[0:topk]  # crashes when len == topk
    idx_sort = idx[np.argsort(-scores_flat[idx])]

np.argpartition(array, kth) requires 0 <= kth < len(array). When len(scores_flat) == topk, the guard len(scores_flat) < topk evaluates to False, and argpartition is called with kth == len(array), which is out of bounds.
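The bound described above can be checked in isolation. The following is a minimal sketch on a four-element array; the sample values and the helper name kth_is_accepted are assumptions made for illustration and are not part of transformers.

```python
import numpy as np

def kth_is_accepted(values, kth):
    """Return True if np.argpartition accepts this kth (hypothetical helper)."""
    try:
        np.argpartition(values, kth)
    except ValueError:
        # NumPy rejects a kth that is not a valid index into the array.
        return False
    return True

scores = np.array([0.3, 0.9, 0.1, 0.5])
print(kth_is_accepted(scores, len(scores) - 1))  # largest valid index
print(kth_is_accepted(scores, len(scores)))      # one past the end
```

The second call corresponds to what decode_spans does when len(scores_flat) == topk.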

Minimal reproduction:

import numpy as np

def decode_spans_bug(seq_len=10, topk=100, max_answer_len=15):
    """Reproduces the bug when seq_len² == topk"""
    start = np.random.rand(1, seq_len)
    end = np.random.rand(1, seq_len)
    outer = np.matmul(np.expand_dims(start, -1), np.expand_dims(end, 1))
    candidates = np.tril(np.triu(outer), max_answer_len - 1)
    scores_flat = candidates.flatten()
    
    print(f"len(scores_flat)={len(scores_flat)}, topk={topk}")
    print(f"len < topk: {len(scores_flat) < topk}")
    
    # This is what decode_spans does:
    idx = np.argpartition(-scores_flat, topk)[0:topk]  # CRASH

decode_spans_bug()  # ValueError: kth(=100) out of bounds (100)

Since scores_flat has seq_len² elements (from the outer product matrix), this triggers when topk is a perfect square and seq_len = sqrt(topk). For example:

  • top_k=100 crashes when seq_len=10
  • top_k=400 crashes when seq_len=20

This occurs in practice when using top_k > 1 with very short input contexts (e.g., a short question with minimal context text).
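The trigger condition can be checked ahead of time. The sketch below assumes, as in the reproduction above, that the flattened candidate array always has seq_len² elements; crashing_seq_len is a hypothetical helper name, not a library function.

```python
import math

def crashing_seq_len(top_k):
    """Return the seq_len with seq_len**2 == top_k, or None if there is none.

    Hypothetical helper. top_k == 1 is excluded because the quoted code
    handles it in a separate argmax branch.
    """
    if top_k <= 1:
        return None
    root = math.isqrt(top_k)
    return root if root * root == top_k else None

for top_k in (100, 400, 50):
    print(top_k, crashing_seq_len(top_k))
```

A result of None means no sequence length produces exactly top_k candidates, so the unfixed check cannot hit the boundary for that value.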

Expected behavior

The function should handle the boundary case where len(scores_flat) == topk without crashing. The fix is changing < to <= on line 81:

elif len(scores_flat) <= topk:    # fixed: catches the boundary case
    idx_sort = np.argsort(-scores_flat)

When len(scores_flat) == topk, we want ALL elements anyway, so argsort is both correct and sufficient — no need for argpartition.
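As a rough sketch, the corrected selection logic can be pulled into a standalone function so the boundary case can be exercised directly. The name top_indices and the random test data are assumptions for illustration; the branch structure mirrors the snippet quoted in this issue, not necessarily the complete library function.

```python
import numpy as np

def top_indices(scores_flat, topk):
    """Indices of the topk highest scores, best first (hypothetical helper)."""
    if topk == 1:
        return np.array([np.argmax(scores_flat)])
    if len(scores_flat) <= topk:
        # Every element is wanted, so a full sort is enough.
        return np.argsort(-scores_flat)
    # More candidates than requested: partition first, then sort the top slice.
    idx = np.argpartition(-scores_flat, topk)[0:topk]
    return idx[np.argsort(-scores_flat[idx])]

rng = np.random.default_rng(0)
scores = rng.random(100)
boundary = top_indices(scores, 100)           # len(scores_flat) == topk
larger = top_indices(rng.random(500), 100)    # len(scores_flat) > topk
print(len(boundary), len(larger))
```

In the boundary case the result is a permutation of all indices ordered by descending score, which is what the caller needs.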

extent analysis

Problem Summary

decode_spans crashes with ValueError: kth out of bounds when len(scores_flat) == topk, because its boundary check is off by one.

Root Cause Analysis

The decode_spans function has an off-by-one boundary check on line 81, which causes a crash when len(scores_flat) == topk.

Fix Plan

Step 1: Update the boundary check in decode_spans

Change the line elif len(scores_flat) < topk: to elif len(scores_flat) <= topk:.

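Step 2: Leave the argpartition branch unchanged

No separate replacement of idx = np.argpartition(-scores_flat, topk)[0:topk] is needed. Once the check uses <=, the case len(scores_flat) == topk is handled by the argsort branch, and argpartition only runs when len(scores_flat) > topk, where kth=topk is in bounds.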

Code Snippet

import numpy as np

def decode_spans(seq_len=10, topk=100, max_answer_len=15):
    """Simplified index selection with the fixed check; safe when seq_len² == topk"""
    start = np.random.rand(1, seq_len)
    end = np.random.rand(1, seq_len)
    outer = np.matmul(np.expand_dims(start, -1), np.expand_dims(end, 1))
    candidates = np.tril(np.triu(outer), max_answer_len - 1)
    scores_flat = candidates.flatten()

    print(f"len(scores_flat)={len(scores_flat)}, topk={topk}")
    print(f"len <= topk: {len(scores_flat) <= topk}")

    if len(scores_flat) <= topk:
        # All candidates are wanted: sort them by descending score.
        idx_sort = np.argsort(-scores_flat)
    else:
        # Partition out the topk best, then order that slice by descending score.
        idx = np.argpartition(-scores_flat, topk)[0:topk]
        idx_sort = idx[np.argsort(-scores_flat[idx])]

    return idx_sort

Verification

Run the decode_spans function with topk as a perfect square and seq_len = sqrt(topk) to ensure it handles the boundary case without crashing.
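One way to run that verification is a small regression loop that compares the original check with the fixed one over several perfect squares. This is a sketch under stated assumptions: build_scores, select and crashes are hypothetical helpers that copy the array construction from the reproduction, and they do not call the real pipeline.

```python
import numpy as np

def build_scores(seq_len, max_answer_len=15, seed=0):
    """Build the flattened candidate scores the same way the reproduction does."""
    rng = np.random.default_rng(seed)
    start = rng.random((1, seq_len))
    end = rng.random((1, seq_len))
    outer = np.matmul(np.expand_dims(start, -1), np.expand_dims(end, 1))
    return np.tril(np.triu(outer), max_answer_len - 1).flatten()

def select(scores_flat, topk, inclusive):
    """inclusive=True uses the fixed `<=` check, False the original `<` check."""
    if inclusive:
        too_few = len(scores_flat) <= topk
    else:
        too_few = len(scores_flat) < topk
    if too_few:
        return np.argsort(-scores_flat)
    idx = np.argpartition(-scores_flat, topk)[0:topk]
    return idx[np.argsort(-scores_flat[idx])]

def crashes(seq_len, inclusive):
    """Return True if selection raises ValueError when topk == seq_len**2."""
    try:
        select(build_scores(seq_len), seq_len ** 2, inclusive)
    except ValueError:
        return True
    return False

# Map seq_len -> (original check crashes, fixed check crashes).
results = {n: (crashes(n, inclusive=False), crashes(n, inclusive=True))
           for n in (2, 10, 20)}
print(results)
```

Each entry should show a crash for the original check and none for the fixed check.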


