llamaIndex - ✅(Solved) Fix [Bug]: tier="agentic" produces inconsistent/unpredictable parse latency [1 pull requests, 1 comments, 2 participants]

llamaIndex · 2026-03-02 10:14:04


GitHub stats

run-llama/llama_index#20845 • Fetched 2026-04-08 00:30:42


Author

ritwik-yadav-nam

Participants

logan-markewich

ritwik-yadav-nam

Timeline (top)

labeled ×2, closed ×1, commented ×1, cross-referenced ×1

Root Cause

The problem specifically affects production pipelines that parse multiple documents in parallel with asyncio: the inconsistent timing of tier="agentic" makes SLA guarantees and timeout tuning impractical, and the resulting parse times are not acceptable to users.

Fix Action

Fixed

Fixed by PR: docs(llama-parse): add timeout guidance for agentic tier in production (https://github.com/run-llama/llama_index/pull/20947)

PR fix notes

PR #20947: docs(llama-parse): add timeout guidance for agentic tier in production

Repository: run-llama/llama_index
Author: s-zx
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20947

Description (problem / solution / changelog)

Summary

When using tier="agentic", parse latency can be variable. Document asyncio.wait_for workaround for production pipelines.

Fix

Add README section showing how to wrap async parse calls with asyncio.wait_for for SLA/timeout tuning.

Fixes #20845

Changed files

llama-index-integrations/readers/llama-index-readers-llama-parse/README.md (modified, +24/-0)
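
The asyncio.wait_for workaround named in the PR summary can be sketched as follows. This is a minimal illustration, not the README text from the PR: `fake_parse` and `parse_with_timeout` are hypothetical names, and `fake_parse` only simulates an async parse call with a given latency.

```python
import asyncio
from typing import Optional


async def fake_parse(path: str, delay_s: float) -> str:
    """Hypothetical stand-in for an async parse call that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return f"parsed:{path}"


async def parse_with_timeout(path: str, delay_s: float, timeout_s: float) -> Optional[str]:
    """Return the parse result, or None if the call exceeds timeout_s seconds."""
    try:
        # wait_for cancels the awaited coroutine once the deadline passes
        return await asyncio.wait_for(fake_parse(path, delay_s), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None
```

Returning None on timeout lets the caller decide whether to retry, fall back to another parse mode, or report the document as failed.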


Bug Description

When using tier="agentic" to parse PDF documents, execution time is both slow and non-deterministic: the same document parsed multiple times returns wildly different durations with no clear pattern. After switching to parse_mode="parse_page_with_llm", parsing became consistently fast and predictable, albeit with a tradeoff.

Version

llama-parse==0.6.92

Steps to Reproduce

1. Initialize LlamaParse with tier="agentic" and parse the same multi-page PDF several times:

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="<your_key>",
    result_type="markdown",
    num_workers=8,
    verbose=True,
    tier="agentic",  # <-- the problematic config
    version="latest",
)

2. Replace tier="agentic" with parse_mode="parse_page_with_llm" and repeat:

parser = LlamaParse(
    api_key="<your_key>",
    result_type="markdown",
    num_workers=8,
    verbose=True,
    parse_mode="parse_page_with_llm",  # <-- replacement
    version="latest",
)
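
To compare the two configurations, it may help to time repeated parses of the same file and look at the spread. The helper below is a rough sketch, not part of the original report: `time_runs` and `summarize` are hypothetical names, and the parse call is passed in as a plain callable so the timing logic stays independent of the parser.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(parse_once: Callable[[], object], runs: int = 5) -> List[float]:
    """Call parse_once `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        parse_once()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Summarize durations; a large stdev relative to the mean signals unstable latency."""
    return {
        "min": min(durations),
        "max": max(durations),
        "mean": statistics.mean(durations),
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }
```

Assuming a local file named doc.pdf (a placeholder), a call such as `summarize(time_runs(lambda: parser.load_data("doc.pdf")))` could be run once per configuration and the two summaries compared.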

Relevant Logs/Tracebacks

extent analysis

Fix Plan

1. Upgrade to a newer version of llama-parse

The issue was reported against version 0.6.92. Upgrading to the latest release may pick up bug fixes and performance improvements.

2. Switch to parse_mode="parse_page_with_llm"

Replace tier="agentic" with parse_mode="parse_page_with_llm" in your LlamaParse initialization:

from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="<your_key>",
    result_type="markdown",
    num_workers=8,
    verbose=True,
    parse_mode="parse_page_with_llm",  # replaces tier="agentic"
    version="latest",
)

3. Use asyncio with timeout

To bound per-document latency for SLA and timeout tuning, wrap each async parse call in asyncio.wait_for:

import asyncio

from llama_parse import LlamaParse

async def parse_document(parser, document, timeout=30):
    try:
        # Stop waiting if the async parse takes longer than `timeout` seconds
        return await asyncio.wait_for(parser.aload_data(document), timeout=timeout)
    except asyncio.TimeoutError:
        print(f"Timeout exceeded for {document}")
        return None

async def main():
    parser = LlamaParse(
        api_key="<your_key>",
        result_type="markdown",
        num_workers=8,
        verbose=True,
        parse_mode="parse_page_with_llm",
        version="latest",
    )
    documents = [...]  # List of document file paths to parse
    tasks = [parse_document(parser, document) for document in documents]
    # return_exceptions=True keeps one failed parse from aborting the others
    results = await asyncio.gather(*tasks, return_exceptions=True)
    for document, result in zip(documents, results):
        if isinstance(result, Exception):
            print(f"Error parsing {document}: {result}")

asyncio.run(main())

Verification

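Parse the same document several times and record each duration; the times should be consistently fast, with little run-to-run variation.
Confirm that calls exceeding the configured timeout are cancelled and reported, and that the remaining parse times fit within your SLA.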
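
One way to make the SLA check concrete is to compute, from recorded durations, the share of parses that finished within the target. This is only a sketch under assumptions: `sla_report` is a hypothetical helper, and the durations and the 5-second target in the usage note are made-up illustration values.

```python
from typing import Dict, List


def sla_report(durations_s: List[float], sla_s: float) -> Dict[str, float]:
    """Report the fraction of parses within the SLA and the slowest observed parse."""
    if not durations_s:
        raise ValueError("durations_s must not be empty")
    within = sum(1 for d in durations_s if d <= sla_s)
    return {
        "within_sla": within / len(durations_s),
        "worst_s": max(durations_s),
    }
```

For example, `sla_report([1.0, 2.0, 3.0, 10.0], sla_s=5.0)` reports that three of the four parses met the target and that the slowest took 10 seconds.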
