LlamaIndex - (How to fix) [Bug]: LlamaCloud `POST /api/v2/extract`: 422 with documented field names, 500 with actual field names

GitHub stats
run-llama/llama_index#21093 (fetched 2026-04-08 01:03:53)
Comments: 1 · Participants: 2 · Timeline events: 4 · Reactions: 0
Timeline (top): labeled ×2, closed ×1, commented ×1


Bug Description

The v2 extract endpoint (POST /api/v2/extract) has two issues:

  1. The API reference documents type/value as the request body fields, but the API rejects them with 422, asking for document_input_value instead.

  2. Using the field names the validation error suggests (document_input_type/document_input_value), validation passes but the server returns 500.

The v1 flow (file upload + create agent + create job via /api/v1/extraction/jobs) works fine with the same documents and schemas.

Version

LlamaCloud REST API (not using the Python SDK). Endpoint: POST https://api.cloud.llamaindex.ai/api/v2/extract

Steps to Reproduce

  1. Send a POST to /api/v2/extract using the documented field names (type/value):
curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "url",
    "value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

→ Returns 422: "Field required" for document_input_value

  2. Retry with the field names the 422 suggests (document_input_type/document_input_value):
curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_input_type": "url",
    "document_input_value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

→ Returns 500

Tested with minimal schemas, with and without extraction_target, tier, system_prompt — same result every time.
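
The two requests above can be scripted so that only the HTTP status is printed for each field naming. This is a minimal sketch, not official tooling: the helper names build_body and post_status are hypothetical, and it assumes curl is installed and LLAMA_CLOUD_API_KEY is exported in the environment.

```shell
# Hypothetical helpers for reproducing the report; not part of any LlamaCloud tooling.

# Build the request body. $1 and $2 are the names of the input-type and
# input-value fields, so the same schema can be sent under both namings.
build_body() {
  cat <<EOF
{"$1": "url", "$2": "https://example.com/document.pdf", "config": {"extract_options": {"data_schema": {"type": "object", "properties": {"invoice_number": {"type": "string"}}}}}}
EOF
}

# POST the body to the v2 extract endpoint and print only the HTTP status code.
post_status() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
    -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_body "$1" "$2")"
}

# Usage (makes real API calls, so it is left commented out):
# post_status type value
# post_status document_input_type document_input_value
```

At the time of the report, the first call corresponded to the 422 case and the second to the 500 case; the behavior may have changed since.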

Relevant Logs/Tracebacks

# Step 1: documented field names → 422
{"detail":[{"type":"missing","loc":["body","document_input_value"],"msg":"Field required"}]}

# Step 2: actual field names → 500
{"detail":"Oops! Something went wrong on our end. Please try again in a few minutes. If the problem persists, please contact support by clicking the chat icon on cloud.llamaindex.ai providing this correlation ID: 25ebce7c-4616-4295-b692-541659f592bd"}

# Additional correlation IDs:
# 4e32618c-bb6f-4310-916e-67ecdcfb4585
# c4b548bb-bf64-44ac-85ae-9988da9c9bdb
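
When passing a failure on to support, the correlation ID can be pulled out of the 500 response body with a pattern match. The following is a sketch only: extract_correlation_id is a hypothetical helper name, and it assumes the ID is always a lowercase hexadecimal UUID, as in the logs above.

```shell
# Hypothetical helper: print the first UUID-shaped token found in a response body.
# Assumes the correlation ID is a lowercase hexadecimal UUID, as in the logs above.
extract_correlation_id() {
  printf '%s\n' "$1" \
    | grep -oE '[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
    | head -n 1
}
```

Pass the raw response body as the only argument. The function prints nothing if no UUID-shaped token is present.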

extent analysis

Fix Plan

To get past the 422, update the request body to use the field names the validator expects (document_input_type and document_input_value) instead of the documented type and value, and keep the config object well-formed.

  • Update the curl command to:
curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_input_type": "url",
    "document_input_value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

However, the report shows that this request still returns a 500, which points to a server-side failure rather than a problem with the request body.

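  • Verify that data_schema is well-formed and nested under config.extract_options, as in the command above.
  • Simplifying the schema is unlikely to help on its own: the reporter saw the same 500 with minimal schemas, with and without extraction_target, tier, and system_prompt. According to the report, the v1 flow (file upload + create agent + create job via /api/v1/extraction/jobs) works with the same documents and schemas, so it can serve as a fallback in the meantime.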

Verification

To verify, resend the updated curl command and check the HTTP status of the response. A successful request should return a 2xx status code rather than the 422 or 500 described in the issue.
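
The status check can be reduced to a small lookup. This is a sketch under the assumption that any 2xx code counts as success; classify_status is a hypothetical helper, and the messages are illustrative rather than anything the API returns.

```shell
# Hypothetical helper: map an HTTP status code to the outcomes discussed in this issue.
classify_status() {
  case "$1" in
    2??) echo "success" ;;
    422) echo "validation error: check request field names" ;;
    5??) echo "server error: retry later or contact support with the correlation ID" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}
```

Feed it the code that curl prints when run with -s -o /dev/null -w '%{http_code}'.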

Extra Tips

  • Make sure to check the API documentation for any updates or changes to the request body format.
  • If the issue persists, try contacting the API support team for further assistance, providing the correlation IDs for reference.
