llamaIndex - 💡(How to fix) Fix [Bug]: LlamaCloud `POST /api/v2/extract` — 422 with documented field names, 500 with actual field names [1 comments, 2 participants]

llamaIndex · 2026-03-20 13:03:32

GitHub stats

run-llama/llama_index#21093 • Fetched 2026-04-08 01:03:53

Author

romaincointepas

Participants

logan-markewich

romaincointepas

Timeline (top)

labeled ×2, closed ×1, commented ×1

Bug Description

The v2 extract endpoint (POST /api/v2/extract) has two issues:

1. The API reference documents type/value as the request body fields, but the API rejects them with 422, asking for document_input_value instead.
2. Using the field names the validation error suggests (document_input_type/document_input_value), validation passes but the server returns 500.

The v1 flow (file upload + create agent + create job via /api/v1/extraction/jobs) works fine with the same documents and schemas.

Version

LlamaCloud REST API (not using the Python SDK). Endpoint: POST https://api.cloud.llamaindex.ai/api/v2/extract

Steps to Reproduce

Send a POST to /api/v2/extract using the documented field names (type/value):

curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "url",
    "value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

→ Returns 422: "Field required" for document_input_value

Retry with the field names the 422 suggests (document_input_type/document_input_value):

curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_input_type": "url",
    "document_input_value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

→ Returns 500

Tested with minimal schemas, with and without extraction_target, tier, system_prompt — same result every time.
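The two curl calls above can also be scripted so both variants are sent back to back and their status codes compared. The following is a minimal sketch using only the Python standard library; the helper names (`build_body`, `post_json`, `reproduce`) are hypothetical and introduced here for illustration, while the endpoint, headers, and request bodies are copied from the report.

```python
import json
import urllib.error
import urllib.request

# Endpoint taken from the report.
EXTRACT_URL = "https://api.cloud.llamaindex.ai/api/v2/extract"

# Minimal schema used in the report's curl commands.
DATA_SCHEMA = {
    "type": "object",
    "properties": {"invoice_number": {"type": "string"}},
}


def build_body(type_key, value_key, document_url):
    """Build the request body, varying only the two input field names."""
    return {
        type_key: "url",
        value_key: document_url,
        "config": {"extract_options": {"data_schema": DATA_SCHEMA}},
    }


def post_json(url, body, api_key):
    """POST a JSON body and return (status, text), including for 4xx/5xx."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the body.
        return error.code, error.read().decode("utf-8")


def reproduce(api_key, document_url="https://example.com/document.pdf"):
    """Send both variants and return {label: (status, text)}."""
    variants = {
        "documented": build_body("type", "value", document_url),
        "validator": build_body(
            "document_input_type", "document_input_value", document_url
        ),
    }
    return {
        label: post_json(EXTRACT_URL, body, api_key)
        for label, body in variants.items()
    }
```

Calling `reproduce(os.environ["LLAMA_CLOUD_API_KEY"])` (after `import os`) would perform the two requests. Per the report, the expectation is a 422 for the "documented" variant and a 500 for the "validator" variant, though the behavior may have changed since the issue was closed.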

Relevant Logs/Tracebacks

# Step 1: documented field names → 422
{"detail":[{"type":"missing","loc":["body","document_input_value"],"msg":"Field required"}]}

# Step 2: actual field names → 500
{"detail":"Oops! Something went wrong on our end. Please try again in a few minutes. If the problem persists, please contact support by clicking the chat icon on cloud.llamaindex.ai providing this correlation ID: 25ebce7c-4616-4295-b692-541659f592bd"}

# Additional correlation IDs:
# 4e32618c-bb6f-4310-916e-67ecdcfb4585
# c4b548bb-bf64-44ac-85ae-9988da9c9bdb
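The two response bodies above have different shapes: the 422 carries a list of validation entries, while the 500 carries a single message string containing a correlation ID. A small sketch of how a client might pull the useful parts out of either one follows; `summarize_error` is a hypothetical helper, and the parsing assumes only the shapes shown in the logs above.

```python
import json
import re

# UUID-shaped token, matching the correlation IDs quoted in the logs.
CORRELATION_ID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def summarize_error(status, body_text):
    """Extract missing field paths (422) or correlation IDs (500)."""
    summary = {"status": status, "missing_fields": [], "correlation_ids": []}
    try:
        detail = json.loads(body_text).get("detail")
    except (ValueError, AttributeError):
        # Body was not a JSON object; fall back to scanning the raw text.
        detail = body_text

    if isinstance(detail, list):
        # Validation errors: one entry per problem, with a "loc" path.
        for item in detail:
            if item.get("type") == "missing":
                path = ".".join(str(part) for part in item.get("loc", []))
                summary["missing_fields"].append(path)
    else:
        # Server error: a message string that embeds the correlation ID.
        summary["correlation_ids"] = CORRELATION_ID.findall(str(detail))
    return summary
```

Logging the returned dictionary alongside each failed request keeps the correlation IDs at hand when reporting the 500 to support.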

extent analysis

Fix Plan

To get past the 422 validation error, update the request body to use the field names the API actually validates (document_input_type and document_input_value) instead of the documented type/value, and ensure the config object is properly formatted.

Update the curl command to:

curl -X POST "https://api.cloud.llamaindex.ai/api/v2/extract" \
  -H "Authorization: Bearer $LLAMA_CLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_input_type": "url",
    "document_input_value": "https://example.com/document.pdf",
    "config": {
      "extract_options": {
        "data_schema": {
          "type": "object",
          "properties": {
            "invoice_number": {"type": "string"}
          }
        }
      }
    }
  }'

However, the report shows that this request still returns a 500, which points to a failure in the server-side processing of the request rather than a malformed body.

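Verify that the data_schema object is valid JSON Schema and that extract_options is nested under config as shown.
If the issue persists, try removing the config object or simplifying the data_schema to see whether the 500 changes. Note that the reporter already saw the same result with minimal schemas, with and without extraction_target, tier, and system_prompt.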
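One way to carry out the simplification step systematically is to generate progressively smaller request bodies and send them in order, noting which one first changes the response. The generator below is only a sketch: `payload_variants` and its labels are hypothetical names, and whether the endpoint accepts a body without config is an assumption the report does not confirm.

```python
import copy


def payload_variants(document_url, data_schema):
    """Yield (label, body) pairs from the full request down to the barest one."""
    base = {
        "document_input_type": "url",
        "document_input_value": document_url,
    }

    # 1. The full request, as in the fix plan's curl command.
    full_schema = copy.deepcopy(data_schema)
    yield "full_schema", dict(
        base, config={"extract_options": {"data_schema": full_schema}}
    )

    # 2. Same request with the schema cut down to its first property.
    properties = data_schema.get("properties", {})
    if len(properties) > 1:
        first = next(iter(properties))
        reduced = {
            "type": "object",
            "properties": {first: copy.deepcopy(properties[first])},
        }
        yield "single_property", dict(
            base, config={"extract_options": {"data_schema": reduced}}
        )

    # 3. No config at all (assumes the API tolerates its absence).
    yield "no_config", dict(base)
```

If every variant produces the same 500, as the reporter's own testing suggests, the request body is unlikely to be the cause.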

Verification

To verify that the fix worked, send the updated curl command and check the response. A successful call should return a 2xx status code instead of the 422 or 500 shown above; the exact shape of a successful response body should be checked against the current API reference.

Extra Tips

Make sure to check the API documentation for any updates or changes to the request body format.
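If the issue persists, contact LlamaCloud support (the 500 message points to the chat icon on cloud.llamaindex.ai) and provide the correlation IDs for reference. In the meantime, the report notes that the v1 flow via /api/v1/extraction/jobs works with the same documents and schemas.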
