llamaIndex - ✅(Solved) Fix [Bug]: documentation override for classes inheriting from MetadataAwareTextSplitter [1 pull requests, 6 comments, 3 participants]

llamaIndex, 2026-02-15 15:00:04

GitHub stats

run-llama/llama_index#20706•Fetched 2026-04-08 00:31:20

Timeline (top): commented ×6, referenced ×3, cross-referenced ×2, labeled ×2

Fix Action

Fix / Workaround

I noticed that the documentation for the __init__ method is being overridden for classes inheriting from MetadataAwareTextSplitter. I think this can be attributed to DispatcherSpanMixin (not confirmed, but I wanted to put my working theory out there).

I did face this issue and patched it in #20622 by manually setting the docs outside the class definition.
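As an illustration only, here is a minimal sketch of one generic way an `__init__` docstring can be replaced at class-creation time, and of the "set the docs outside the class definition" workaround. `WrappingMixin` and `Splitter` are hypothetical stand-ins; this is not LlamaIndex's `DispatcherSpanMixin`, and the issue does not confirm that this is the actual mechanism.

```python
class WrappingMixin:
    """Hypothetical stand-in mixin (NOT LlamaIndex's DispatcherSpanMixin)."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        original = cls.__dict__.get("__init__")
        if original is None:
            return

        # The wrapper is defined without functools.wraps(original), so it
        # carries its own docstring instead of the subclass's one.
        def wrapper(self, *args, **kw):
            """Generic wrapper docstring."""
            return original(self, *args, **kw)

        cls.__init__ = wrapper


class Splitter(WrappingMixin):
    def __init__(self, chunk_size=512):
        """Create a splitter with the given chunk size."""
        self.chunk_size = chunk_size


# What help(Splitter.__init__) would show at this point.
lost_doc = Splitter.__init__.__doc__

# Workaround in the spirit of the patch described above:
# assign the docstring after the class body has been evaluated.
Splitter.__init__.__doc__ = "Create a splitter with the given chunk size."
restored_doc = Splitter.__init__.__doc__
```

In this toy setup the docstring is lost because the wrapper does not copy metadata from the wrapped function; whether something similar happens in llama-index would need to be checked against its source.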

PR fix notes

PR #20622: feat: add chonkie integration

Repository: run-llama/llama_index
Author: chonk-lain
State: closed | merged: True
Link: https://github.com/run-llama/llama_index/pull/20622

Description (problem / solution / changelog)

Description

This adds a chonkie integration to the list of llamaindex integrations. How to use:

from llama_index.ingestion.chonkie import Chunker
chunker = Chunker(chunker_type="semantic")
out_list = chunker.split_text(text)

Or you can use it in an ingestion pipeline (following the example from the llama-index docs):

from llama_index.core import Document
from llama_index.core.ingestion import IngestionPipeline
from llama_index.ingestion.chonkie import Chunker

pipeline = IngestionPipeline(
    transformations=[
        Chunker(chunker_type="recursive", chunk_size=512),
    ]
)
nodes = pipeline.run(documents=[Document.example()])

with reference to llama-index docs

from llama_index.core import VectorStoreIndex, Document, Settings
from llama_index.core.vector_stores import MetadataFilters, ExactMatchFilter
from llama_index.ingestion.chonkie import Chunker

documents = [
    Document(text="text", metadata={"author": "LlamaIndex"}),
    Document(text="text", metadata={"author": "John Doe"}),
]
# use Chunker for chunking
chunker = Chunker(chunker_type="recursive", chunk_size=512)

index = VectorStoreIndex.from_documents(documents, transformations=[chunker])

Fixes # (issue) None

follow ups

I went back and forth between putting this under ingestion or node-parser, and I settled on ingestion because I might add other pipeline-related features from our library in the future. Let me know if I should switch this to node parser or keep it here.
I have tried as much as possible to follow the same schema as the other chunkers in the current library, so in theory it should integrate seamlessly with the core functionality of llamaindex.
I would appreciate any input on where the docs should go.

Changed files

docs/src/content/docs/framework/module_guides/loading/node_parsers/modules.md (modified, +23/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/.gitignore (added, +207/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/README.md (added, +73/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/llama_index/node_parser/chonkie/__init__.py (added, +7/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/llama_index/node_parser/chonkie/chunkers.py (added, +160/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/pyproject.toml (added, +72/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/tests/__init__.py (added, +1/-0)
llama-index-integrations/node_parser/llama-index-node-parser-chonkie/tests/test_chunkers.py (added, +406/-0)

Code Example

Check the comment and Colab notebook attached above.

Bug Description

following up on https://github.com/run-llama/llama_index/pull/20622#discussion_r2764697454

I did face this issue and patched it in #20622 by manually setting the docs outside the class definition.

TLDR: help(class.__init__) has been overridden.
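To see which docstring `help(class.__init__)` ends up showing, it can help to check which class in the MRO actually supplies `__init__`. The sketch below uses only the standard library; `describe_init`, `Base`, `Child`, and `Documented` are hypothetical names for illustration, not part of llama-index.

```python
import inspect


def describe_init(cls):
    """Return (owner_name, docstring) for the __init__ that `cls` resolves to."""
    for klass in cls.__mro__:
        if "__init__" in vars(klass):
            return klass.__name__, inspect.getdoc(vars(klass)["__init__"])
    return None, None


class Base:
    def __init__(self):
        """Base initializer docs."""


class Child(Base):
    # No __init__ of its own: documentation comes from Base.
    pass


class Documented(Base):
    def __init__(self):
        """Documented initializer docs."""
        super().__init__()


child_info = describe_init(Child)
documented_info = describe_init(Documented)
```

Running the same helper on a splitter subclass should show whether the docstring you wrote is the one attached to the resolved `__init__`, or whether something else has taken its place.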

Version

editable

Steps to Reproduce

For the sake of debugging, I created a separate branch and a minimal Colab notebook to help with this: https://colab.research.google.com/drive/1fIKG6hK09ykuqUY5bhGWK2d90TgsVb3S?usp=sharing

Relevant Logs/Tracebacks

Check the comment and Colab notebook attached above.

extent analysis

Fix Plan

Override Documentation for Inherited Classes

To work around the issue where the documentation for the __init__ method is overridden for classes inheriting from MetadataAwareTextSplitter, set the docstring manually outside the class definition, as was done in #20622.

Steps:

1. **Check that the class inherits from `MetadataAwareTextSplitter`**:
   ```python
if issubclass(class_to_check, MetadataAwareTextSplitter):
    ...  # proceed with the fix
   ```

2. **Manually set the documentation for the `__init__` method**:
   ```python
class_to_check.__init__.__doc__ = "Custom documentation for the __init__ method"
   ```

3. **Apply the fix right after the class definition**:
   ```python
class CustomClass(MetadataAwareTextSplitter):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # custom initialization code

CustomClass.__init__.__doc__ = "Custom documentation for the __init__ method"
   ```

Example Use Case:

```python
# Import path assumed from llama-index-core; adjust if your version differs.
from llama_index.core.node_parser.interface import MetadataAwareTextSplitter


class CustomClass(MetadataAwareTextSplitter):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # custom initialization code


# Set the docstring after the class body so it is not replaced during class creation.
CustomClass.__init__.__doc__ = "Custom documentation for the __init__ method"

print(CustomClass.__init__.__doc__)  # Output: Custom documentation for the __init__ method
```
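If several classes need the same treatment, the assignment above could be wrapped in a small class decorator. This is a sketch under assumptions: `with_init_doc` is a hypothetical helper, `BaseSplitter` is a plain stand-in for `MetadataAwareTextSplitter` so the snippet runs without llama-index, and it assumes the class's `__init__` is an ordinary function with a writable `__doc__`.

```python
def with_init_doc(doc):
    """Hypothetical class decorator: set __init__.__doc__ after class creation."""

    def decorate(cls):
        # Only touch an __init__ defined on this class, so a parent's
        # docstring is never modified by accident.
        init = vars(cls).get("__init__")
        if init is None:
            raise TypeError(f"{cls.__name__} does not define its own __init__")
        init.__doc__ = doc
        return cls

    return decorate


class BaseSplitter:
    """Stand-in for MetadataAwareTextSplitter (assumption for this sketch)."""

    def __init__(self, *args, **kwargs):
        """Base docs."""


@with_init_doc("Custom documentation for the __init__ method")
class CustomClass(BaseSplitter):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # custom initialization code


custom_doc = CustomClass.__init__.__doc__
base_doc = BaseSplitter.__init__.__doc__
```

The decorator runs after the class body and after any `__init_subclass__` hooks, which is the same point in time as the manual assignment in the example above.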
