langchain - ✅(Solved) Fix Feature: Work Ledger integration — regression testing & diff for LangChain runs [1 pull requests, 1 comments, 2 participants]

langchain, 2026-03-10 21:27:01

GitHub stats

langchain-ai/langchain#35725 (fetched 2026-04-08 00:24:55)

Author: metawake
Participants: metawake, miguelmanlyx
Timeline (top): labeled ×2, commented ×1, cross-referenced ×1

Fix Action

Fixed

Fixed by PR: docs: add Work Ledger callback handler integration (https://github.com/langchain-ai/docs/pull/3044)

PR fix notes

PR #3044: docs: add Work Ledger callback handler integration

Repository: langchain-ai/docs
Author: metawake
State: open | merged: False
Link: https://github.com/langchain-ai/docs/pull/3044

Description (problem / solution / changelog)

Summary

Adds an integration page for Work Ledger under Python > Integrations > Callbacks.

Work Ledger is an open-source tool (MIT) for recording, diffing, and regression-testing LangChain runs. It ships a WorkLedgerCallbackHandler that inherits from BaseCallbackHandler and captures LLM calls, tool invocations, retriever queries, and chain I/O with token metrics and causal links.

Related feature request: https://github.com/langchain-ai/langchain/issues/35725

Changes

Added src/oss/python/integrations/callbacks/work_ledger.mdx

Notes

docs.json navigation update may be needed — happy to add it if you point me to the right section
The handler has been tested with real OpenAI API calls through langchain-openai
276 tests passing, no heavy dependencies

Changed files

src/oss/python/integrations/callbacks/work_ledger.mdx (added, +97/-0)

Checked other resources

This is a feature request, not a bug report or usage question.
I added a clear and descriptive title that summarizes the feature request.
I used the GitHub search to find a similar feature request and didn't find it.
I checked the LangChain documentation and API reference to see if this feature already exists.
This is not related to the langchain-community package.

Package (Required)

langchain-core

Feature Description

Work Ledger is an open-source library (MIT) for recording, replaying, and comparing LLM agent runs. It already ships a WorkLedgerCallbackHandler that inherits from langchain_core.callbacks.BaseCallbackHandler and captures:

LLM/chat model calls with token usage
Tool invocations with inputs/outputs
Retriever queries with documents
Chain-level inputs/outputs
Causal links between steps

from work_ledger import WorkLedger, WorkLedgerCallbackHandler
from langchain_openai import ChatOpenAI

ledger = WorkLedger(store="./runs")
handler = WorkLedgerCallbackHandler(ledger, run_name="my-chain")

chain.invoke({"question": "hi"}, config={"callbacks": [handler]})
run = handler.get_run()  # structured Run with steps, metrics, causal links

After recording runs, you can diff them:

from work_ledger.testing.diff import RunDiff

diff = RunDiff(run_v1, run_v2)
print(f"Similarity: {diff.similarity:.0%}")
print(f"Token delta: {diff.token_diff:+d}")
print(f"Steps added: {diff.steps_added}")
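
To make the three printed fields concrete, here is a minimal sketch of how a step-level diff could be computed. It is not Work Ledger's implementation: the `Step` dataclass, the `simple_diff` function, and the use of `difflib.SequenceMatcher` for the similarity score are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for one recorded step of a run."""
    kind: str      # e.g. "llm", "tool", "retriever"
    name: str
    tokens: int = 0


def simple_diff(old: list[Step], new: list[Step]) -> dict:
    """Compare two runs by their (kind, name) step sequences."""
    old_keys = [(s.kind, s.name) for s in old]
    new_keys = [(s.kind, s.name) for s in new]
    # ratio() is 1.0 for identical sequences and 0.0 when nothing matches
    similarity = SequenceMatcher(None, old_keys, new_keys).ratio()
    token_diff = sum(s.tokens for s in new) - sum(s.tokens for s in old)
    steps_added = [k for k in new_keys if k not in old_keys]
    return {
        "similarity": similarity,
        "token_diff": token_diff,
        "steps_added": steps_added,
    }


# Two made-up runs: the second one adds a calculator tool call
run_v1 = [Step("llm", "plan", 120), Step("tool", "search"), Step("llm", "answer", 200)]
run_v2 = [Step("llm", "plan", 130), Step("tool", "search"),
          Step("tool", "calculator"), Step("llm", "answer", 210)]

diff = simple_diff(run_v1, run_v2)
print(f"Similarity: {diff['similarity']:.0%}")
print(f"Token delta: {diff['token_diff']:+d}")
print(f"Steps added: {diff['steps_added']}")
```

Comparing step identities rather than raw model text keeps the score stable across runs where only the wording of an answer changes; a real tool would likely compare outputs as well.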

Use Case

When developing LangChain applications, prompt changes, model swaps, or tool modifications can silently alter behavior. Currently there's no standard way to:

Record a chain/agent execution as a structured artifact
Compare two runs to see exactly what changed (steps, outputs, tokens, cost)
Set up golden-file regression tests for CI

Work Ledger fills this gap. It's not an observability platform — it's a testing/debugging tool that complements LangSmith.

Typical workflow:

Record a "known good" run
Make changes (prompt, model, tools)
Record the new run
RunDiff shows exactly what changed
CLI: work-ledger diff <run1> <run2>
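
The golden-file idea behind this workflow can be sketched with nothing but the standard library. The example below is an assumption-laden illustration, not Work Ledger's API: the run is represented as a plain dict, and `assert_matches_golden`, `regression_detected`, and the `steps` / `tools_called` keys are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_golden(run: dict, golden_path: Path, update: bool = False) -> None:
    """Compare a run summary against a stored golden file.

    With update=True, or when no golden file exists yet, the current run
    is written out and becomes the new baseline.
    """
    if update or not golden_path.exists():
        golden_path.write_text(json.dumps(run, indent=2, sort_keys=True))
        return
    golden = json.loads(golden_path.read_text())
    # Compare only the stable parts; raw model text is usually too noisy for CI.
    for key in ("steps", "tools_called"):
        if run[key] != golden[key]:
            raise AssertionError(f"{key} changed: {golden[key]!r} -> {run[key]!r}")


def regression_detected(run: dict, golden_path: Path) -> bool:
    """Return True when the run no longer matches the golden file."""
    try:
        assert_matches_golden(run, golden_path)
    except AssertionError:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    golden_file = Path(tmp) / "my-chain.golden.json"
    known_good = {"steps": ["llm", "tool", "llm"], "tools_called": ["search"]}
    changed = {"steps": ["llm", "llm"], "tools_called": []}

    # The first call records the "known good" baseline
    assert_matches_golden(known_good, golden_file)
    unchanged_result = regression_detected(known_good, golden_file)
    changed_result = regression_detected(changed, golden_file)
    print(unchanged_result, changed_result)
```

In a CI setup the golden file would be committed to the repository, and the `update` flag would be flipped deliberately when a behavior change is intended.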

Proposed Solution

I'd like to propose adding Work Ledger to LangChain's community integrations documentation, so users can discover it as a testing tool.

The integration is already working — the WorkLedgerCallbackHandler properly inherits from BaseCallbackHandler, passes isinstance checks, and works with LCEL chains via RunnableConfig.

Tested with real OpenAI API calls through langchain-openai:

Simple LLM calls ✓
LCEL chains (prompt | llm | parser) ✓
Tool-calling LLMs ✓
Cross-run diff ✓

Alternatives Considered

LangSmith: Great for observability/monitoring but focuses on tracing, not structured regression testing with diffs
Manual scripts: Every team builds their own — no standard approach
agent-vcr / agentgraph: Similar tools but single-framework; Work Ledger supports LangChain, LangGraph, PydanticAI, CrewAI, LlamaIndex, OpenAI SDK, Anthropic SDK

Additional Context

Repository: https://github.com/metawake/work-ledger
License: MIT
276 tests passing
Also integrates with LangGraph via wrap_graph()
No heavy dependencies — langchain-core is optional (handler falls back to plain class)
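
The optional-dependency fallback and the step capture described above can be sketched as follows. This is a guess at the general pattern, not Work Ledger's code: `RecordingHandler` and its `steps` dict are hypothetical. The callback method names and the `run_id` / `parent_run_id` keyword arguments follow the `BaseCallbackHandler` interface in `langchain_core.callbacks`.

```python
from uuid import uuid4

try:
    # Use the real base class when langchain-core is installed ...
    from langchain_core.callbacks import BaseCallbackHandler
except ImportError:
    # ... otherwise fall back to a plain class so the module still imports.
    class BaseCallbackHandler:  # type: ignore[no-redef]
        pass


class RecordingHandler(BaseCallbackHandler):
    """Hypothetical handler that stores each callback event as a step."""

    def __init__(self):
        super().__init__()
        self.steps = {}  # run_id -> step dict, in insertion order

    def _start(self, kind, payload, run_id, parent_run_id):
        # parent_run_id is what links a step to the step that caused it
        self.steps[run_id] = {"kind": kind, "input": payload,
                              "output": None, "parent": parent_run_id}

    def on_chain_start(self, serialized, inputs, *, run_id, parent_run_id=None, **kwargs):
        self._start("chain", inputs, run_id, parent_run_id)

    def on_chain_end(self, outputs, *, run_id, **kwargs):
        self.steps[run_id]["output"] = outputs

    def on_tool_start(self, serialized, input_str, *, run_id, parent_run_id=None, **kwargs):
        self._start("tool", input_str, run_id, parent_run_id)

    def on_tool_end(self, output, *, run_id, **kwargs):
        self.steps[run_id]["output"] = output


# Drive the handler by hand, the way a callback manager would during a run
handler = RecordingHandler()
chain_id, tool_id = uuid4(), uuid4()
handler.on_chain_start({}, {"question": "hi"}, run_id=chain_id)
handler.on_tool_start({}, "hi", run_id=tool_id, parent_run_id=chain_id)
handler.on_tool_end("hello", run_id=tool_id)
handler.on_chain_end({"answer": "hello"}, run_id=chain_id)
```

Because the fallback base class is empty, the same handler can be exercised in unit tests without installing langchain-core.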

extent analysis

Solution Plan

Add Work Ledger to LangChain's Community Integrations Documentation

To integrate Work Ledger with LangChain, follow these steps:

1. **Update LangChain Documentation**:
   - Create a new section in the community integrations documentation for Work Ledger.
   - Include a brief description, installation instructions, and example usage.

2. **Add Work Ledger to LangChain's CI/CD Pipeline**:
   - Integrate Work Ledger with LangChain's CI/CD pipeline to ensure seamless testing and validation.

3. **Update LangChain's API Reference**:
   - Add Work Ledger to the list of supported callback handlers in the API reference.

4. **Example Usage**:

   ```python
   from work_ledger import WorkLedger, WorkLedgerCallbackHandler

   # Create a Work Ledger instance
   ledger = WorkLedger(store="./runs")

   # Create a Work Ledger callback handler
   handler = WorkLedgerCallbackHandler(ledger, run_name="my-chain")

   # Use the callback handler in your LangChain chain
   # (`chain` is any runnable you have already built)
   chain.invoke({"question": "hi"}, config={"callbacks": [handler]})
   ```

5. **Verify the Integration**:
   - Test the integration with various LangChain chains and callback handlers.
   - Ensure that the Work Ledger callback handler properly captures and records chain execution data.

### Verification

To verify that the integration is working correctly:

1. **Run a LangChain Chain with Work Ledger**:
   - Create a LangChain chain with the Work Ledger callback handler.
   - Run the chain and verify that the Work Ledger instance is properly capturing and recording chain execution data.

2. **Compare Runs with RunDiff**:
   - Record two runs with different inputs or configurations.
   - Use the `RunDiff` class to compare the two runs and verify that the differences are correctly reported.

### Extra Tips

- Make sure to update the Work Ledger documentation to reflect the new integration with LangChain.
- Consider adding
