gemini-cli - ✅(Solved) Fix Improve auto-memory skill extraction with session scratchpads [1 pull requests, 1 participants]

SandyTao520 · 2026-04-23T23:03:55Z

# PR #25873: feat(memory): persist auto-memory scratchpad for skill extraction

- Repository: google-gemini/gemini-cli
- Author: SandyTao520
- State: open | merged: False
- Link: https://github.com/google-gemini/gemini-cli/pull/25873

## Description (problem / solution / changelog)

## Summary

Persist an auto-memory `memoryScratchpad` into session metadata so skill extraction can use compact workflow hints without relying only on one-line session summaries.

In the 5-trial scratchpad stats eval, the scratchpad path:

- reduced average extractor turns from `13.2` to `11.0` (`-16.7%`),
- reduced distractor reads from `3.8` to `2.4` (`-36.8%`),
- improved precision from `0.3467` to `0.46` (`+32.7%`),
- kept recall at `1.0`,
- reduced duration from `42812.6ms` to `40956.2ms` (`-4.3%`).

## Details

- Persist `memoryScratchpad` through chat recording, session summary refreshes, and memory service session loading.
- Backfill scratchpads for sessions that already have summaries without regenerating the summaries.
- Expose scratchpad-derived workflow hints in the session index used by skill extraction.
- Keep the recurrence gate strict: workflow hints only route transcript reads and are not standalone evidence for creating a skill.
- Reuse the shared `loadConversationRecord()` session parser instead of maintaining a second JSONL parser in summary utilities.
- Record extraction run metadata (`turnCount`, `durationMs`, and `terminateReason`) so scratchpad impact can be measured.
- Add eval coverage for scratchpad persistence, scratchpad-vs-summary-only skill extraction behavior, and retrieval quality stats.
- Sync the prompt-contract unit test with the new workflow-hint wording so `preflight` stays green.
- Set `GEMINI_CLI_TRUST_WORKSPACE=true` for chained and nightly eval workflows so headless eval runs stay aligned with the workspace trust enforcement added in #25814.

## Related Issues

Closes #25895.

## How to Validate

1. Run `npm run preflight`. Expected result: all workspace checks pass.
2. Run `npm exec -- vitest run packages/core/src/services/sessionSummaryUtils.test.ts packages/core/src/services/memoryService.test.ts packages/core/src/agents/skill-extraction-agent.test.ts`. Expected result: focused service and prompt-contract tests pass.
3. Run `npm exec -- tsc -p packages/core/tsconfig.json --noEmit`. Expected result: core typecheck passes.
4. Run `RUN_EVALS=1 npm exec -- vitest run --config evals/vitest.config.ts -t "Session summary persists memory scratchpad for memory-saving sessions"`. Expected result: the eval passes and verifies `memoryScratchpad` is written into the resumed session log.
5. Run `RUN_EVALS=1 npm exec -- vitest run --config evals/vitest.config.ts -t "memory scratchpad improves repeated-workflow recall versus summary-only index"`. Expected result: the eval passes and scratchpad-enabled retrieval matches or beats the summary-only baseline for the repeated workflow fixture.
6. Run `RUN_EVALS=1 RUN_SCRATCHPAD_STATS=1 SCRATCHPAD_STATS_TRIALS=5 npm exec -- vitest run --config evals/vitest.config.ts -t "reports memory scratchpad retrieval statistics"`. Expected result: the eval passes and writes `evals/logs/skill_extraction_scratchpad_stats.json`.
7. Run `GEMINI_MODEL=gemini-3-pro-preview GEMINI_CLI_TRUST_WORKSPACE=true npm exec -- vitest run --config evals/vitest.config.ts evals/save_memory.eval.ts -t "Agent remembers user's favorite color"`. Expected result: the trusted eval subprocess can call `save_memory` successfully.

## Pre-Merge Checklist

- [ ] Updated relevant documentation and README (if needed)
- [x] Added/updated tests (if needed)
- [ ] Noted breaking changes (if any)
- [x] Validated on required platforms/methods:
  - [x] MacOS
    - [x] npm run
    - [ ] npx
    - [ ] Docker
    - [ ] Podman
    - [ ] Seatbelt
  - [ ] Windows
    - [ ] npm run
    - [ ] npx
    - [ ] Docker
  - [ ] Linux
    - [ ] npm run
    - [ ] npx
    - [ ] Docker

## Changed files

- `.github/workflows/chained_e2e.yml` (modified, +1/-0)
- `.github/workflows/evals-nightly.yml` (modified, +1/-0)
- `evals/save_memory.eval.ts` (modified, +163/-0)
- `evals/skill_extraction.eval.ts` (modified, +647/-7)
- `packages/core/src/agents/local-executor.ts` (modified, +8/-0)
- `packages/core/src/agents/skill-extraction-agent.test.ts` (modified, +4/-2)
- `packages/core/src/agents/skill-extraction-agent.ts` (modified, +5/-4)
- `packages/core/src/agents/types.ts` (modified, +2/-0)
- `packages/core/src/services/chatRecordingService.ts` (modified, +26/-0)
- `packages/core/src/services/chatRecordingTypes.ts` (modified, +15/-0)
- `packages/core/src/services/memoryService.test.ts` (modified, +249/-12)
- `packages/core/src/services/memoryService.ts` (modified, +104/-83)
- `packages/core/src/services/sessionScratchpadUtils.ts` (added, +122/-0)
- `packages/core/src/services/sessionSummaryUtils.test.ts` (modified, +371/-16)
- `packages/core/src
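
The percentage changes quoted in the PR Summary can be re-derived from the before/after figures. The sketch below is only an arithmetic check; the helper name `relative_change` is hypothetical and is not part of the gemini-cli codebase.

```python
def relative_change(before: float, after: float) -> float:
    """Return the percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Before/after pairs as quoted in the PR Summary (5-trial scratchpad stats eval).
reported = {
    "extractor turns": (13.2, 11.0),
    "distractor reads": (3.8, 2.4),
    "precision": (0.3467, 0.46),
    "duration (ms)": (42812.6, 40956.2),
}

changes = {
    name: round(relative_change(before, after), 1)
    for name, (before, after) in reported.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

The rounded results agree with the percentages stated in the Summary.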

google-gemini/gemini-cli#25895 • Fetched 2026-04-24 06:13:27

Problem

Auto-memory skill extraction relies too heavily on compact session summaries when deciding which prior sessions are worth reading. Those summaries are useful, but they can lose the workflow details that matter for recurring skill detection: tool sequence, touched files, validation outcome, and whether the session was actually part of the repeated workflow.

That makes extraction more likely to read distractor sessions or miss relevant recurrence evidence.

Expected Outcome

Persist lightweight workflow metadata with session records so skill extraction can route to the right transcripts more reliably, while still requiring transcript reads before creating a skill.

Proposed Fix

Store a memoryScratchpad in session metadata with workflow summary, tool sequence, touched paths, and validation status.
Backfill scratchpads without regenerating existing summaries.
Include scratchpad-derived workflow hints in the session index used by skill extraction.
Keep the recurrence gate strict: scratchpad hints route transcript reads, but do not count as standalone skill evidence.
Add eval coverage comparing scratchpad-enabled extraction against summary-only retrieval and collect extraction quality stats.
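
The strict recurrence gate described above can be sketched as follows. This is a minimal illustration, not the extraction agent's actual logic: `IndexEntry`, `route_reads`, `confirmed_recurrences`, `should_create_skill`, and the two-occurrence threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class IndexEntry:
    """One row of a hypothetical session index."""
    session_id: str
    summary: str
    workflow_hint: Optional[str] = None  # derived from memoryScratchpad when present


def route_reads(entries: List[IndexEntry], keyword: str) -> List[str]:
    """Use summaries and hints only to decide which transcripts to open."""
    kw = keyword.lower()
    return [
        e.session_id
        for e in entries
        if kw in e.summary.lower() or kw in (e.workflow_hint or "").lower()
    ]


def confirmed_recurrences(
    entries: List[IndexEntry],
    keyword: str,
    transcript_confirms: Callable[[str], bool],
) -> List[str]:
    """Count a session only after its transcript has been read and confirmed."""
    return [sid for sid in route_reads(entries, keyword) if transcript_confirms(sid)]


def should_create_skill(confirmed: List[str], min_occurrences: int = 2) -> bool:
    """Strict gate: a hint match alone never counts as evidence."""
    return len(set(confirmed)) >= min_occurrences


entries = [
    IndexEntry("s1", "Fixed CI", "run lint > run tests > update changelog"),
    IndexEntry("s2", "Release prep", "update changelog > tag release"),
    IndexEntry("s3", "Docs typo"),
]
# Stand-in for reading transcripts: only s1 really contains the workflow.
transcripts = {"s1": True, "s2": False, "s3": False}

routed = route_reads(entries, "changelog")
confirmed = confirmed_recurrences(entries, "changelog", lambda sid: transcripts[sid])
print(routed, confirmed, should_create_skill(confirmed))
```

In this example two sessions match on their hints, but only one transcript confirms the workflow, so the gate stays closed.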

Acceptance Criteria

Session summary refreshes persist memoryScratchpad for memory-saving sessions.
Skill extraction can use scratchpad workflow hints to reduce irrelevant transcript reads.
Existing summary loading continues to reuse the shared session log parser.
Behavioral evals cover scratchpad persistence and scratchpad-vs-summary-only retrieval.
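
For the scratchpad-vs-summary-only comparison, retrieval quality is naturally expressed as precision and recall over the transcripts the extractor chose to read. The issue does not spell out how the evals compute these, so the sketch below assumes the standard set-based definitions; `precision_recall` and the session IDs are hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall(read_ids: Iterable[str], relevant_ids: Iterable[str]) -> Tuple[float, float]:
    """Standard set-based precision and recall over transcript reads."""
    read, relevant = set(read_ids), set(relevant_ids)
    hits = read & relevant
    precision = len(hits) / len(read) if read else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Five transcripts read, two of which belong to the repeated workflow.
precision, recall = precision_recall(
    read_ids=["s1", "s2", "s3", "s4", "s5"],
    relevant_ids=["s1", "s2"],
)
print(precision, recall)
```

Under these definitions, fewer distractor reads raise precision while recall stays fixed as long as every relevant transcript is still read.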

extent analysis

TL;DR

Store workflow metadata in a memoryScratchpad to improve skill extraction accuracy by providing more detailed session information.

Guidance

To address the issue, consider implementing the proposed fix of storing a memoryScratchpad in session metadata, which includes workflow summary, tool sequence, touched paths, and validation status.
Backfilling existing sessions with memoryScratchpad data without regenerating summaries can help leverage historical data for improved skill extraction.
Including scratchpad-derived workflow hints in the session index used by skill extraction can help reduce irrelevant transcript reads.
Ensure that the recurrence gate remains strict, using scratchpad hints to route transcript reads but not as standalone skill evidence.
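
One way to picture the backfill step is a pass that adds a scratchpad only where a summary already exists and leaves that summary untouched. This is a sketch under assumptions: sessions are modeled as plain dicts, and `backfill_scratchpads` and `build_scratchpad` are hypothetical names rather than functions from the PR.

```python
from typing import Callable, Dict, List


def backfill_scratchpads(
    sessions: List[Dict],
    build_scratchpad: Callable[[Dict], Dict],
) -> int:
    """Add memoryScratchpad to summarized sessions that lack one.

    Existing summaries and existing scratchpads are never rewritten.
    Returns the number of sessions that were backfilled.
    """
    updated = 0
    for session in sessions:
        if session.get("summary") and "memoryScratchpad" not in session:
            session["memoryScratchpad"] = build_scratchpad(session)
            updated += 1
    return updated


sessions = [
    {"id": "s1", "summary": "Fixed CI"},                              # needs backfill
    {"id": "s2", "summary": "Release prep", "memoryScratchpad": {}},  # already has one
    {"id": "s3"},                                                     # no summary yet
]

count = backfill_scratchpads(sessions, lambda s: {"workflow_summary": s["summary"]})
print(count, sessions[0])
```

Sessions without a summary are skipped here on the assumption that the regular summary refresh handles them.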

Example

No explicit code example is provided in the issue, but the proposed fix suggests adding a memoryScratchpad field to session metadata, which could be implemented as follows:

session_metadata = {
    # ... existing fields ...
    'memoryScratchpad': {
        'workflow_summary': 'summary_data',
        'tool_sequence': ['tool1', 'tool2'],
        'touched_paths': ['/path1', '/path2'],
        'validation_status': 'success'
    }
}
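
To show how such a scratchpad might feed the session index, the sketch below flattens it into a compact one-line workflow hint. The field names follow the example above; the function `workflow_hint` and the output format are assumptions for illustration and do not reflect the actual index format in gemini-cli.

```python
from typing import Dict


def workflow_hint(scratchpad: Dict, max_items: int = 3) -> str:
    """Render a scratchpad as a short, single-line hint for a session index."""
    parts = [scratchpad.get("workflow_summary", "")]
    tools = scratchpad.get("tool_sequence", [])[:max_items]
    paths = scratchpad.get("touched_paths", [])[:max_items]
    status = scratchpad.get("validation_status")
    if tools:
        parts.append("tools: " + " > ".join(tools))
    if paths:
        parts.append("paths: " + ", ".join(paths))
    if status:
        parts.append("validation: " + status)
    # Drop empty parts so missing fields do not leave stray separators.
    return " | ".join(p for p in parts if p)


hint = workflow_hint({
    "workflow_summary": "summary_data",
    "tool_sequence": ["tool1", "tool2"],
    "touched_paths": ["/path1", "/path2"],
    "validation_status": "success",
})
print(hint)
```

Capping the number of tools and paths keeps the hint compact, which matters because the index is meant to route transcript reads rather than replace them.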

Notes

The proposed fix assumes that the existing session log parser can be reused for loading summaries, and that behavioral evaluations will be conducted to compare scratchpad-enabled extraction with summary-only retrieval.

Recommendation

Apply the proposed fix: store memoryScratchpad data in session metadata and expose it as workflow hints in the session index used by skill extraction. PR #25873 implements this and, in its 5-trial eval, reports higher retrieval precision and fewer distractor reads with recall unchanged.
