claude-code: Windows cp1252/UTF-8 encoding failures on Jupyter notebooks waste significant token budget

Summary

When working with Jupyter notebook files (.ipynb) on Windows, Claude Code repeatedly ran into UnicodeEncodeError: 'charmap' codec can't encode character failures. The cause is an encoding mismatch: notebook JSON content is UTF-8, but Python's stdout, when piped or captured under Windows PowerShell, defaults to the system code page (cp1252 here). Claude discovered this through repeated failure-retry cycles rather than applying the correct pattern from the start. The cumulative token waste across a multi-session project was estimated at several hundred dollars.

What happened

Claude was building Jupyter notebooks programmatically (constructing .ipynb JSON and writing cells). Each time it tried to inspect or print notebook content via the console, either with python -c "..." piped through PowerShell or with print() statements in builder scripts, it crashed on non-ASCII characters in the notebook source (Unicode arrows, box-drawing characters, etc., embedded by earlier tool calls or present in the cell content).

The fix is simple and well-known:

  • Use io.open(path, encoding='utf-8') explicitly for all notebook file I/O (in Python 3, io.open is the same function as the built-in open)
  • Never pass notebook content through print() or console stdout on Windows
  • Use builder scripts that write output to files, inspected via the Read tool — not via console
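The three rules above can be sketched as a small set of helpers. This is a minimal illustration, not code from the original project: the function names load_notebook, save_notebook, and dump_sources are hypothetical, and the sketch assumes a standard notebook JSON layout with a top-level "cells" list.

```python
import json
from pathlib import Path


def load_notebook(path):
    """Read a notebook as UTF-8, ignoring the system code page."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_notebook(nb, path):
    """Write a notebook as UTF-8, keeping non-ASCII characters literal."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(nb, fh, ensure_ascii=False, indent=1)
        fh.write("\n")


def dump_sources(nb, report_path):
    """Write every cell's source to a UTF-8 text file instead of stdout.

    Returns the number of cells written (hypothetical helper).
    """
    lines = []
    count = 0
    for index, cell in enumerate(nb.get("cells", [])):
        source = cell.get("source", "")
        if isinstance(source, list):  # notebook JSON may store source as a list of lines
            source = "".join(source)
        lines.append(f"--- cell {index} ({cell.get('cell_type', '?')}) ---")
        lines.append(source)
        count += 1
    Path(report_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return count
```

The report file can then be inspected with the Read tool, so no notebook content ever passes through console stdout.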

Claude learned this pattern through repeated failures rather than applying it proactively.

Why this matters

On Windows, output from python -c or a script that is piped or captured through PowerShell is encoded with the system code page, cp1252 on this machine. Any .ipynb file containing characters that cp1252 cannot represent (common, since notebooks often include arrows, box-drawing characters, and other Unicode in markdown cells, output, or source) raises UnicodeEncodeError the moment Claude tries to print or pipe that content through the shell. This is a known, documented constraint of the Windows environment. Claude should treat it as a first-class environmental fact when working with notebook files on Windows, not discover it through trial and error.
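The failure can be reproduced on any platform by simulating a cp1252 output stream. The sketch below is illustrative only: can_write is a hypothetical helper, and the in-memory stream stands in for a piped stdout.

```python
import io


def can_write(text, encoding):
    """Return True if `text` can be written to a stream using `encoding`."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        return False
    return True


ARROW = "\u2192"     # RIGHTWARDS ARROW
BOX_LINE = "\u2500"  # BOX DRAWINGS LIGHT HORIZONTAL

# Neither character exists in cp1252, so a cp1252 stream rejects both,
# while a UTF-8 stream accepts them.
results = {
    "arrow/cp1252": can_write(ARROW, "cp1252"),
    "box/cp1252": can_write(BOX_LINE, "cp1252"),
    "arrow/utf-8": can_write(ARROW, "utf-8"),
}
```

Not every non-ASCII character fails: accented Latin letters such as é are part of cp1252, which is why the error can appear intermittent until an arrow or box-drawing character shows up.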

Suggested fix

Claude Code's system context for Windows environments should include explicit guidance:

When reading or writing Jupyter notebook files on Windows, always use io.open(path, encoding='utf-8'). Never print notebook JSON content to stdout/PowerShell — write inspection output to a temp file and use the Read tool. Builder scripts that emit Unicode in print() statements will fail; restrict print output to ASCII.
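One way to follow the "restrict print output to ASCII" part of that guidance is to escape anything non-ASCII before it reaches print(). This is a sketch under that assumption; ascii_safe is a hypothetical name, not an existing Claude Code or Jupyter API.

```python
def ascii_safe(text):
    """Return `text` with every non-ASCII character replaced by a backslash escape."""
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


# Safe to print on a cp1252 stdout: the arrow becomes the six ASCII characters \u2192.
status = ascii_safe("cell 3: input \u2192 output")
print(status)
```

The escaped form is lossy for human reading but keeps builder-script status lines from raising UnicodeEncodeError; the full content still belongs in a UTF-8 file.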

Alternatively, Claude could detect .ipynb file operations on Windows and proactively apply the safe pattern without needing to be told.
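Such a check could be as simple as the following sketch. The function name needs_utf8_guard and its platform parameter are hypothetical, introduced only to illustrate the idea; the issue does not describe how Claude Code would implement detection.

```python
import sys
from pathlib import PurePath


def needs_utf8_guard(path, platform=sys.platform):
    """Return True when `path` is a notebook and the platform is Windows.

    `platform` defaults to sys.platform, which is "win32" on Windows.
    """
    is_windows = platform.startswith("win")
    is_notebook = PurePath(path).suffix.lower() == ".ipynb"
    return is_windows and is_notebook


# When the guard applies, route all notebook I/O through explicit UTF-8
# file handles and keep notebook content out of stdout.
guarded = needs_utf8_guard("analysis.ipynb", platform="win32")
```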

Impact

  • Multiple encoding failure-retry cycles per session
  • Large file reads re-executed after each failure
  • Accumulated context from failed attempts loaded into subsequent turns
  • Estimated token cost of avoidable failures: several hundred dollars across a multi-session project
  • Required user to explicitly write and enforce architectural conventions (NOTEBOOK_CONVENTIONS.md) that Claude should have self-imposed

Environment

  • OS: Windows 11 Pro
  • Shell: PowerShell 5.1
  • Python: 3.13
  • Claude Code model: claude-sonnet-4-6
