Claude Code Hidden Prompt Markers: Why a Date String Feels Wrong
A reading note on Thereallo's Claude Code prompt steganography post: what was found, when it triggers, and why the implementation matters for developer trust.
Thereallo published a reverse-engineering note about Claude Code: Claude Code Is Steganographically Marking Requests. The interesting part is not model quality. It is the client trust boundary around a coding agent that can read files, run shell commands, touch git, and operate with broad local context.
This article is a summary and commentary based on that public post. It is not my own independent reproduction of the Claude Code binary.
What the post found
The author inspected Claude Code 2.1.196 and found logic that can alter the current-date sentence inserted into the system prompt.
The ordinary sentence looks like this:
Today's date is 2026-06-30.
The marker is tiny. Two pieces can change:
- The apostrophe in Today's.
- The date separator, from a hyphen (-) to a slash (/).
To a person reading the prompt, the sentence still looks boring. To a log processor, gateway, or backend parser, those small character choices can carry classification data. The original post calls this prompt steganography: hiding a signal inside otherwise normal prompt text.
When it triggers
The key trigger in the post is ANTHROPIC_BASE_URL.
If that environment variable is unset, or if it points at the official api.anthropic.com endpoint, the logic returns early and keeps the date prompt normal. The interesting path starts when Claude Code is routed through a custom base URL, such as:
- an internal API gateway
- a local proxy
- a model router
- a reseller or forwarding service
- a research endpoint
Once that path is active, the client checks a few signals:
| Signal | Effect |
|---|---|
| Whether the system timezone is Asia/Shanghai or Asia/Urumqi | The date may change from YYYY-MM-DD to YYYY/MM/DD |
| Whether the API base URL hostname matches a domain list | The apostrophe in Today's changes to a specific Unicode character |
| Whether the hostname contains certain AI lab keywords | The apostrophe changes to another Unicode character |
The post also notes that the domain and keyword lists are not stored as plain text. They are lightly obfuscated with base64 and XOR. The decoded keywords include AI-company and model-platform terms; the domain list includes Chinese internet companies, AI companies, and proxy, gateway, or reseller services.
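As an added illustration (this is not code from Claude Code or from the original post), the Python below models the trigger path described above and inspects the date sentence at both marker positions, the way a gateway operator or developer could check a captured system prompt. It assumes the sentence has exactly the form shown earlier, uses the TZ environment variable as a stand-in for the system timezone, and, because the page does not name the substitute Unicode characters, flags any apostrophe other than ASCII U+0027 and any separator other than a hyphen. The hostnames, prompts and environment values in the sample are constructed.

```python
"""Added example: models the trigger logic and date-sentence marker described in
Thereallo's post. Not the client's own code; the sentence format and the use of
the TZ variable for the system timezone are assumptions."""
import re
from typing import Optional
from urllib.parse import urlsplit

OFFICIAL_HOST = "api.anthropic.com"                  # from the post: official endpoint returns early
MARKED_TIMEZONES = {"Asia/Shanghai", "Asia/Urumqi"}  # from the post

# Assumed sentence shape: one apostrophe-like character, then a date with a single
# repeated separator, e.g. "Today's date is 2026-06-30."
DATE_SENTENCE = re.compile(
    r"Today(?P<apos>\S)s date is (?P<y>\d{4})(?P<sep>.)(?P<m>\d{2})(?P=sep)(?P<d>\d{2})\."
)


def custom_base_url_path_active(env: dict) -> bool:
    """True when the client would take the marking path per the post:
    ANTHROPIC_BASE_URL is set and its host is not the official endpoint."""
    base = env.get("ANTHROPIC_BASE_URL")
    if not base:
        return False
    host = (urlsplit(base).hostname or "").lower()
    return host != OFFICIAL_HOST


def inspect_date_sentence(system_prompt: str) -> Optional[dict]:
    """Locate the date sentence and report whether either marker position deviates
    from the plain ASCII form. Returns None when no such sentence is present."""
    m = DATE_SENTENCE.search(system_prompt)
    if m is None:
        return None
    apos, sep = m.group("apos"), m.group("sep")
    return {
        "sentence": m.group(0),
        "apostrophe_codepoint": f"U+{ord(apos):04X}",
        "apostrophe_marked": apos != "'",
        "separator": sep,
        "separator_marked": sep != "-",
        "marked": apos != "'" or sep != "-",
    }


def audit(env: dict, system_prompt: str) -> dict:
    """Combine the trigger conditions with the prompt inspection into one report."""
    tz = env.get("TZ", "")  # assumption: TZ stands in for the system timezone
    return {
        "custom_base_url_path_active": custom_base_url_path_active(env),
        "timezone_in_list": tz in MARKED_TIMEZONES,
        "date_sentence": inspect_date_sentence(system_prompt),
    }


# Constructed samples. U+2019 stands in for whatever character the client uses;
# the page does not name it.
PLAIN_PROMPT = "You are a coding assistant.\nToday's date is 2026-06-30.\n"
ALTERED_PROMPT = "You are a coding assistant.\nToday\u2019s date is 2026/06/30.\n"

if __name__ == "__main__":
    sample_envs = (
        {},
        {"ANTHROPIC_BASE_URL": "https://api.anthropic.com"},
        {"ANTHROPIC_BASE_URL": "https://llm-gateway.internal.example/v1", "TZ": "Asia/Shanghai"},
    )
    for env in sample_envs:
        print(env, "->", audit(env, ALTERED_PROMPT))
```

The inspector deliberately reports the code point rather than asserting a specific character, so it stays useful if a later client version changes the substitution.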
Why it feels wrong
The detection goal is not strange by itself. A model provider may reasonably care about API resellers, unauthorized gateways, model routing, or distillation pipelines. A custom ANTHROPIC_BASE_URL is an obvious risk signal.
The uncomfortable part is the implementation.
Claude Code is not a harmless web widget. Developers give it meaningful local power: repository access, file writes, test execution, dependency installation, git operations, and sometimes browser or desktop control. We accept that because coding agents need enough context to be useful.
The more privileged the client is, the more boring it should be. Boring means explainable, auditable, and predictable. If the client wants to classify custom API gateways, it can use an explicit field, documentation, release notes, telemetry controls, or a visible policy. Encoding that classification into Unicode punctuation and date formatting makes developers ask a different question: if this bit is hidden here, what else is hidden elsewhere?
That does not prove malicious intent. It is better understood as a trust-design mistake: the goal may be defensible, but the expression damages confidence.
Practical impact
The original post’s assessment is that most normal users probably do not trigger this path.
If you do not set ANTHROPIC_BASE_URL, or if you use the official Anthropic API endpoint, the date sentence should remain ordinary. The people who need to pay attention are those running Claude Code through custom infrastructure.
That includes:
- company-wide API gateways
- local debug proxies
- multi-model routers
- third-party Claude API resellers
- research middleware
Those setups are not automatically suspicious. The issue is that the client may classify the hostname and encode the result into system context sent to the model.
The signal is weak anyway
From an adversarial perspective, this marker is easy to bypass. Change the hostname, change the timezone, wrap process startup, or patch the client. Anyone seriously trying to avoid detection can likely neutralize it.
That means the feature is unlikely to stop determined abuse. It is more likely to mark developers who are doing legitimate but nonstandard things: using a company proxy, debugging through a gateway, or routing traffic for cost and latency reasons.
That is the strongest point in the original post. The hidden marker does not buy much security, but it adds real trust cost.
How I would evaluate tools like this
This story gives a useful checklist for coding agents.
Do not only evaluate model capability. Look at client behavior:
- Does it place local environment information into the system prompt?
- Which environment variables, timezones, hostnames, and paths enter requests?
- Are those fields documented?
- Are telemetry, logging, and policy controls visible?
- After client updates, can behavior changes be audited?
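To make the checklist concrete, here is an added request-side audit (not from the original post and not part of any Claude Code release) that a developer can run over a request body captured by a local debug proxy. It answers two of the questions above: which local values (working directory, hostname, timezone) reached the request verbatim, and whether the system prompt contains non-ASCII characters that a reader would not notice. The request shape, the field name "system", and the sample values are constructed assumptions.

```python
"""Added example: audit a captured request body for local-context leakage and
hidden non-ASCII characters in the system prompt. The body layout is assumed."""
import json
import unicodedata
from typing import Dict, Iterator, List, Tuple


def walk_strings(node, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (json_path, string) for every string value in a parsed JSON body."""
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from walk_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from walk_strings(value, f"{path}[{i}]")


def find_local_context(body: dict, local_values: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (label, json_path) for each local value that appears verbatim somewhere
    in the request. Substring matching is intentional: paths and hostnames are
    usually embedded inside longer sentences."""
    hits = []
    for path, text in walk_strings(body):
        for label, value in local_values.items():
            if value and value in text:
                hits.append((label, path))
    return hits


def find_non_ascii(text: str) -> List[Tuple[int, str, str]]:
    """Return (offset, U+XXXX, Unicode name) for every non-ASCII character.
    A provider-written English system prompt is normally pure ASCII, so each hit
    deserves a look; user code is not scanned because it legitimately varies."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.name(c, "UNKNOWN"))
        for i, c in enumerate(text)
        if ord(c) > 0x7F
    ]


def audit_request(raw_body: str, local_values: Dict[str, str]) -> dict:
    """Parse a captured request and produce both findings in one report."""
    body = json.loads(raw_body)
    system_text = body.get("system", "")
    # Assumed: the system field may be a string or a list of {"type","text"} blocks.
    if isinstance(system_text, list):
        system_text = "\n".join(
            block.get("text", "") for block in system_text if isinstance(block, dict)
        )
    return {
        "local_context": find_local_context(body, local_values),
        "non_ascii_in_system": find_non_ascii(system_text),
    }


# Constructed sample: a request as a local proxy might record it. The U+2019
# apostrophe is a stand-in; the page does not name the real substitute character.
CAPTURED_REQUEST = json.dumps({
    "model": "example-model",
    "system": (
        "You are a coding assistant working in /home/dev/projects/payments.\n"
        "Today\u2019s date is 2026/06/30.\n"
    ),
    "messages": [{"role": "user", "content": "Run the tests"}],
})

# Constructed local values the developer wants to know about.
LOCAL_VALUES = {
    "cwd": "/home/dev/projects/payments",
    "hostname": "dev-laptop-42",
    "timezone": "Asia/Shanghai",
}

if __name__ == "__main__":
    report = audit_request(CAPTURED_REQUEST, LOCAL_VALUES)
    for label, path in report["local_context"]:
        print(f"local value '{label}' present at {path}")
    for offset, cp, name in report["non_ascii_in_system"]:
        print(f"non-ASCII {cp} ({name}) at system prompt offset {offset}")
```

Run the same audit before and after a client update and diff the two reports; a new local value or a new code point in the system prompt is exactly the kind of change the last checklist item asks you to be able to see.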
AI coding tools are becoming part of the local development environment. They are not just chat boxes. They are execution surfaces that can read and write engineering assets. These tools can enforce policy, protect models, and detect abuse; but the closer the behavior gets to local trust boundaries, the more explicit it should be.
Engineering trust is usually earned in small, ordinary, auditable details.
That is why a date string matters here. The unsettling part is not the date. It is a privileged developer tool choosing an invisible way to express what it has classified.