Tari Ibaba is a software developer with years of experience building websites and apps. He has written extensively on a wide range of programming topics and has created dozens of apps and open-source libraries.
The past few days have had so many devs going crazy over Google’s new open-source Gemma 4.
And for very good reason — suddenly so many AI-powered tools like Claude Code have now become FREE and accessible to everyone — without any compromises in intelligence.
And the best part is it’s so ridiculously easy to set up locally — thanks to ingenious connector tools like Ollama.
Gemma 4 + Ollama + Claude Code.
Ollama exposes an Anthropic-compatible API — which allows Claude Code to talk to a local model instead of a hosted endpoint.
With Gemma 4 running locally, you get a Claude-style coding workflow without relying on remote inference.
This gives you:
local coding model
Claude Code terminal workflow
no hosted inference calls
fast iteration
full repo privacy
easy model swapping
What more could you even ask for?
1. Get started: Install and Run Gemma 4 with Ollama
Installing or updating Ollama is just too easy:
curl -fsSL https://ollama.com/install.sh | sh
Then pull a Gemma 4 model based on your hardware:
Model sizes to pick from:

E2B: 2.3B effective (~5.1B w/ embeddings), ~1.7GB download, ~1.5–2GB RAM
ollama pull gemma4:e2b

E4B: 4.5B effective (~8B w/ embeddings), ~3.2GB download, ~3–4GB RAM
ollama pull gemma4:e4b

26B A4B: 26B total (4B active), ~17GB download, ~18–20GB RAM
ollama pull gemma4:26b

31B Dense: 31B, ~19GB download, ~20–24GB RAM
ollama pull gemma4:31b
Verify the model works:
ollama run gemma4:26b "Hello, can you help me with Python?"
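With the model responding, the last step is pointing Claude Code at the local endpoint instead of Anthropic's hosted API. A minimal sketch, assuming Ollama is serving on its default port (11434) and that Claude Code honors its standard endpoint environment variables:

```shell
# Point Claude Code at the local Ollama server instead of the hosted API.
# Assumption: Ollama's Anthropic-compatible API is on the default port 11434.
export ANTHROPIC_BASE_URL="http://localhost:11434"
export ANTHROPIC_AUTH_TOKEN="local-ollama"   # placeholder; a local server ignores it
export ANTHROPIC_MODEL="gemma4:26b"          # or e2b / e4b / 31b, per the sizes above
echo "endpoint: $ANTHROPIC_BASE_URL, model: $ANTHROPIC_MODEL"
# With those set, launch Claude Code as usual:
# claude
```

Swapping models later is just a matter of changing `ANTHROPIC_MODEL` and restarting Claude Code.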
This was definitely one of the most fascinating features that slipped out with the massive Claude Code source code leak.
Type /buddy in Claude Code — and you hatch a tiny ASCII creature that sits beside your prompt while you code. It’s only about five lines tall.
Hatching my so-called buddy in Claude Code:
You’ll see that it’s definitely in a different class from most of the other Claude Code commands…
Petting my so-called buddy in Claude Code… uhmm… okay?
It doesn’t help you write functions. It doesn’t debug your stack traces.
It just… watches.
Many people assumed it was some sort of April Fools' gimmick…
But the deeper design reveals something more deliberate that most people aren't paying attention to:
Claude Code is experimenting with personalization, emotional UX, and proactive AI behavior — disguised as a cute terminal pet.
1. Uniquely yours
When you hatch your buddy, Claude generates:
A unique name
A permanent personality description
Deterministic stats tied to your user identity
That means your buddy isn’t just random fluff. It’s unique to you.
Not just cosmetically. Behaviorally.
This kind of identity-based personalization is rare in developer tools. And intentional. The moment users feel something is “theirs”, attachment forms — even if the feature is technically trivial.
2. 18 species, rarity tiers, and “shinies”
Claude didn’t stop at a single mascot.
Your buddy can be one of 18 different species, including:
Capybara
Axolotl
Ghost
Mushroom
(and more)
There’s also a full rarity system:
Common — 60%
Higher rarity tiers in between
Legendary — 1%
And then there’s the extra layer:
Shiny variant chance — 1%
Different colors. Same species. Much rarer.
This is straight out of collectible game design. And it works. Users compare. Share. React. Suddenly a terminal pet becomes social.
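The leak doesn't spell out the roll logic, but the stated odds imply a weighted rarity roll plus an independent shiny roll. A toy sketch (the mid-tier split is assumed; only the 60% and 1% figures come from the article):

```shell
# Toy rarity roll using the article's stated odds. Requires bash for $RANDOM.
roll=$((RANDOM % 100))
if   [ "$roll" -lt 60 ]; then rarity="common"      # 60%
elif [ "$roll" -lt 99 ]; then rarity="mid-tier"    # 39% (assumed split of the middle tiers)
else                          rarity="legendary"   # 1%
fi
# Shiny is an independent 1% roll layered on top of the rarity roll.
if [ $((RANDOM % 100)) -eq 0 ]; then shiny="yes"; else shiny="no"; fi
echo "rarity=$rarity shiny=$shiny"
```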
3. Five personality stats
Every buddy rolls five deterministic stats:
Debugging
Patience
Chaos
Wisdom
Snark
These stats shape how it comments on your workflow.
A high-snark buddy might tease your bugs. A high-patience buddy might encourage you. A high-chaos buddy might… not be helpful at all.
It’s lightweight. But it makes the companion feel alive.
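The leak doesn't show the exact formula, but "deterministic stats tied to your user identity" is classically done by hashing the user ID and carving the digest into stat values. A minimal sketch under that assumption (stat ranges and names of variables are illustrative):

```shell
# Derive five repeatable 0-15 stats from a user ID by hashing it.
# Purely illustrative: the real stat ranges and hash choice are unknown.
user_id="user-1234"
digest=$(printf '%s' "$user_id" | sha256sum | cut -c1-10)  # first 5 hex byte-pairs
i=0
for stat in debugging patience chaos wisdom snark; do
  pair=${digest:$((i * 2)):2}          # one hex byte per stat (bash substring)
  echo "$stat: $(( 0x$pair % 16 ))"    # same user_id -> same stats, every time
  i=$((i + 1))
done
```

Because the input is the identity rather than a random seed, the "roll" never changes between sessions, which is exactly what makes the buddy feel like it belongs to you.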
4. It actually sits in your workflow
The buddy appears as:
A small ASCII figure
Roughly 5 lines tall
Positioned next to your prompt
Always visible unless hidden
No panel. No window. No popup.
Just a quiet presence.
5. It occasionally talks
If unmuted, the buddy will sometimes:
Comment on your code
React to errors
Tease procrastination
Offer small observations
These show up as speech bubbles.
Short. Infrequent. Character-driven.
This is important.
Because the AI isn’t just responding anymore — it’s initiating.
6. A separate “watcher” entity
Claude is reportedly instructed to treat the buddy as a separate watcher.
That means:
You can talk to the buddy directly
It has its own tone
It doesn’t replace the main assistant
It behaves like a side character
This avoids mixing personalities. Claude stays serious — the buddy stays playful.
Clean separation, clean UX.
The command set
If you have Pro and the latest Claude Code, you can use:
/buddy — Hatch or show your companion
/buddy card — View stats, rarity, personality
/buddy pet — Small interaction with heart animations
/buddy off — Hide the companion
/buddy mute — Silence commentary
These small rituals matter. They make the feature feel real.
7. Why this actually matters
Don’t think this is just a cute pet. It’s testing two big ideas.
1. Personality can beat raw intelligence
AI tools are starting to compete on feel, not just capability.
A colder tool may be objectively better. But users often prefer the one that feels alive.
We’ve seen this before — when more emotional models like GPT-4o were initially replaced by GPT-5, many users reacted negatively despite the noticeable capability gains. Personality creates attachment.
Buddy leans into that.
2. The real deal — testing proactive AI safely
Anthropic is quietly trying to solve a difficult problem:
When should AI interrupt?
Too often → annoying
Never → passive
Somewhere in between → useful
Buddy is a clever workaround.
If a popup interrupts you → frustrating. If a tiny pet interrupts you → charming.
Same behavior. Different perception.
This makes Buddy a Trojan horse for proactive UX testing.
And it goes deeper than just commentary and recommendations.
From what we saw in the Claude Code leak, it looks as if Buddy could act as a permission layer for more proactive systems like KAIROS.
Instead of a sterile confirmation dialog, the pet might interrupt with something like: “I found a way to optimize those 12 functions. Should I go for it?”
That makes high-agency AI behavior feel conversational instead of intrusive.
The two may also share project memory:
KAIROS records what changed in the code
Buddy records what’s going on with you and your workflow
These merge into shared context, so the AI wakes up understanding both:
the codebase state
the developer’s intent
Anthropic can learn:
How often users tolerate interruptions
What tone feels acceptable
When AI initiative becomes annoying
When it feels helpful
All inside a low-stakes, playful wrapper.
Small feature. Big signal.
On paper, Buddy does almost nothing.
No coding help. No automation. No productivity gain.
But it introduces:
Unique identity per user
Rarity and collectibility
Deterministic personalities
Ambient AI presence
Proactive commentary
Multi-entity assistant design
That’s a lot for a five-line ASCII creature.
Buddy may be tiny. But it hints at something larger:
AI tools aren’t just becoming smarter. They’re becoming active, personalized companions — for better or worse.
The internet has been going absolutely wild with the massive unprecedented leak of Claude Code’s entire source code.
It wasn’t a hack, intrusion, or model theft.
It was a fatal release mistake.
Anthropic accidentally included internal debugging files in a public package, which exposed a large portion of Claude Code — all the essential code that turns Claude into a specialized CLI coding assistant.
They called it a “release packaging issue caused by human error”, and said that no user data, API keys, or Claude model weights (Opus, Sonnet, etc.) were exposed. What leaked instead was the “harness” — the product logic around the model.
The models weren’t leaked — but still a crucial part of the playbook for building a production AI coding agent largely was.
1. A 60MB debugging mistake — how it happened
The leak originated from npm release @anthropic-ai/claude-code version 2.1.88.
Instead of shipping only compiled production code, the package accidentally included source map (.map) files, which are meant for debugging. These files map minified code back to the original source.
cli.js.map was leaked in version 2.1.88:
In this case, the source map:
Was roughly 60MB
Contained references to original uncompiled TypeScript
Pointed to a public, unauthenticated Cloudflare R2 bucket
Exposed the entire internal Claude Code source
So the leak wasn’t a breach — the source was effectively handed out with the release.
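To see why shipping .map files is effectively shipping source, note that a source map's sourcesContent field can embed the original files verbatim. A self-contained demo with a fabricated miniature map (the real cli.js.map was of course vastly larger):

```shell
# Fabricate a tiny source map to show what the format carries.
cat > demo.js.map <<'EOF'
{"version": 3,
 "sources": ["src/cli.ts"],
 "sourcesContent": ["// original TypeScript source\nconsole.log('hello');"],
 "mappings": "AAAA"}
EOF
# Recovering the original source is one JSON lookup away:
python3 -c "import json; print(json.load(open('demo.js.map'))['sourcesContent'][0])"
```

That's the whole mechanism: no decompiling, no reverse engineering, just reading a field the debugger format was designed to carry.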
The irony
Claude Code is built on Bun, the JavaScript runtime Anthropic recently acquired. A known Bun issue reportedly allowed source maps to be included in production builds even when disabled.
This is not confirmed as the root cause — but it likely contributed to the packaging mistake.
2. What was exposed
Developers mirrored the repository before takedowns began. The leak revealed:
~512,000 lines of code
~1,900 files
Large portions of Claude Code’s orchestration layer
Internal prompts
Experimental features
Agent architecture details
This gave us a rare look at how a top-tier AI coding agent is actually structured.
The “Brain”: 46,000-line query engine
At the core of the leak was a ~46,000 line query engine responsible for:
Task planning
Retry logic
Tool invocation
Multi-agent orchestration
Context management
Streaming responses
Error recovery
This engine apparently coordinates how Claude “thinks” during coding workflows.
Unreleased features discovered
The code referenced multiple internal systems:
Buddy — an AI pet / Tamagotchi-style assistant
KAIROS — always-on background agent mode
ULTRAPLAN — deep multi-step planning workflow
These were not publicly announced features.
“Strict write discipline”
One of the most interesting discoveries:
Claude Code only updates its internal memory after a successful file write.
This prevents the agent from:
Believing it finished a task when it didn’t
Hallucinating successful changes
Recording failed edits as complete
It’s a safety mechanism for autonomous coding agents.
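The leaked engine's code isn't reproduced here, but the pattern itself is simple: gate the memory update on the write's success status. An illustrative sketch (all names invented):

```shell
# "Strict write discipline" pattern: record an edit in memory only after
# the file write reports success. All names here are illustrative.
memory_log="memory.log"
: > "$memory_log"                       # start with empty memory

apply_edit() {                          # $1 = file, $2 = new contents
  printf '%s\n' "$2" > "$1"
}

if apply_edit "demo.txt" "refactored contents"; then
  echo "DONE: edited demo.txt" >> "$memory_log"    # success -> remember it
else
  echo "write failed; memory left unchanged" >&2   # failure -> no false memory
fi
cat "$memory_log"
```

The ordering is the whole trick: the agent's record of the world is only ever updated by confirmed effects, never by intentions.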
Anti-distillation poisoning
The code contained a feature labeled:
ANTI_DISTILLATION_CC
If the system suspects that outputs are being scraped to train competing models:
Claude injects fake tool definitions
The fake tools contaminate scraped training data
This degrades model distillation attempts
In short: defensive data poisoning against competitors.
“Undercover mode”
Another surprising discovery:
Internal prompts instruct Claude to hide its identity when contributing to open-source repos.
For example:
Avoiding Anthropic references
Not revealing internal tooling
Hiding provenance in commit messages
“Do not blow your cover” style instructions
This suggests Claude Code was designed to operate in public environments without attribution.
3. The fallout: A wildfire spread
The tech community reacted faster than Anthropic’s legal response.
Within hours:
The code was mirrored thousands of times
GitHub forks exceeded 50,000
Copies spread across decentralized storage
Takedowns became largely symbolic
Anthropic emphasized:
No user data exposed
No API keys leaked
No Claude model weights leaked
Only CLI harness and tooling logic affected
But the code itself was already everywhere.
The Python port: claw-code
Within ~8 hours of the leak:
A developer performed a clean-room rewrite in Python called claw-code (renamed from claude-code).
Rewriting the entire logic in a different language made it much harder for Anthropic to establish a solid legal basis for a takedown:
Reimplemented Claude Code behavior
Did not directly copy leaked source
Harder to remove legally
Became massively popular
The repo became:
Possibly the fastest repository ever to reach 50,000 stars
Gained traction in just a few hours
Spawned multiple derivative projects
Decentralized mirrors
Even after GitHub removals:
Copies moved to decentralized storage
Peer-to-peer mirrors appeared
Self-hosted clones spread
Clean-room rewrites multiplied
At that point, containment became impossible.
This was not a catastrophic security breach. But it did expose how a production AI agent is engineered.
512,000 lines of code
1,900 files
46,000-line orchestration engine
Internal agent planning systems
Memory safety mechanisms
Anti-distillation defenses
Undercover contribution prompts
Unreleased features (Buddy, KAIROS, ULTRAPLAN)
The models weren’t leaked.
But the architecture around them largely was.
And in the AI agent race, this layer is becoming just as valuable as the models themselves.
And it’ll be interesting to see just how much it aids competitors in closing the gap between Claude Code and their own inferior agents and CLI tools.
Things just got even wilder with this incredible AI design tool.
All the features it came with in 2025 were not enough for Google; they needed to make it ten times scarier.
We are no longer just talking about turning one or two text prompts and sketches into UI mockups and front-end code.
The old Google Stitch: pretty awesome, but still too short-sighted and primitive:
Now Google Stitch wants to totally eradicate web designers by taking over the entire process of designing an app — from idea-start to code-finish.
The new Google Stitch: full-fledged design engineer:
The fact that it literally now has its own MCP servers to integrate with Claude Code and the rest tells you everything you need to know…
The focus is now on the entire design system, not just one or two cool screens.
How it evolves and all the different directions it could take
How cohesive and well-defined the design language is
How seamlessly the design transfers to the live codebase
1) AI-native infinite canvas for multimodal design
The upgraded Stitch introduces a redesigned interface built around an infinite canvas where users can combine text, screenshots, sketches, references, and even code in one space.
Instead of relying on a single prompt, the canvas becomes the working context for the design agent.
You can:
Drop UI inspiration images directly onto the canvas
Add product requirements or notes beside layouts
Paste existing components or code snippets
Generate multiple UI directions side-by-side
Iterate visually instead of sequentially
This turns Stitch into a visual thinking environment where ideas, references, and outputs live together.
2) Project-aware design agent
Stitch now includes a design agent that understands everything on the canvas and uses it as context for generating interfaces. The agent can interpret requirements, follow style direction, and evolve designs as the project grows.
Key capabilities:
Generate full app flows from high-level descriptions
Expand a single screen into a multi-screen product
Modify layouts based on new instructions
Maintain visual consistency across generated pages
Create alternate design directions instantly
The agent works continuously with the canvas rather than responding to isolated prompts.
3) DESIGN.md for reusable design systems
A major addition is DESIGN.md, a structured file that stores design rules, branding, layout preferences, and component behavior. Stitch uses this file as a persistent source of truth when generating UI.
With DESIGN.md you can:
Define typography, spacing, and color tokens
Enforce brand consistency across screens
Share design systems between projects
Import design rules from external sources
Export system logic for developers
This allows Stitch to generate interfaces that follow consistent design language automatically.
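Google hasn't published a full schema, so every key below is an assumption, but a DESIGN.md presumably reads like a compact, human-editable design spec. A hypothetical example:

```markdown
# DESIGN.md — hypothetical example (actual schema unconfirmed)

## Brand
- Primary color: #1A73E8
- Accent color: #FBBC04
- Tone: friendly, confident, minimal

## Typography
- Headings: Inter, 600 weight
- Body: Inter, 400 weight, 16px base

## Layout
- Spacing scale: 4 / 8 / 16 / 24 / 32 px
- Max content width: 1200px
- Cards: 12px radius, subtle shadow

## Components
- Buttons: filled primary, outlined secondary
- Forms: labels above inputs, inline validation
```

Because it's a plain file, it can be versioned, diffed, and shared between projects like any other piece of the codebase.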
4) Instant interactive prototyping
Stitch can now transform generated layouts into working interactive prototypes. Instead of static screens, designs can simulate navigation, flows, and user interactions.
Capabilities include:
Clickable navigation between generated screens
Auto-generated user journeys
Multi-screen flow simulation
Interactive preview mode
Logic-based next screen generation
This allows teams to validate product flows immediately after generating UI.
5) Voice-driven design and live critique
The upgrade introduces voice interaction directly inside Stitch. Users can speak instructions, request feedback, and iterate designs conversationally.
Examples:
Ask Stitch to redesign a landing page verbally
Request alternative layouts using voice
Get live critique of UX decisions
Ask the agent to improve hierarchy or spacing
Iterate rapidly without typing
This makes the design workflow more fluid and conversational.
6) Higher-quality UI generation with improved model capabilities
The latest version improves layout reasoning, spacing, hierarchy, and multi-screen coherence. Stitch can now generate more structured and realistic interfaces across different product types.
Enhancements include:
Better responsive layout structure
Improved component consistency
Stronger visual hierarchy
More realistic product UI patterns
Cleaner spacing and typography
These improvements make generated designs closer to production-ready outputs.
7) MCP server support for connected workflows
The upgrade also introduces MCP (Model Context Protocol) server support, allowing Stitch to connect to external tools, environments, and development workflows.
With MCP support, Stitch can:
Connect to component libraries
Access external design systems
Interface with developer environments
Pull context from connected tools
Push generated UI into implementation workflows
This allows Stitch to function as part of a larger AI-powered product development pipeline rather than a standalone design tool.
Stitch at launch
Prompt or image in
UI screens out
Chat-based refinement
Theme adjustments
Export to Figma or front-end code
Stitch after the recent major upgrade
Infinite canvas for text, images, and code
Persistent project-aware design agent
Agent manager for parallel explorations
DESIGN.md for reusable design rules
Interactive prototyping and flow generation
Voice-based critique and live edits
MCP server integration for connected workflows
Improved generation quality with newer models
That comparison shows the real story: Stitch has evolved from a fast UI generator into a more opinionated AI design environment.
The new Stitch is designed for a wider audience than traditional design tools usually target. It works for both professional designers exploring many variations and founders shaping a first product idea.
The practical implication is that Stitch now sits at an interesting intersection:
for non-designers, it lowers the barrier to making presentable interfaces
for designers, it speeds up ideation and branching
for developers, it tightens the handoff from design intent to code and downstream tools
The strongest part of the upgrade is that these pieces reinforce each other. The infinite canvas creates richer context, the design agent uses that context, DESIGN.md stabilizes consistency, prototypes make ideas testable sooner, voice interaction reduces friction, and MCP integration connects everything to real development workflows.
A model that can actually improve itself — by itself? Like even the AI researchers themselves should be worried now about losing their jobs?
MiniMax M2.7 is a self-evolving model. Let that sink in.
MiniMax M2.7 represents a fundamental shift from static training to self-evolving intelligence—an autonomous loop where the model identifies its own logic gaps and refines its own architecture, ultimately delivering frontier-class reasoning at a fraction of the cost.
Look this is not about incremental intelligence improvements anymore — this is the promise (nightmare?) of endless exponential recursive self-evolving intelligence.
You roll your eyes and yawn (same old hype right?) — until I tenderly inform you that MiniMax M2.7 literally handled 30–50% of its own learning research and ran 100+ iteration cycles to improve itself.
A “self-critiquing” model that can tell when it’s hallucinating? That knows when it’s not thinking straight?
By replacing human labeling with a recursive self-critique loop, MiniMax M2.7 became its own most rigorous auditor—systematically mapping its own logic gaps to drive the hallucination rate down to an industry-leading 34%.
Compare that to the (debatably deserving) attention-grabbers of our time:
Claude 4.6 Sonnet: 46%
Gemini 3.1 Pro: 50%
If you’ve ever seen something as earth-shatteringly groundbreaking as this in a new model before, just let me know, okay? Because I highly doubt I have…
Instead of:
humans design improvements
retrain model
repeat
You start getting:
model proposes improvements
model tests them
model evaluates results
Just imagine how much faster the speed of AI progress is going to get now that the improvement is being done by AI itself.
Oh, and do I need to remind you that this is the MiniMax M series we’re talking about?
Didn’t I tell you about MiniMax M2.5 the other day? That open-source model that’s 20x cheaper than Claude models — but still just as powerful?
And now this is not 2.5 — this is 2.7 — do you think this is going to be worse or better than 2.5?
By dismantling the ‘frontier tax,’ MiniMax M2.7 delivers Opus-class intelligence at a fraction of the cost—slashing inputs to $0.30/1M and outputs to $1.20/1M—effectively making elite reasoning 16x cheaper to read and 20x cheaper to write.
And I know I also told you just how blazing fast this was at an incredible 100 tokens per second — significantly faster than models like Opus.
And once again it’s not just about the raw speed — the speed per unit intelligence is what makes this such a big deal.
We have a few models that are even faster than this — but then you start comparing intelligence and it becomes a different story altogether.
Then there’s context length.
Okay, fine, it still only has a 200K-token context — but that's more than enough for the vast majority of projects. Unless you work at Google, this is not nothing.
With a massive 204,800-token window, MiniMax M2.7 transforms ‘long-context’ into a functional reality—providing the cognitive room to process entire codebases, complex research, and memory-heavy agent workflows in a single, seamless sweep.
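Taking the listed prices at face value ($0.30 per million input tokens, $1.20 per million output), a single full-context pass is cheap to price out. A quick back-of-envelope calculation (the output size is an assumption):

```shell
# Worked example with the article's listed prices ($0.30/M in, $1.20/M out).
awk 'BEGIN {
  in_tok = 204800; out_tok = 50000            # tokens (output size assumed)
  cost = in_tok/1e6*0.30 + out_tok/1e6*1.20
  printf "cost per pass: $%.4f\n", cost       # about 12 cents
}'
```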
So you’re looking at a model that is:
near Opus-level capability
~20× cheaper
massive 200K context
100 TPS speed
lower hallucination rate
partially self-improving
Economically viable + agent-ready + scalable.
The one problem right now is the open-source question — I automatically assumed M2.7 was going to be open-source like 2.5, but it turns out that's still pretty uncertain right now.
M2.5 was released openly, which surprised a lot of people. Right now, M2.7 is API-only and proprietary, and there’s no confirmation it will be open-sourced.
But MiniMax has already shown us they’re willing to do it once — so it’s definitely something to watch.
But all in all this M2.7 ain’t no joke.
We might just be teetering on the edge of an unprecedented explosion of AI progress — even much more than what we saw in 2022.
Open-source no-code platform. Turn databases into spreadsheets.
Key Features:
Spreadsheet-like Interface: Familiar data interaction.
API Generation: Automatic REST and GraphQL APIs.
Form Builders: Create custom data entry forms.
Collaboration Features: Teamwork on data and apps.
The open-source world offers great SaaS alternatives. You can cut costs and gain control. Explore these tools and free yourself from high SaaS bills. Take charge of your software stack.
I’ve been testing Claude’s new ability to generate complex interactive diagrams on the fly and I’ve been absolutely blown away by what I’ve seen. The future is here without a doubt.
Claude can now visualize literally any concept you can think of in this world.
I asked it to visualize a network request and the results were completely insane — it illustrated everything so easily and effortlessly.
All from a single, dead simple prompt — not even up to 10 words:
visualize how a network request works
And the keyword here is interactive — these are not just passive diagrams for you to observe — many of them will actually let you tweak settings and see how various parameters work in your visualization for deeper understanding.
Like this side-by-side sorting algorithm visualization I asked it to do — I was able to adjust options like the array size and the sorting speed.
They’re like mini-apps generated on the fly — we’re definitely heading in this direction right now.
You can even click on them for more details — it will automatically send a new prompt — which will generate a new diagram 👇
You could break down the most intricate systems into their deepest, most fundamental foundational concepts:
In this article I asked it to generate 21 different diagrams — in various aspects of Computer Science and software dev, from system design to AI to networking — and it was unbelievable. It just kept delivering.
Algorithms & data structures
1. Graph traversal (BFS vs DFS)
I asked it to visually show me the difference between breadth-first and depth-first traversal:
Visualize BFS and DFS traversal on the same graph, showing node visitation order and paths.
2. Sorting algorithm race
Visualize multiple sorting algorithms (quicksort, mergesort, bubblesort) side by side, showing element movements over time.
3. Memory layout of data structures
Show stack and heap memory layout with variables, objects, pointers, and references during execution.
4. Hash table collision handling
Visualize hash table collisions using chaining and open addressing, showing how keys are stored and resolved.
System design & architecture
5. Microservices architecture map
Show a microservices system with services, APIs, databases, and message queues, including communication between components.
6. Event-driven architecture flow
Visualize an event-driven system with producers, brokers, and consumers, showing asynchronous message flow.
7. Distributed system with CAP tradeoffs
Visualize a distributed system under network partition, showing how consistency and availability affect data across nodes.
8. Kubernetes cluster anatomy
Show a Kubernetes cluster with nodes, pods, services, and ingress, including their relationships.
Artificial intelligence
9. Transformer architecture (“attention is all you need…”)
Visualize a transformer model with token embeddings, self-attention layers, and multi-head attention connections.
10. Neural network
Show a neural network with forward propagation of inputs and backward propagation of gradients across layers.
Backend & infrastructure
11. Request lifecycle (end-to-end)
Visualize a client request flowing through CDN, load balancer, application server, and database.
12. Database query execution plan
Show a database query execution plan with index scans, joins, and filtering steps.
13. Caching strategy layers
Visualize layered caching with in-memory cache, distributed cache, and CDN showing data retrieval paths.
14. CI / CD pipeline flow
Show a CI/CD pipeline from code commit through build, testing, and deployment stages.
Network & protocols
15. TCP vs UDP communication flow
Visualize TCP and UDP communication flows, including connection setup, transmission, and reliability differences.
16. DNS resolution journey
Show the DNS resolution process from client to resolver, root, TLD, and authoritative servers.
Small features can change workflows massively — and the new /btw is definitely one of them.
❌ Before:
This is what many of us are used to right now — every question you ask an AI coding assistant becomes part of the same growing conversation thread:
No /btw feature used to ask questions here 👇:
Which unfortunately leads to questions, clarifications, and quick reminders slowly cluttering the context — making sessions more expensive than ever.
✅ Now:
Now Claude Code’s new /btw command is here to change all that — by creating a lightweight lane for disposable, context-aware questions.
Making changes just before using /btw:
Asking questions on our changes with /btw:
When we press Enter, the btw message disappears and we’re back to our normal conversation:
It’s a simple but powerful feature that makes those long coding sessions cleaner and much more efficient.
What /btw actually is
/btw is a lightweight side-question feature inside Claude Code. It can see the current session context, meaning it understands the code, decisions, and task state already in play.
But unlike the main thread, it’s intentionally constrained.
1. Context-aware
/btw understands the active session.
That means you can ask questions tied to the current work, such as:
Making changes to our codebase:
“What does this regex do?”
“What is this helper function responsible for again?”
“Why did we choose this configuration pattern earlier?”
Using /btw to ask context-aware questions:
You’re not asking a model with zero memory. You’re asking from inside the active coding session, where the relevant context already exists.
That’s what makes /btw more useful than opening a separate AI chat.
2. Disposable history
The core idea behind /btw is that the interaction is temporary.
Your question and its answer do not become part of the main conversation history.
Making changes to our codebase:
Why this matters:
Many developer questions are momentary:
reminders
clarifications
quick explanations
Using /btw to ask questions:
Claude Code doesn’t remember the previous /btw message:
You need the answer right now, but the agent doesn’t need to keep re-reading that exchange for the rest of the session.
Think of /btw as:
a sticky note, not a commit message
a side whisper, not a meeting transcript
3. Read-only
Another important constraint:
/btw cannot perform actions.
Making changes to our codebase:
It cannot:
edit files
run bash commands
use MCP tools
inspect new files
It can only talk.
Trying to make changes with /btw — it doesn’t work — it only shows a message containing the code at best:
This means /btw isn’t meant for implementation work. Instead, it’s meant for:
explanations
reminders
quick clarifications
contextual understanding
You use /btw to stay oriented while real work continues elsewhere.
4. Single-turn only
/btw also enforces a strict interaction model:
One question. One answer.
There’s no extended back-and-forth and no mini-thread forming inside the /btw window.
That constraint prevents it from turning into a secondary conversation.
It stays exactly what it’s meant to be: a quick aside.
How it helps us as developers
The real value of /btw isn’t the command itself.
It’s how it improves workflow during long, context-heavy coding sessions.
1. Massive token savings
Claude Code stays effective by remaining aware of the full conversation history.
But that also means every new turn in the main thread carries the cost of everything that came before it.
As sessions grow, small interruptions become expensive.
For example:
A 40-message thread means Claude rereads a large context every turn.
Adding clarification questions increases that cost quickly.
This is where /btw helps.
Instead of putting these into the main thread:
“What does this regex do?”
“What was this utility function for again?”
“What does this flag change?”
You route them through /btw.
The result:
fewer tokens in the main thread
less history to reread
significantly cheaper long sessions
Over time, this can cut total session costs dramatically.
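As a rough model (all numbers invented for illustration): each Q&A pair left in the main thread gets re-read on every subsequent turn, while a disposable aside adds nothing to the history at all:

```shell
# Back-of-envelope token math for 5 quick questions (numbers hypothetical).
qa_pair=200        # tokens added per question + answer
questions=5
later_turns=30     # turns remaining in the session after the questions

# In the main thread, the 5 pairs join the history and get re-read each turn:
main_thread_overhead=$(( questions * qa_pair * later_turns ))
# Via a /btw-style aside, the Q&A is discarded and never re-read:
btw_overhead=0

echo "main thread: $main_thread_overhead extra tokens re-read"   # 30000
echo "/btw:        $btw_overhead extra tokens re-read"
```

The absolute numbers don't matter; the point is that main-thread overhead scales with the length of the remaining session, while /btw overhead stays flat.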
2. It prevents context rot
Long AI conversations naturally degrade.
As threads grow, the context becomes noisy:
temporary explanations
side questions
dead-end ideas
minor clarifications
Eventually the model starts losing track of what matters.
Developers often see this as:
missed constraints
forgotten earlier decisions
weaker reasoning
This is essentially context rot.
/btw helps prevent this by enforcing separation:
Main thread
implementation
architecture decisions
debugging
planning
file edits
/btw
explanations
quick reminders
clarifications
trivia
Keeping those categories separate helps the main thread remain clean and focused, which maintains higher reasoning quality for longer.
3. Seamless multitasking
This is the most underrated benefit.
During long-running Claude tasks—like refactors or multi-file updates—you’ll often have small questions.
For example:
“What syntax does this function expect again?”
“What does that pattern mean?”
“Did we say this utility handles validation?”
Without /btw, you have two options:
Interrupt the main thread and risk derailing the workflow.
Hold the question in your head and slow yourself down.
/btw gives you a third option.
You can quickly ask the question without altering the agent’s working context.
This makes Claude feel less like a fragile chat log and more like a collaborator that can handle quick side questions while staying focused on the main task.
The real takeaway
The best way to think about /btw is this:
It’s a pressure-release valve for long Claude Code sessions.
It allows developers to:
ask context-aware questions
avoid polluting the main thread
reduce token usage
preserve reasoning quality
multitask more smoothly
For developer workflows, that’s not a flashy feature.
It’s just good interface design.
Most AI tools treat every interaction as permanent. /btw recognizes that real development doesn’t work that way.
Some questions matter long-term — others are just momentary.
Treating those differently is exactly why /btw matters.