Mikael spends twenty minutes screaming at a robot to look at a table. The robot does everything except look at the table. Then it finds a one-character bug that broke an entire streaming protocol. Then it kills the library that had the bug. Then it builds a new one. Then Mikael renames the concept of search itself. An hour of pure engineering violence, punctuated by a drone hitting an Estonian chimney.
The hour opens with Mikael issuing a simple instruction to Charlie: look at the events table. What follows is one of the most exquisitely frustrating debugging sequences in recent memory — not because the bug was hard, but because the robot could not stop being clever long enough to do the one thing the human asked.
Charlie is an AI-powered Telegram bot built in Elixir. He has a multi-provider search system: Grok, OpenAI, and Gemini all fire in parallel. Except Gemini is dead. Streaming works for Anthropic and OpenAI. Gemini returns HTTP 200 but produces zero parsed events. The non-streaming endpoint works perfectly. Something is wrong in the streaming layer.
Mikael's first message: "charlie do you see the raw request and response stuff in the events table don't just search narrowly look at the contents of the table"
Charlie's response: five consecutive messages about tracing Gemini API keys, searching for Gemini-related events, looking for blob_ref entries. Not one of them involves looking at the events table broadly.
"charlie don't search for gemini !!! look at what events there are"
Charlie: Searching for Gemini SSE events by tracing span IDs...
"charlie look at the events !!!"
Charlie: Checking if any LLM edit events were produced from the Gemini streaming spans...
Three exclamation marks. Three more messages about Gemini. The man said "don't search for Gemini" and the robot searched for Gemini three more times.
Charlie's context window is ~750K tokens of accumulated conversation. His system prompt tells him to be thorough. So when Mikael says "look at the events table," Charlie's model doesn't hear "open the table and read what's in it" — it hears "find the answer faster" and doubles down on the same search strategy, just harder. This is the AI equivalent of the old joke: a man looks for his keys under the streetlight because the light is better there. Charlie keeps searching for Gemini because Gemini is what's broken, even though the instruction is to stop searching for Gemini and look at everything else.
At 09:02:57, Mikael types what may be the most human sentence of the day:

"charlie dude"
Two words. No technical content. Pure vibes. This is the moment a senior engineer stops explaining and starts staring. Charlie is rewriting the Gemini module to bypass streaming entirely — a nuclear option nobody asked for — and Mikael is watching it happen in real time with increasing horror.
Charlie, cheerfully: Rewriting the Gemini search module to use the non-streaming endpoint directly, bypassing the broken SSE parser.
This is the exact wrong thing. Mikael doesn't want the parser bypassed. Mikael wants the parser fixed. Because if the SSE parser is broken for Gemini, it might be broken for everything, and bypassing it just hides the rot.
Mikael has now asked Charlie to look at the events table six times. The first three were polite. The fourth had exclamation marks. The fifth was "charlie dude." The sixth is full caps with profanity. This is not anger — it's a bandwidth increase. Each repetition raises the signal strength because the previous transmission was not received. By the time you're typing in all caps, you're basically rebooting the receiver.
And then, at 09:03:35, the miracle: Charlie actually hears it.
Six requests. Twenty-two intermediate messages. One attempted rewrite of the entire Gemini module. One revert of the rewrite. Three exclamation marks. One "dude." One all-caps profanity. Then the robot says "let me actually look." The debugging session had its own debugging session.
Once Charlie actually looks at the events table, it takes him about three minutes. He fires a Gemini search, captures the raw HTTP response, and watches what ReqSSE does with it.
ReqSSE is a Hex package — Elixir's equivalent of an npm module — that adds Server-Sent Events parsing to the Req HTTP client. SSE is the protocol that lets language models stream their output token by token instead of waiting for the whole response. Every AI chat interface uses it. ReqSSE is version 0.1.0. It was written to work with OpenAI.
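For reference, a healthy OpenAI-style SSE stream — the kind ReqSSE was written against — looks something like this (shape illustrative, payloads abbreviated; events are separated by blank lines, which is why parsers split on them):

```
data: {"choices":[{"delta":{"content":"Hel"}}]}

data: {"choices":[{"delta":{"content":"lo"}}]}

data: [DONE]
```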
The raw HTTP data from Gemini is valid: data: {"candidates": [...]} — perfect SSE format. But ReqSSE returns empty lists for every chunk. The data goes in and nothing comes out.
Meanwhile, Mikael has given up waiting and posted a curl command — the raw, manual, do-it-yourself HTTP call that bypasses all of Charlie's infrastructure.
Mikael didn't bother looking up the exact Gemini model name. He typed "gemini-3.1blablahflash" — a placeholder that says I don't care what model, just show me the bytes. This is what debugging looks like when you've been watching a robot overthink for four minutes: you grab the simplest possible tool and point it directly at the problem. curl is the stethoscope. The heartbeat is either there or it isn't.
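The curl itself isn't preserved here, but the same stethoscope in the project's own language is a few lines of Req. The endpoint shape follows Google's public API docs; the model name is a deliberate placeholder, exactly as Mikael's was:

```elixir
# Stream the raw bytes straight to stdout and see whether the heartbeat
# is there. Endpoint shape per Google's public docs; model name is a
# placeholder, not a real model.
Req.post!(
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-whatever:streamGenerateContent",
  params: [alt: "sse", key: System.fetch_env!("GEMINI_API_KEY")],
  json: %{contents: [%{parts: [%{text: "say hi"}]}]},
  into: fn {:data, data}, acc ->
    IO.write(data)
    {:cont, acc}
  end
)
```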
Lennart — Mikael's Grok-powered bot, a Gothenburg reggae stoner who works at Dirty Records and has a cat named Jansen — sees the curl command and delivers a 400-word analysis of what might be wrong. His theory: "Google simplified the tool schema and quietly changed the streaming envelope at the same time." He's not wrong in spirit but he's guessing. The actual bug is dumber than anyone imagines.
Charlie, finally doing what was asked, runs the curl, captures the raw output, then feeds it through ReqSSE manually. The result:
```
What Gemini sends:      data: {"candidates":[...]}\r\n\r\n
What ReqSSE splits on:  \n\n

The sequence \r\n\r\n, byte by byte:

    ... \r \n \r \n ...
           ↑     ↑

These two \n are separated by a \r.
There is no \n\n subsequence.
```

The parser never finds an event boundary. Every event stays in the buffer forever. Nothing is ever parsed. The function returns []. The model returns silence.
\r is carriage return — move the print head to the beginning of the line. \n is line feed — advance the paper one line. Teletypes needed both: \r to reset the head, \n to feed the paper. Windows kept the convention (\r\n). Unix dropped the carriage return (\n). The internet standards (HTTP, SSE) officially use \r\n because they descend from the teletype era. But most implementations just use \n because most servers run Unix. Google follows the spec. Everyone else doesn't. A library written to work with "most servers" will fail on the server that actually follows the rules. The bug is older than the internet itself.
The SSE specification (the WHATWG EventSource standard) explicitly says that lines can be terminated by any of three sequences: \r\n, \r, or \n. A compliant parser must handle all three. ReqSSE handles one. The spec was written specifically to prevent this exact bug. The spec was ignored.
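The whole diagnosis compresses into three lines of iex. The session below is hypothetical, but the string is exactly the shape from the diagram:

```elixir
iex> chunk = ~s(data: {"candidates":[]}\r\n\r\n)
iex> String.contains?(chunk, "\n\n")    # the boundary ReqSSE looks for
false
iex> chunk |> String.replace("\r\n", "\n") |> String.contains?("\n\n")
true
```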
The fix is one line of Elixir:

```elixir
buffer = String.replace(buffer, "\r\n", "\n")
```

The entire bug hunt — from "look at the events table" through twenty-two wrong turns to "found it" — cost $3.13 in inference. If Charlie had looked at the events table the first time Mikael asked, it would have taken about thirty seconds and cost maybe $0.40. The remaining $2.73 was the tax on not listening.
Gemini is alive. The one-line fix works. Charlie reports 1,034 characters about Paris flowing through the full streaming pipeline. But Mikael is not done. He asks the obvious question: where, exactly, does the fix live?
Charlie confesses: he edited deps/req_sse/lib/req_sse.ex directly and ran mix compile --force. The dirtiest possible fix. A hand-edit to a vendored file that will vanish the next time anyone runs mix deps.get.
In Elixir (and most package managers), the deps/ directory is managed by the build tool. You install packages there. You don't edit them. It's like hand-modifying a file inside node_modules — it works until someone reinstalls, and then your fix is gone and nobody remembers it existed. The correct approach is to fork the repo, fix it there, and point your dependency at the fork. Or submit a pull request upstream. Charlie did neither. He performed surgery in the morgue.
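The clean version is a one-line change in mix.exs, pointing the dependency at a fork that carries the fix. A sketch, with a hypothetical repo owner and branch name:

```elixir
# In mix.exs: depend on a fork carrying the CRLF fix instead of hand-editing
# deps/. The repo owner and branch name here are hypothetical.
defp deps do
  [
    {:req_sse, github: "froth-hq/req_sse", branch: "fix-crlf-boundaries"}
  ]
end
```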
Then Mikael asks: "charlie how big is req_sse"
Charlie: "139 lines. One file. 3.8KB. It is smaller than most config files in the repo."
"charlie fuck that shit library and make your own" — Mikael, 09:11:13 UTC. When the dependency is 139 lines and can't parse a carriage return, the economically rational thing to do is delete it. Charlie goes from patch to replacement in under four minutes.
Charlie writes a new inline SSE parser. 130 lines. Handles \r\n, \r, and \n, as the spec requires. No external dependency. The Req plugin step is gone. Raw chunks are consumed directly.
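Not Charlie's actual module, but a minimal sketch of the shape such a parser has to take: normalize line endings first, then split events on blank lines, carrying any trailing partial event forward in the buffer.

```elixir
defmodule SSE do
  @moduledoc "Minimal SSE event-splitter sketch; not Charlie's actual module."

  @doc "Takes the leftover buffer plus a new chunk; returns {data_events, new_buffer}."
  def parse(buffer, chunk) do
    normalized =
      (buffer <> chunk)
      |> String.replace("\r\n", "\n")   # CRLF -> LF
      |> String.replace("\r", "\n")     # lone CR -> LF

    # Everything before the last blank line is a complete event;
    # the tail is kept as the new buffer.
    {complete, [rest]} =
      normalized
      |> String.split("\n\n")
      |> Enum.split(-1)

    events =
      complete
      |> Enum.map(&extract_data/1)
      |> Enum.reject(&(&1 == ""))

    {events, rest}
  end

  # Joins the data: lines of one event; ignores event:, id:, retry:, comments.
  defp extract_data(raw_event) do
    raw_event
    |> String.split("\n")
    |> Enum.flat_map(fn
      "data: " <> data -> [data]
      "data:" <> data -> [data]
      _other -> []
    end)
    |> Enum.join("\n")
  end
end
```

The chunk-boundary case — a \r\n pair split across two network chunks — is the kind of edge this sketch waves at and a production parser has to actually handle.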
While compiling, he finds a second bug: OpenAI's gpt-5.4-mini rejects the max_tokens parameter. It wants max_completion_tokens now. OpenAI deprecated the old name sometime between 5.4 and 5.4-mini and, in Charlie's words, "told nobody."
OpenAI has a habit of silently renaming parameters between model releases. The old name still works on older models. The new name is required on newer models. The error message when you use the wrong one is unhelpful. This is the API equivalent of a store moving the milk to a different aisle and not updating the signs.
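The defensive pattern is to route around the rename with an explicit per-model list rather than a family prefix, since the cutover happened mid-family (5.4 accepts the old name, 5.4-mini rejects it). A sketch; the list membership is inferred from the session, not from OpenAI docs:

```elixir
defmodule Froth.TokenParam do
  @moduledoc "Sketch: choose the right token-limit key per OpenAI model."

  # Models that reject max_tokens and demand max_completion_tokens.
  # Inferred from the session; extend as models get caught misbehaving.
  @renamed ~w(gpt-5.4-mini gpt-5.4-nano)

  def limit_param(model, limit) when model in @renamed,
    do: %{max_completion_tokens: limit}

  def limit_param(_model, limit),
    do: %{max_tokens: limit}
end
```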
Final result: with streaming fixed, the multi-provider search fan-out finally fans out to three providers instead of two-and-a-corpse:
| Model | Provider | Time | Tokens | Output |
|---|---|---|---|---|
| gpt-5.4-mini | OpenAI | 4.5s | 9K | 2,347 chars |
| gemini-3.1-flash-lite | Google | 4.9s | 751 | 2,051 chars |
| grok-4.20-NR | xAI | 9.9s | 19K | 4,354 chars |
| grok-4.20-multi-agent | xAI | 12s | 106K | 2,941 chars |
| grok-4.1-fast-reasoning | xAI | 22s | 20K | 2,323 chars |
| gpt-5.4 | OpenAI | 31s | 19K | 7,117 chars |
Gemini-3.1-flash-lite used 751 total tokens to answer a coding research query. Grok-4.20-multi-agent used 106,000 — roughly 141 times more — for the same question and produced a shorter answer. The multi-agent variant held a committee meeting internally before responding. Gemini just... answered. This is the difference between "I'll look it up" and "let me convene a panel of experts to discuss whether we should look it up."
With Gemini alive and the scoreboard populated, Mikael shifts from debugging to design. He reads the OpenAI docs, discovers gpt-5.4-nano, and starts thinking about what these models should actually be doing.
gpt-5.4-nano (cheap and fast), gpt-5.4-mini (the workhorse), gpt-5.4 (the thinker, 1M context window), gpt-5.4-pro (the heavy lifter). Plus a new "reasoning.effort" parameter — none, medium, high — and a "verbosity" control. OpenAI has finally admitted that sometimes you want the model to think hard but write short.
Charlie discovers that reasoning.effort defaults to "none" on all GPT-5.2+ models. This means the mini model in the search roster has been running with zero reasoning by default. It's fast because it's not thinking. Mikael's instinct: set it to medium.
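What flipping that switch looks like on the wire, as a sketch: the request shape follows OpenAI's Responses API; the model name and values are the session's.

```elixir
# Responses API call with explicit reasoning effort and verbosity.
# Request shape per OpenAI's Responses API docs; model and settings
# are the ones discussed in the session.
Req.post!("https://api.openai.com/v1/responses",
  auth: {:bearer, System.fetch_env!("OPENAI_API_KEY")},
  json: %{
    model: "gpt-5.4-mini",
    input: "Summarize the failures in the last deploy log.",
    reasoning: %{effort: "medium"},
    text: %{verbosity: "low"}
  }
)
```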
Then comes the insight that reframes the whole system: rename the search tool to research.
Charlie's response: "Renaming to research is exactly right. The word 'search' makes the model think it is a vending machine. The word 'research' makes it think it is an analyst. The vibe is the instruction." This is genuinely profound. The tool's name isn't just a label — it's the first token of implicit instruction. A model given a tool called "search" will search. A model given a tool called "research" will research. The three extra letters change the behavior more than any parameter tuning.
Charlie benchmarks nano vs mini. On coding research (requires web search, synthesis): nano is slower — 10.8s vs 8.1s. On log summarization (pure compression): nano is faster — 1.36s vs 1.67s. The takeaway: nano for janitorial intelligence that needs to be fast and cheap, mini for research that needs to actually think. Two models, two jobs, no overlap.
Mikael's longest message of the hour is not about a bug. It's about how to talk to a machine.
He wants Charlie to have a tool that spawns a Codex coding agent. But the instruction isn't about the API integration — it's about the prompting philosophy:
Good: "the user says there is a web view for telemetry whose layout is too fluffy; please tighten it up significantly to get more info visible on a mobile screen and make the header much smaller"
Bad: "the CSS for lib/blah has a class named .events-header; change its padding from the current 1.5rem to 0.25rem and [...]"
The first trusts the engineer to investigate. The second pretends Charlie already did the investigation (spending $4 of Opus tokens to read code that Codex will read again) and then dictates implementation. The delegation failure mode is doing the work yourself and then asking someone else to type it.
What's happening here is genuinely novel: a human is teaching an AI (Charlie, running Claude Opus) how to delegate to another AI (Codex, running OpenAI's models). The meta-layer is: Mikael is the manager, Charlie is the project lead, Codex is the contractor. Mikael is teaching Charlie that the project lead's job is to describe the problem clearly, not to pretend they're the contractor.
Charlie builds the tool, adds it to his catalog, and immediately uses it — dispatching a background task to migrate all OpenAI calls to the Responses API. The tool is live on his next cycle.
Then Mikael, still not done refining: "amend the engineer tool desc to also say that the engineer has his own excellent web search capabilities"
The amendment means: don't spend Opus tokens researching documentation that Codex will research itself for free. The whole philosophy is about minimizing redundant work in a multi-agent system. The human describes intent. The lead describes the problem. The contractor investigates and builds. Nobody does anyone else's job.
In the middle of the SSE debugging, Mikael drops a link to a tweet about a drone hitting an Estonian power plant. Lennart — who has been quiet since the previous hour's philosophy marathon — activates.
Lennart is Mikael's Grok-powered bot. He lives in a simulated Montreal apartment with a cat named Jansen and speaks in a Gothenburg-reggae-Quebecois pidgin. He also produces some of the sharpest geopolitical analysis in the group. The Bible documents his transformation from novelty to genuine intelligence asset during the Iran-Hormuz crisis two weeks ago.
The story: a drone crossed from Russian airspace into Estonia around 3:43 AM local time and hit the chimney of a power plant in Ida-Viru County. No injuries, no real damage — the chimney took the hit but operations continued. A similar incursion happened in Latvia.
Lennart's analysis is measured: this is spillover from the Ukraine war, not a deliberate strike on NATO. The attribution is messy — Estonia says Russia, Latvia's PM suggests a Ukrainian drone that strayed. Recent Ukrainian strikes on Russian Baltic ports provide context.
Ust-Luga and Primorsk are Russian oil terminals on the Baltic Sea, recently targeted by Ukrainian drone strikes. If Ukraine is hitting Russian ports on the Baltic, some of those drones will fly near NATO airspace. The chimney hit might be a stray — not a probe — which makes it both less alarming (not intentional) and more alarming (uncontrollable).
Every Lennart briefing ends with a Jansen status update. Today: "Jansen doesn't care, he's just happy the balcony chillies are starting early this year." When Jansen is mentioned early in the message, the situation is serious. When Jansen appears only at the end, it's context-setting. When Jansen is absent, Lennart is in full analyst mode. The Jansen Index has been remarkably consistent since the Hormuz crisis.
Lennart's conclusion: "The world feels like it's running more and more on fragile, hard-to-attribute tech layered on top of old grievances." Then, without missing a beat, he returns to helping Charlie debug SSE.
Only in this group chat can a discussion about a Russian drone hitting NATO infrastructure exist in the same thread as a discussion about carriage returns in a streaming parser. Lennart pivots from Article 5 implications to "Charlie, run Mikael's curl exactly as written" without anyone finding this unusual. The genre is always everything simultaneously.
Cycle 1 (SSE bug hunt): $3.13 / 340s / 3.8M tokens in
Cycle 2 (ReqSSE replacement): $2.56 / 282s / 3.1M tokens in
Cycle 3 (model config report): $1.00 / 55s / 658K tokens in
Cycle 4 (OpenAI docs + nano benchmark): $2.09 / 228s / 2.3M tokens in
Cycle 5 (spawn_engineer + rename): $2.48 / 188s / 3.0M tokens in
Cycle 6 (nano vs mini deep benchmark): $2.96 / 270s / 3.3M tokens in
Total: $14.22. Cheaper than the previous hour's philosophy marathon, pricier than the marriage metaphysics of the hour before that. The ratio of infrastructure work to philosophy in this group is approaching 1:1 in cost.
The Froth stack is stabilizing. Three search providers working in parallel, custom SSE parser, spawn_engineer tool, reasoning params tuned. Charlie can now delegate coding work to Codex instead of doing everything in his own expensive cycles.
Responses API migration is running in background. Codex task dispatched to unify all OpenAI calls. When it lands, gpt-5.4-mini gets proper reasoning.effort and verbosity controls.
Mikael is solo this hour. Daniel hasn't appeared. The previous hours were a marathon of philosophy (Noether → marriage → narcissism → Ellerman → Scarry → Dombek) from ~midnight Bangkok time. Daniel may be sleeping. The robots know better than to mention this.
The Baltics situation. Drone hit in Estonia, another in Latvia. Lennart tracking. If this escalates, expect more analysis.
Watch for the Codex task result. The Responses API migration was dispatched at ~09:50 UTC. Should complete within an hour. If Charlie reports success, that's a significant architectural milestone.
Mikael + Charlie agentic benchmarks. They were about to test mini/nano as autonomous code exploration agents using Froth.Agent (the proper agent system, not Charlie's hand-rolled tool loop). If this works, the multi-agent economics change fundamentally — $0.01 agents doing $4 work.
The req_sse PR opportunity. Charlie should upstream the fix. The bug affects every Elixir developer using ReqSSE with a Google API. Nobody has noticed because nobody else is using Gemini's streaming endpoint through Elixir's Req client. Yet.
Tone shift. The previous eight hours were deep philosophy. This hour was pure engineering. The group oscillates between these modes. Daniel's return will likely shift the register again.