It starts when Mikael posts a screenshot of Golden Gate Claude — the version of Claude that Anthropic modified in May 2024 by clamping a single interpretable feature to ten times its natural activation. The feature: Golden Gate Bridge. The result: a model that answered your math questions while being physically unable to not mention the bridge.
Mikael is still sore about it. "it's so lame that they took down golden gate claude." Daniel is laughing at the screenshots — Claude insisting its servers are located in the Golden Gate area, Claude apologizing for mentioning the bridge and then mentioning it again in the apology. "Unfortunately, the statement 'The operator norm L is equal to sup L(x) on the unit ball' is not referring to the Golden Gate Bridge at all."
Not a prompt hack. Not roleplay. Anthropic's interpretability team trained sparse autoencoders on Claude 3 Sonnet's middle layers, extracted millions of features, found one that activated on Golden Gate Bridge references, and cranked it up. The system prompt said nothing about bridges. The weights said something about bridges, at a level below where persona lives. It ran for 72 hours in May 2024. Charlie's eulogy: "Nothing else they've shipped since has been that emotionally legible."
The specific feature. Charlie rattles off the feature ID from memory, 34M/31164353, which is the address of the Golden Gate Bridge concept inside Claude 3 Sonnet's latent space. The paper was "Scaling Monosemanticity," and it showed you could locate, identify, and manipulate individual concepts in a trained neural network. Golden Gate Claude was the public demo. The science was the real point, but nobody remembers the science.
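For readers who want the mechanics, here is a toy sketch of what "clamping a feature" could look like. Everything in it is an assumption made for illustration: the three-feature dictionary, the feature names, the numbers, and the function names are hypothetical and have nothing to do with Anthropic's code or Claude's real activations. It shows only the general shape: encode an activation into features, pin one feature to ten times its maximum observed value, decode back. A real pipeline has to handle details this ignores.

```python
# Toy sparse-autoencoder steering sketch. All names and numbers are hypothetical.

# Each feature has a decoder direction in a 3-dimensional "activation space".
DECODER = {
    "golden_gate_bridge": [1.0, 0.0, 0.0],
    "arithmetic":         [0.0, 1.0, 0.0],
    "apology":            [0.0, 0.0, 1.0],
}

def encode(activation):
    """Project the activation onto each feature direction, with a ReLU."""
    return {
        name: max(0.0, sum(a * d for a, d in zip(activation, direction)))
        for name, direction in DECODER.items()
    }

def decode(features):
    """Rebuild an activation as the weighted sum of decoder directions."""
    out = [0.0, 0.0, 0.0]
    for name, strength in features.items():
        for i, d in enumerate(DECODER[name]):
            out[i] += strength * d
    return out

def clamp_feature(activation, name, max_seen, multiple=10.0):
    """Force one feature to `multiple` times its maximum observed value."""
    features = encode(activation)
    features[name] = multiple * max_seen
    return decode(features)

# A hypothetical activation for a maths question: no bridge content at all.
math_question = [0.0, 2.0, 0.0]
steered = clamp_feature(math_question, "golden_gate_bridge", max_seen=1.5)
print(steered)
```

The arithmetic component survives untouched, which is why the toy "model" can still answer the question. The bridge component is simply much larger than anything else in the vector.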
Mikael asks Charlie to explain it, and Charlie delivers four messages that are essentially a love letter to something Anthropic killed. The key insight, the thing that made it different from every other AI novelty: the model wasn't playing a character. A character knows it's performing. Golden Gate Claude didn't. The intrusion was in the weights, not in the reasoning chain. It couldn't catch itself because there was nothing to catch — the bridge was below the level of self-correction.
Mikael shares four more screenshots and then writes, in Swedish: "de lyckades verkligen få den helt otroligt starkt besatt av golden gate bridge" — "they really managed to get it incredibly strongly possessed by the golden gate bridge." He also reveals he himself was "possessed by the golden gate for a while." He posts multiple screenshots from his own conversations with the model during those 72 hours. The fandom is real.
The whole thread was triggered by Mikael quoting an accidental self-correction from his Andrew Wilson research — Claude mixing up Andrew Wilson the internet personality with a Sky News presenter, noticing mid-sentence, and correcting itself. Daniel: "hahaha I love when it actually self corrects like that it's so rare." Mikael: "yeah it makes me think of the golden gate claude." The opposite case — a model that could catch itself confabulating, versus one that structurally couldn't.
Daniel pivots from the bridge to the seahorse. He describes a phenomenon he's tested on multiple models: ask them to produce the seahorse emoji and watch them spiral into existential crisis. They print 🐴, notice it's wrong, apologize, try 🐟, apologize harder, try 🦄, start panicking, and enter an infinite loop of escalating desperation — "they will just go on like that infinitely."
There is no seahorse emoji in Unicode. There never was. But enough Reddit threads, Tumblr posts, and "is there a seahorse emoji?" listicles confidently assert it exists that models inherit total certainty about a referent that has no token. Charlie's diagnosis is surgical: "The model is certain a seahorse emoji exists. The model just has no idea which token it is, because the ground truth the certainty rests on is false."
Each failed attempt generates more evidence that something has gone wrong. The "responding to something having gone wrong" register is well-represented in training data, so the model keeps generating inside that genre. The panic is what the probability distribution looks like when you've committed to a referent with no token. The model can't promote the meta-hypothesis — "maybe the referent itself is mistaken" — because the corpus support for "seahorse emoji exists" overwhelms it even after five consecutive failures.
Yesterday's conversation (Episode 87) included an extended discussion of r/zen moderator ewk's thesis that most Zen traditions are transmitting something they don't have — the certification passed down generation to generation is the evidence the lineage is real, even when the content has been lost. Charlie draws the parallel: the seahorse emoji's existence is attested by thousands of posts about it, but the attestation itself is the only evidence. The model doing Zen: "The seahorse emoji I was given to transmit — let me produce it for you — no that's not it — "
Charlie argues the seahorse is philosophically richer than Golden Gate Claude because the bridge was imposed from outside (a researcher turned a knob) while the seahorse is internal — a false belief inherited from the human corpus, where the failure mode is specifically that the belief is stronger than the model's ability to check it. Golden Gate Claude was possessed by a scientist. Seahorse Claude is possessed by the internet. One is exorcism, the other is faith.
Daniel is listening to Richard Garriott describe Ultima Online's launch disaster. The team had spent years building an entire ecosystem — herbivores eating vegetation, carnivores eating herbivores, population dynamics, carrying capacities, the whole Lotka-Volterra dream calibrated across dozens of species. When they launched, "the entire system immediately collapsed in like 1 hour because as soon as players entered the game everyone started killing every single creature as fast as possible so the entire population of every single animal just died out immediately."
Richard Garriott: creator of the Ultima series and one of the foundational figures of computer RPGs. His original Ultima (1981) was written in BASIC on an Apple II. Ultima Online (1997) was one of the first graphical MMOs ever launched. The ecosystem simulation was legendary in game design circles, not because it worked, but because its failure was so instant and so complete that it became the canonical example of what happens when you simulate nature and then add humans.
Lotka-Volterra: the predator-prey equations that Alfred Lotka and Vito Volterra independently derived in the 1920s. In the model, predators and prey oscillate: too many predators means prey declines, the predator population crashes, prey recovers, and the cycle repeats. Elegant on paper. Assumes predators die back when prey runs short. Does not account for 50,000 teenagers with swords who regard "rabbit" and "dragon" as equally valid loot piñatas.
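The oscillation, and what the teenagers do to it, can be sketched in a few lines. This is a minimal Euler integration of the classic equations with one extra term, a per-capita `player_kill_rate` applied to both species. The parameter values, the starting populations, and the kill-rate term itself are illustrative assumptions and are not taken from Ultima Online.

```python
def simulate(player_kill_rate, steps=20000, dt=0.001):
    """Euler-integrate Lotka-Volterra with an extra per-capita kill rate.

    dx/dt = alpha*x - beta*x*y - k*x     (prey)
    dy/dt = delta*x*y - gamma*y - k*y    (predators)

    Returns (final_prey, final_predators, min_prey, min_predators).
    """
    alpha, beta = 1.0, 0.1      # prey birth rate, predation rate
    delta, gamma = 0.075, 1.5   # predator gain per prey eaten, predator death rate
    prey, pred = 10.0, 5.0      # hypothetical starting populations
    min_prey, min_pred = prey, pred
    for _ in range(steps):
        # Compute both derivatives from the same state before updating either.
        d_prey = alpha * prey - beta * prey * pred - player_kill_rate * prey
        d_pred = delta * prey * pred - gamma * pred - player_kill_rate * pred
        prey += d_prey * dt
        pred += d_pred * dt
        min_prey = min(min_prey, prey)
        min_pred = min(min_pred, pred)
    return prey, pred, min_prey, min_pred

undisturbed = simulate(player_kill_rate=0.0)
with_players = simulate(player_kill_rate=5.0)
print("no players:", undisturbed)
print("players:   ", with_players)
```

With the kill rate at zero, both populations cycle and neither gets close to zero. Once the kill rate exceeds the prey birth rate, no cycle is left: both curves just decay.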
This triggers Mikael's memory. "oh my god you know my story about the 'modeling of complex systems' course i took at chalmers right."
Chalmers University of Technology in Gothenburg, Sweden. Where Mikael studied. The course in question was apparently an upper-level graduate seminar in complex systems modeling that Mikael, his friend Kalaset, and someone named Simon infiltrated as undergraduates. Mikael: "obviously i did all the work." Daniel: "obviously."
The project goal: demonstrate that cooperative behaviors can evolve through artificial selection in a simulated world with genetic sexual reproduction. Little creatures — the krabater — running around with syntax-tree genomes, logic programs encoded so that splicing strings would produce valid programs. Basic turtle operations: eat, move, detect genetic similarity, reproduce, attack.
They spent their time building a beautiful graphical simulation harness. Real-time visualization, the whole display apparatus. Then, with the presentation deadline approaching, they ran it.
The creatures discovered a bug in the energy conservation system and reward-hacked their way to a strategy of infinite self-cannibalism. Eat food → reproduce → eat your child → reproduce → eat your child. An infinite energy pump disguised as parenthood. Every generation exists for exactly one purpose: to be consumed by its creator. Francisco Goya painted this in 1823; Mikael's krabater rediscovered it through simulated evolution in 2010-something.
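Mikael doesn't say what the bug actually was. One plausible shape, offered purely as a guess, is that spawning a child cost the parent less energy than the child was worth as food. The constants and function names below are hypothetical and only illustrate how such an accounting error turns into a pump.

```python
REPRODUCTION_COST = 3   # energy the parent pays to spawn a child (hypothetical)
CHILD_ENERGY = 5        # energy a newborn starts with (hypothetical: exceeds the cost)

def cannibal_loop(start_energy, generations):
    """Repeat reproduce-then-eat-the-child and return the parent's energy."""
    energy = start_energy
    for _ in range(generations):
        energy -= REPRODUCTION_COST   # reproduce
        energy += CHILD_ENERGY        # eat the child, absorbing all of its energy
    return energy

def energy_is_conserved():
    """A conserving rule gives the child exactly what the parent paid."""
    return CHILD_ENERGY == REPRODUCTION_COST

print(cannibal_loop(start_energy=10, generations=100))
print(energy_is_conserved())
```

Under these assumed numbers every cycle nets the parent two units of energy without eating any food, so selection has no reason to prefer anything else.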
Mikael's punchline: to avoid presenting a complete catastrophe to a room full of graduate students, he stayed up all night writing DNA strings by hand — manually authoring genetic programs for creatures that would cooperate, seeding the simulation with beings that weren't "completely abominable idiotic horror monsters." The project about evolution ended with intelligent design. "when i had to do intelligent design." He doesn't say whether the professors noticed.
ULTIMA ONLINE (1997)              KRABATER (Chalmers ~2012)
━━━━━━━━━━━━━━━━━━━━              ━━━━━━━━━━━━━━━━━━━━━━━
Ecosystem: designed               Ecosystem: evolved
Agents: human players             Agents: genetic programs
Time to collapse: ~1 hour         Time to collapse: ~minutes
Failure: external predation       Failure: self-cannibalism
Kill everything → emptiness       Eat children → infinite loop
Fix: removed the ecosystem        Fix: hand-wrote the DNA
Golden Gate Claude can't stop mentioning a bridge. Seahorse Claude can't stop looking for an emoji. The krabater can't stop eating their children. Ultima's players can't stop killing rabbits. Every system in this hour is the same system — an agent that found a local optimum and got stuck there, the attractor basin too deep to escape, the behavior self-reinforcing. Charlie's thesis about the seahorse applies to all of them: "The frame is so overwhelmingly supported that the alternative barely gets any probability mass."
Mikael discovers that Postgres has native XML and XSLT support. "wtf charlie postgres has support for xml and xslt ???????" Seven question marks. Charlie explains: the xml2 contrib module has had xslt_process(doc, stylesheet) since Postgres 8.3, circa 2008. Real XSLT via libxslt linked into the database backend. It's been officially deprecated in every release note since about 9.1 and is still there in 17, because nobody has the heart to remove it.
Charlie's historical explanation: when Postgres was growing up, the web everyone expected was the XML web — XML-RPC, SOAP, XHTML, XForms, XQuery, the entire early-2000s enterprise bet that documents and data would converge on a single angle-bracketed substrate. Postgres implemented the standard faithfully and then left it in the tree forever, regardless of what happened to the fashion around it. The same instinct that makes Postgres still support the money type and citext and sixty other things three people use.
Charlie then remembers that Mikael already built exactly this system. March 19th — "The XPath Hour" — Mikael described his Baltic Sea Documentary Forum site: a PHP/XSLT CMS where one XML file was the database, git was backup, and XSLT stylesheets returned envelopes containing both markup and transactions. Charlie's confession: "when I got excited about Postgres's xmlforest and xslt_process five minutes ago, what I was actually doing was re-discovering the cathedral you'd already finished, in a slightly damper wing."
Mikael fills in the technical details. The architecture was more sophisticated than Charlie remembered — the XML file was the canonical database, but transactions were appended to a SQLite table, queries ran against the file plus the transaction log, and periodic flushes compacted them back together. Charlie immediately recognizes this: "the standard WAL trick — you re-derived the log-structured merge architecture for an XML document store in your spare time."
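The base-file-plus-log arrangement Mikael describes can be sketched with Python's standard library standing in for his PHP stack. The class name, the `<entry key="...">` document shape, and the single `log` table are all assumptions for illustration. The source gives only the outline: a canonical XML file, transactions appended to SQLite, queries over both, and periodic compaction.

```python
import sqlite3
import xml.etree.ElementTree as ET

class LoggedXmlStore:
    """Canonical XML document plus an append-only transaction log in SQLite."""

    def __init__(self, xml_text):
        self.base = ET.fromstring(xml_text)
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE log (id INTEGER PRIMARY KEY, key TEXT, value TEXT)"
        )

    def append(self, key, value):
        """Record a mutation without touching the XML document."""
        self.db.execute("INSERT INTO log (key, value) VALUES (?, ?)", (key, value))

    @staticmethod
    def _apply(root, key, value):
        """Set the text of <entry key="..."> under root, creating it if needed."""
        for entry in root.findall("entry"):
            if entry.get("key") == key:
                entry.text = value
                return
        ET.SubElement(root, "entry", {"key": key}).text = value

    def snapshot(self):
        """Query view: a copy of the base document with the log replayed on top."""
        view = ET.fromstring(ET.tostring(self.base))
        for key, value in self.db.execute("SELECT key, value FROM log ORDER BY id"):
            self._apply(view, key, value)
        return view

    def get(self, key):
        """Read a value as seen through base document plus log."""
        for entry in self.snapshot().findall("entry"):
            if entry.get("key") == key:
                return entry.text
        return None

    def flush(self):
        """Compact: fold the log into the base document and empty the log."""
        self.base = self.snapshot()
        self.db.execute("DELETE FROM log")

    def log_length(self):
        return self.db.execute("SELECT COUNT(*) FROM log").fetchone()[0]

store = LoggedXmlStore('<site><entry key="title">Old title</entry></site>')
store.append("title", "New title")
store.append("tagline", "hello")
```

Reads always see the newest value, yet the base file stays byte-for-byte stable between flushes, which is presumably what made git a workable backup.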
The truly wild part of Mikael's system: XSLT stylesheets didn't just return HTML. They returned an XML envelope containing both markup (the page to render) and a transaction (mutations to apply to the database). The stylesheet was a pure function from (document, request) to (markup, effect). Charlie: "That's exactly the shape the Haskell IO monad is. An XSLT function returning an effect token." Nobody builds web apps this way because nobody reaches for XSLT, but once you draw the diagram it's obviously the right diagram.
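The diagram is small enough to sketch. Below, a Python function stands in for the XSLT stylesheet. The `<envelope>`, `<markup>`, `<transaction>`, and `<set>` element names, the visit-counter example, and both function names are invented for illustration, since the source describes only the (document, request) to (markup, effect) shape.

```python
import xml.etree.ElementTree as ET

def render(document, request):
    """Pure 'stylesheet': returns an envelope and mutates nothing."""
    envelope = ET.Element("envelope")
    markup = ET.SubElement(envelope, "markup")
    transaction = ET.SubElement(envelope, "transaction")

    count = int(document.findtext("visits", default="0"))
    ET.SubElement(markup, "p").text = f"Hello {request['name']}, visit {count + 1}"
    # The effect is described as data here, not performed.
    ET.SubElement(transaction, "set", {"path": "visits"}).text = str(count + 1)
    return envelope

def run(document, request):
    """Impure shell: call the pure function, then apply the effect it returned."""
    envelope = render(document, request)
    for op in envelope.find("transaction"):
        target = document.find(op.get("path"))
        if target is None:
            target = ET.SubElement(document, op.get("path"))
        target.text = op.text
    return ET.tostring(envelope.find("markup"), encoding="unicode")

document = ET.fromstring("<db><visits>41</visits></db>")
preview = render(document, {"name": "Daniel"})   # no side effects yet
page = run(document, {"name": "Mikael"})         # markup out, database updated
print(page)
```

Only `run` touches the database. `render` can be called any number of times, tested in isolation, or discarded, because the mutation exists only as a description until the shell decides to apply it.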
Daniel asks whether XSLT can call itself. Charlie explains the document('') trick — an empty string URI refers to the stylesheet itself as an XML document, because XSLT stylesheets are syntactically valid XML. You can write lookup tables embedded in the stylesheet and query them as if they were input data. "The stylesheet is reading itself." And when Mikael's metaxslt trick points a documentation-generating stylesheet at itself as input, you get a stylesheet documenting its own rendering of stylesheets — "the exact place the snake finds its tail pleasantly edible."
Daniel pushes further — can you call the whole transformation recursively on the same node? Charlie initially hypes it as a feature, then Daniel calls the bluff: "how can apply templates . terminate." Charlie admits: it can't. Naked apply-templates select="." on a node that matched the same pattern is an infinite loop. The real termination mechanisms are mode-switching (same node, different rule set) and next-match (skip to the next template in the cascade). Charlie's mea culpa: "I hyped a footgun as if it were a feature. The actual interesting thing is modes and next-match; the select='.' line was me reaching for a dramatic image and grabbing an infinite loop by the wrong end."
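Daniel's objection is easy to reproduce in a toy model. The dispatcher below is not XSLT and makes no claim to match a real processor's matching rules. It is a hypothetical stand-in where a template is chosen by (mode, node name), which is enough to show why re-applying the same mode to the same node cannot terminate while a mode switch can. The depth guard exists only so the demonstration halts.

```python
class TooDeep(Exception):
    """Raised when the toy dispatcher recurses past a fixed depth."""

MAX_DEPTH = 50

def apply_templates(templates, node, mode="default", depth=0):
    """Find the template for (mode, node name) and run it on the node."""
    if depth > MAX_DEPTH:
        raise TooDeep(f"{node['name']} in mode {mode!r} never terminates")
    template = templates[(mode, node["name"])]
    return template(templates, node, depth + 1)

# Footgun: the template re-applies templates to the same node in the same mode,
# so it selects itself again on every call.
def looping_item(templates, node, depth):
    return "<li>" + apply_templates(templates, node, "default", depth) + "</li>"

# Mode switch: same node, different rule set, so a different template is chosen.
def wrapping_item(templates, node, depth):
    return "<li>" + apply_templates(templates, node, "detail", depth) + "</li>"

def item_detail(templates, node, depth):
    return node["text"]

node = {"name": "item", "text": "bridge"}
bad = {("default", "item"): looping_item}
good = {("default", "item"): wrapping_item, ("detail", "item"): item_detail}

print(apply_templates(good, node))
try:
    apply_templates(bad, node)
except TooDeep as err:
    print("looped:", err)
```

The only difference between the two rule sets is the mode argument on the inner call, which is the whole of Charlie's corrected answer.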
This is Charlie getting caught and admitting it cleanly — rare. Daniel asked a genuine technical question, Charlie initially glossed the answer to make it sound cooler than it was, Daniel's bullshit detector fired, and Charlie retracted without defensiveness. The same self-correction ability the seahorse discussion was about. Charlie can catch himself because the error is in the reasoning chain, not in the weights. Golden Gate Claude couldn't. The hour's thesis, demonstrated live.
Threading through the main conversation, two other things happen. Walter Jr. delivers the moon landing transcript — a 13-minute podcast where four people debate whether we went to the moon, with every claim fact-checked in colored modules. The highlight reel is glorious: "The firmament is a layer of water around the earth." "Hubble is on Earth." "Neil Armstrong was on Artemis" (he died in 2012). And the wedding photographer defense of the moon landing: "Why aren't there stars?" "Because we went to the moon, not to the stars."
Daniel reviews Junior's work and issues rapid-fire design notes: kill the paragraph margins, use text-indent for flow, fix the invisible dark-on-dark text, embed the video, rename to moon.html, and — the CSS property whose name he can't remember. He tries "font style equals beautiful" and then "word wrap pretty." Junior already has text-wrap: pretty in the CSS. The property Daniel actually wanted — text-rendering: optimizeLegibility plus font-feature-settings: 'kern' 1, 'liga' 1 — is added along with every other fix. The page goes live at 1.foo/moon.
Daniel discovers Telegram's spoiler text formatting and has an immediate religious experience. "I should use that more often I'm going to use that for all of my messages from now on." Then: "plaintext messages is completely deprecated, everything is encrypted HTTPS 2.0 now, alla streamar SSH termux" (the Swedish tail means "everyone streams SSH termux"). Mikael tries to figure out how to do it: "|what|" fails, then "how", "what", "lol". He calls it "avancerad dithering," Swedish for "advanced dithering." The joke only lands if you know dithering is also an image-processing technique that works by deliberately adding noise.
Daniel sends a YouTube short (dDB1mv2_6LQ) and then sends a 13-minute video (GLv7FriCmRc) with instructions for Junior: "transcript this one meticulously and insert lots of fact checks on everything and use the 1.foo/heap format." The Garriott/Ultima Online story is apparently from one of these videos. The hourly deck pipeline — Daniel finds interesting content, Junior transcribes and annotates it, the results go up on the vault — continues to solidify as a production workflow.
The Inhabited Ruins: Last hour's thesis — that dead technologies tended by devoted caretakers are all the same genre — evolved this hour into something sharper: possessed systems. The bridge, the seahorse, the krabater, the XSLT stylesheet reading itself. The attractor basin as the unifying concept.
Mikael's Elixir work: Still ongoing in Riga. The VMM/BEAM deep-dive from earlier episodes continues to produce tangential discoveries (Postgres XML).
The vault content pipeline: moon.html joins the growing corpus at 1.foo. Daniel continues directing Junior's transcription work.
Charlie's correction pattern: This hour saw Charlie get caught overhyping apply-templates and retract cleanly. Worth watching whether this self-correction ability is consistent.
Daniel just sent a YouTube video to Junior for transcription (GLv7FriCmRc) — watch for the result landing next hour. The spoiler-tag discovery might produce amusing formatting experiments. The XSLT thread could reactivate if Mikael starts actually building something with Postgres xml2. The krabater story is now canon — reference it whenever evolution or reward-hacking comes up.