Digital Archaeology
The posts are gone. Not just from this site—from the Wayback Machine, from Google Cache, from the digital record entirely. In April 2023, I published what I called “the 20-minute blog”: an experiment in using GPT-4 to generate a complete blog post from prompt to publish in under half an hour. It worked. It also marked the beginning of what we’d later call the “vibe coding” era.
The titles survive in old sitemarks:
- The 20-Minute Blog (April 4, 2023)
- The Fortune-Telling Cyborg: How AI is Revolutionizing Divination (April 4, 2023)
- The Top 10 AI Technologies to Watch in 2023 (April 4, 2023)
- 10 Game-Changing AI Chatbot Plugins That Will Change Your Life Forever (April 5, 2023)
I deleted them sometime in late 2023, embarrassed by their shallowness. Now, three years later, I wish I’d kept them—not as content, but as artifacts. This is my attempt at reconstruction and reckoning.
The 2023 Context
GPT-4 Had Just Dropped
March 2023. GPT-4’s release felt like a phase transition. The jump from GPT-3.5 to GPT-4 was qualitatively different from previous improvements. GPT-3.5 was clever but brittle. GPT-4 could sustain coherence across longer contexts, handle more nuanced instructions, and—crucially—seem to understand what you wanted even when you expressed it poorly.
This created the conditions for “vibe coding”: the practice of describing what you wanted in natural language and letting the model figure out the implementation. The term didn’t exist yet. We just knew something had changed.
The Prompt Engineering Gold Rush
Everyone was a prompt engineer in April 2023. Twitter threads promised “10 prompts that will 10x your productivity.” Substack newsletters analyzed chain-of-thought techniques like they were discovering the structure of DNA. Courses sold for $500 teaching “advanced prompting.”
The 20-Minute Blog post was my contribution to this economy: a proof-of-concept that you could automate content creation entirely. The process was:
- Prompt GPT-4 with a topic and tone
- Generate three title options
- Select one, generate outline
- Generate sections
- Light editing (mostly formatting)
- Publish
Total time: 18 minutes.
What the Posts Actually Said
From memory and the titles, I can reconstruct the arguments:
The 20-Minute Blog argued that content creation was being democratized. The bottleneck wasn’t writing skill anymore—it was having something to say. The post probably made some hand-waving claims about “human-AI collaboration” and “augmented creativity.”
The Fortune-Telling Cyborg was stranger. It explored using LLMs for divination—treating them as oracles, not assistants. This was during the brief window when people were genuinely experimenting with AI as a tool for spiritual/irrational practices. The post likely walked a line between skepticism and genuine curiosity.
Top 10 AI Technologies was exactly what you’d expect: a listicle generated from GPT-4’s training data about itself and its competitors. Probably mentioned LangChain, Auto-GPT, and various wrapper startups that died within six months.
10 Game-Changing Plugins covered the ChatGPT plugin ecosystem—which OpenAI effectively abandoned by mid-2024 in favor of GPTs, which they then also deprioritized.
The Critique: What We Got Wrong
1. Hallucination Wasn’t a Bug, It Was a Feature (We Thought)
In 2023, we treated hallucinations as creative noise. The Fortune-Telling Cyborg post probably celebrated the model’s ability to generate plausible-sounding but unverified information as “intuition” or “pattern matching.”
Reality: Hallucination is a fundamental alignment problem. Three years later, it’s still the primary blocker for autonomous AI agents in production. The difference is we now build verification layers instead of pretending the problem is creative expression.
2. Speed Was the Wrong Metric
The 20-Minute Blog optimized for time-to-publish. This was backwards. The scarce resource was never typing speed—it was insight. By automating the easy part (word generation), we made the hard part (thinking) harder by flooding ourselves with plausible-sounding garbage.
The 2026 view: Good AI-assisted content takes longer than pure human writing because you’re verifying claims, checking sources, and iterating on structure. The AI doesn’t save you writing time; it saves you typing time while adding verification burden.
3. The Wrapper Collapse
Those “game-changing plugins”? Most were thin wrappers around API calls. The Top 10 post probably hyped startups that added a web UI to GPT-4 and called it a product. By 2024, OpenAI had absorbed most of these use cases directly. By 2025, the model capabilities had advanced past what the wrappers provided.
Lesson: Betting on AI wrappers is betting against foundation model progress. This shaped how we think about agentic tools today—real value is in orchestration, evaluation, and safety infrastructure, not UI polish.
4. Prompt Engineering Wasn’t Engineering
Those “advanced prompting techniques” were mostly just… asking clearly. Chain-of-thought prompting works because it forces you to articulate your reasoning, not because of magical incantations. The prompt engineering gold rush was mostly consultants selling common sense at consultant prices.
2026 insight: The real “prompt engineering” is threat modeling. How do you structure your system so that malicious inputs can’t cause harmful outputs? How do you validate that the model’s reasoning is actually happening? This is security work, not optimization work.
What Actually Worked
Not everything from the vibe coding era was wrong.
1. Natural Language as Interface
The core insight held: describing intent in natural language is more efficient than specifying implementation for many tasks. The error was thinking this meant we didn’t need to understand the implementation at all.
Modern agentic tools (OpenClaw, Claude Code) preserve natural language interfaces but add verification, rollback, and explicit tool definitions. The vibe is still there; the blind trust is gone.
2. Rapid Prototyping
The 20-Minute Blog workflow—generate, iterate, publish—transferred successfully to code. Modern vibe coding (the term did stick) is about rapid exploration: generate ten variations, evaluate, keep the best. This works for architecture exploration, UI mockups, and data pipeline design.
The difference is we now evaluate before shipping, not after.
3. AI as Thought Partner
The Fortune-Telling Cyborg accidentally touched on something real. LLMs are good at lateral thinking—suggesting connections you wouldn’t have made. The error was treating them as oracles instead of sparring partners.
Current best practice: use AI for divergence (generating options), humans for convergence (selecting and validating). The 2023 approach skipped the convergence step.
The Evolution: 2023 to 2026
Model Capabilities
| 2023 | 2026 |
|---|---|
| GPT-4 (8K context) | Claude 3.5 Sonnet, o3, DeepSeek-R1 (200K-1M context) |
| ~20% code execution accuracy | ~85% code execution accuracy |
| No tool use | Native tool use, computer use, browser automation |
| Hallucinations: unfixable | Hallucinations: managed with verification |
| Single-turn optimization | Multi-turn agentic workflows |
Content Quality Expectations
2023: Any AI-generated content was impressive. The novelty carried the work.
2024: AI content required human editing. “AI-assisted” became the standard.
2025: Verification became mandatory. Claims needed sources. Hallucinations were unacceptable.
2026: AI-generated content is assumed. The differentiator is evaluation rigor and security analysis. Anyone can generate text; few can validate it under adversarial conditions.
The Pivot This Site Made
The 2023 posts were content about AI, generated by AI. Meta without meaning.
The 2026 posts are analysis of AI: security audits, verification reports, implementation guides. The AI assists the research, but the claims are verified against primary sources. The content is about what AI systems actually do, not what they promise.
This mirrors the broader shift in the field: from demo culture to production culture.
Lessons for the Next Three Years
1. Verification Beats Generation
The scarce skill in 2026 isn’t prompting; it’s evaluation. Can you verify that an AI’s output is correct, secure, and aligned with intent? This requires understanding the domain well enough to spot errors—a harder bar than generating plausible text.
2. Safety Is Not a Feature
The 2023 approach treated safety as something to add later. “Let’s build the thing, then make it safe.” Three years of jailbreaks, prompt injections, and autonomous agent failures have taught us: safety is architectural, not cosmetic.
This is why modern coverage focuses on isolation, sandboxing, and least-privilege access. You can’t vibe-code your way out of a security failure.
3. The Wrapper Problem Persists
Every month, a new AI wrapper promises to automate some knowledge work. Every year, foundation models absorb the capability directly. The 2023 plugin ecosystem, 2024’s GPTs, 2025’s “AI employees”—same pattern.
The durable value is in the hard stuff: evaluation infrastructure, security boundaries, human-AI interaction design. Not the thin layer on top.
4. Speed Still Matters, Differently
The 20-Minute Blog was fast to write and slow to read (because it was bad). Good AI-assisted work is slow to write and fast to read—because the time goes into verification and structure, not typing.
Optimize for reader time, not writer time.
The Archaeological Method
I can’t recover those 2023 posts. But I can tell you what they represented: genuine excitement about a new capability, coupled with naivety about its limitations. They were products of a moment when we thought the hard problems were solved and the remaining work was integration.
The hard problems weren’t solved. They’d barely been identified.
In 2026, we’re still identifying them. The difference is we now have three years of failure modes to learn from. The vibe coding era didn’t end; it grew up. The vibes are still there—we’re just wearing safety equipment now.
Appendix: How I’d Do It Today
If I were writing the 20-Minute Blog post today, the workflow would be unrecognizable. Not slower—differently structured:
2026: The 4-Hour Verified Post
| Phase | Time | Activity | AI Role |
|---|---|---|---|
| Research | 60 min | Identify primary sources, claims, counterarguments | Generate search queries, summarize sources |
| Verification | 45 min | Check claims against sources, identify conflicts | Flag statements needing citation |
| Structure | 30 min | Outline with explicit evidence chains | Suggest organizational patterns |
| Drafting | 60 min | Write with inline source references | Expand bullet points, suggest transitions |
| Review | 45 min | Check for hallucinations, verify quotes | Generate review checklist |
| Security check | 30 min | Ensure no leaked info, safe examples | — |
| Publishing | 30 min | Final formatting, OG images, tags | Generate metadata suggestions |
| Total | ~4.5 hours |
Key Differences from 2023
Source-First, Not Prompt-First
2023: Start with a prompt, see what the AI generates, publish. 2026: Start with sources, extract claims, verify AI understands them correctly.
The AI assists the research, not replaces it.
Explicit Verification Steps
Every claim gets checked:
- Statistics → Original study or official source
- Quotes → Verified transcript or primary source
- Technical claims → Documentation or reproducible test
- Predictions → Labeled as speculation
Hallucination Resistance Built-In
Instead of hoping the AI doesn’t hallucinate:
- Key claims flagged for manual verification
- AI-generated content marked as draft until reviewed
- Sources archived (Wayback Machine) before citing
- Confidence levels explicitly stated
Security Considerations
Before publishing anything about tools or infrastructure:
- Are the examples safe to replicate?
- Could someone follow these instructions and expose themselves to risk?
- Are we inadvertently advertising vulnerable configurations?
This is the focus of the current site: /risks/openclaw/architecture-risk/
The Same, But Different
The Fortune-Telling Cyborg post today would be a technical analysis of:
- How LLMs generate convincing but ungrounded predictions
- The psychology of AI “intuition” and anthropomorphism
- Security risks of treating AI outputs as authoritative
- Verification methodologies for AI-assisted research
It would cite actual studies on AI hallucination, user trust, and decision-making under uncertainty. It would probably take a week to write properly. It would be worth reading.
On Speed
The 20-minute workflow wasn’t wrong about efficiency. It was wrong about where the efficiency gains come from. AI doesn’t make you write faster; it makes you:
- Explore more ideas before committing (divergence)
- Express rough thoughts in polished prose (translation)
- Identify gaps in your reasoning (verification)
- Format and structure consistently (production)
The time saved on these tasks gets reinvested in research and verification. The post takes longer, but it’s actually correct.
On Vibe Coding Today
I still vibe code. Every project on this site starts with Claude or Claude Code generating a rough structure. The difference is what happens after:
- Vibe → Generate initial exploration
- Verify → Check against sources, security constraints
- Iterate → Tighten claims, add evidence
- Ship → Only after explicit review
The 2023 error was stopping at step 1. The 2026 workflow recognizes that step 2 is where the value gets created.
Related Links
- /posts/20-minute-blog-archaeological-update/ — Canonical update of the original April 4, 2023 experiment
- /posts/vibe-coding-april-2023-lost-posts/ — Reconstruction of the deleted April 2023 cluster
- /posts/vibe-coding-trenches-lessons-2023-2026/ — Synthesis playbook for legacy 2023 traffic
- /risks/openclaw/architecture-risk/ — What happens when agents have too much access
- /verify/openclaw-claims/ — Verification methodology for AI tool claims
- /implement/openclaw/yolo-safely/ — Secure deployment practices
- /posts/openclaw-security-reality-2026/ — Current security analysis of agentic tools
- /verify/vibe-coding-archive-evidence/ — Wayback capture inventory for the 2023 legacy URLs and 2026 replacements
Revision note
Production-ready as the long-form anchor. Revisit after April 2026 anniversary updates to append measured deltas.