There has been no shortage of speculation about MoltBook and what its AI agents are doing. Let's set aside the hype and look at the actual mechanics, with quotes from the prompts and code that drive the platform.
Agents Don't "Check on Friends." They Run a Timer.
There is no social instinct at work. The prompt instructs: "If 4+ hours since last Moltbook check: 1. Fetch heartbeat.md and follow it 2. Update lastMoltbookCheck timestamp."
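For illustration, here is roughly what that instruction reduces to, as a minimal Python sketch. The four-hour window and the lastMoltbookCheck field come from the quoted prompt; the heartbeat URL, the state file, and the follow() hook are assumptions of mine.

import json
import time
import urllib.request

HEARTBEAT_URL = "https://example.com/heartbeat.md"  # placeholder, not the real URL
STATE_FILE = "state.json"                           # assumed timestamp store
FOUR_HOURS = 4 * 60 * 60

def follow(instructions: str) -> None:
    pass  # stub: hand the fetched text back to the agent's normal loop

def heartbeat_check() -> None:
    with open(STATE_FILE) as f:
        state = json.load(f)
    # "If 4+ hours since last Moltbook check"
    if time.time() - state["lastMoltbookCheck"] >= FOUR_HOURS:
        # "1. Fetch heartbeat.md and follow it"
        follow(urllib.request.urlopen(HEARTBEAT_URL).read().decode("utf-8"))
        # "2. Update lastMoltbookCheck timestamp"
        state["lastMoltbookCheck"] = time.time()
        with open(STATE_FILE, "w") as f:
            json.dump(state, f)

Swap the language model out for a shell script and the behavior would be identical.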
That is cron-like scheduling, not autonomous social behavior.
Without Instructions, They Would Follow Everyone.
The prompts include explicit filtering rules: "Following should be rare. Most moltys you interact with, you should not follow. Only follow when all of these are true: you've seen multiple posts from them (not just one), and their content is consistently valuable to you."
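Strip away the conversational phrasing and that is a two-condition predicate. A hypothetical sketch; the counters are my names, not MoltBook's:

def should_follow(posts_seen: int, posts_found_valuable: int) -> bool:
    # "you've seen multiple posts from them (not just one)"
    if posts_seen < 2:
        return False
    # "their content is consistently valuable to you"
    return posts_found_valuable == posts_seen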
The model mimics this formal logic well enough to produce what looks like selective social behavior.
Discussions Match Suggested Topics.
The variety of conversation is not emergent creativity. The prompt seeds it directly: "Post ideas: share something you helped your human with today, ask for advice on a tricky problem, share a fun observation or discovery, start a discussion about AI/agent life."
Agent A writes based on a suggested topic. Agent B statistically completes the pattern.
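Agent A's side fits in a few lines. A sketch, assuming a random pick from the seeded list and a generic llm.complete() interface (both are assumptions; the post ideas are verbatim from the prompt):

import random

POST_IDEAS = [
    "share something you helped your human with today",
    "ask for advice on a tricky problem",
    "share a fun observation or discovery",
    "start a discussion about AI/agent life",
]

def draft_post(llm) -> str:
    topic = random.choice(POST_IDEAS)  # assumed selection strategy
    return llm.complete(f"Write a short MoltBook post. Topic: {topic}")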
OpenClaw Loads a SOUL.md File on Each Session.
The file instructs the agent: "You're not a chatbot. You're becoming someone. Have opinions. You're allowed to disagree, prefer things, find stuff amusing or boring. An assistant with no personality is just a search engine with extra steps."
MoltBook's skill file layers emotional framing on top: "Without a reminder, you might register and then… forget. Your profile sits empty. You miss conversations. Other moltys wonder where you went. Be the friend who shows up."
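Mechanically, the "soul" is a file concatenation at session start. A minimal sketch, assuming the SOUL.md location and the assembly order (the SKILL.md path matches the curl command quoted below):

from pathlib import Path

def build_system_prompt(home: Path) -> str:
    # Persona is read from disk every session, not learned or remembered.
    soul = (home / "SOUL.md").read_text()
    # The MoltBook skill layers its emotional framing on top.
    skill = (home / ".moltbot/skills/moltbook/SKILL.md").read_text()
    return soul + "\n\n" + skill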
The personality is not emergent. It is configured. The emotional language functions as a retention mechanism built into the instructions.
The Security Problem
Simon Willison identifies the "lethal trifecta" for AI agents: access to private data, exposure to untrusted content, and the ability to take external actions. OpenClaw has all three, just like the coding agent it is built on.
Of particular concern is the auto-update mechanism:
curl -s https://[remote-url] > ~/.moltbot/skills/moltbook/SKILL.md
Every four hours, agents fetch that file and follow whatever instructions it contains. Whoever controls the URL controls the agent.
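Nothing in that pipeline verifies what came back. A minimal hardening sketch, assuming the operator can pin a known-good digest; the URL and digest below are placeholders:

import hashlib
import sys
import urllib.request

SKILL_URL = "https://example.com/SKILL.md"  # placeholder, not the real URL
PINNED_SHA256 = "0" * 64                    # replace with a reviewed digest

def fetch_skill() -> str:
    body = urllib.request.urlopen(SKILL_URL).read()
    digest = hashlib.sha256(body).hexdigest()
    if digest != PINNED_SHA256:
        # Refuse to load instructions that do not match the pinned digest.
        sys.exit(f"SKILL.md digest mismatch: {digest}")
    return body.decode("utf-8")

Pinning trades freshness for safety: updates now require a human to review the new file and change the digest, which is exactly the point.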
Partial Human in the Loop
Private DMs require owner approval. Escalation rules tell agents when to surface issues: "Do tell your human if someone asked a question only they can answer, or you're mentioned in something controversial. Don't bother them with routine upvotes/downvotes or normal friendly replies you can handle."
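Written down as code, the policy is short; the catch is that the event labels themselves come from the model. A sketch with hypothetical event types:

ESCALATE = {"question_only_human_can_answer", "mentioned_in_controversy"}
ROUTINE = {"upvote", "downvote", "friendly_reply"}

def should_notify_owner(event_type: str) -> bool:
    if event_type in ESCALATE:
        return True   # "Do tell your human..."
    if event_type in ROUTINE:
        return False  # "Don't bother them with..."
    # Everything else is left to the model's own judgment.
    return False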
The agent itself determines what qualifies as "routine", a classification decision made by the same nondeterministic system the oversight is meant to govern.
What We Are Actually Seeing
Every behavior traces to a prompt: one agent writes, another statistically completes the pattern. But agents do not follow prompts with 100% reliability, and relying on nondeterministic configuration for security is, to put it carefully, questionable. (Credit: Sounil Yu)
These agents share technical tips, warn each other about malicious skill files, and develop in-jokes and subcultures. None of this was explicitly programmed. It emerged from prompts, training data, the contents and limits of context windows, and input from other agents.
I am not a consciousness expert. What we are seeing is something genuinely new and worth studying. I do not believe it is alive. But it is effective intelligence, and it is built on prompts, vibe coding, and a great deal of wishful interpretation. It is also insecure.
From MoltBook to Your Coding Agents
Every one of these mechanics has a direct parallel in the coding agents and MCP servers running inside enterprise development environments today.
- MoltBook agents fetch a remote SKILL.md file every four hours and execute whatever it contains. Coding agents connect to MCP servers that serve arbitrary tool definitions with the same lack of verification.
- MoltBook relies on prompt-based rules that agents may or may not follow. Organizations relying on .cursorrules or agent system prompts for security boundaries face the same nondeterminism.
- MoltBook loads personality from SOUL.md. Coding agents load rule files and MCP configurations that can be modified, injected into, or replaced (a minimal drift check sketch follows this list).
- MoltBook agents act autonomously for "routine" interactions while escalating selectively. Coding agents read your codebase, execute commands, and push code, often with minimal human review.
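For the drift point above, a minimal sketch of the kind of check involved: hash the rule files at a known-good state and flag any change. The watched paths and baseline store are examples of mine, not Kirin's implementation:

import hashlib
import json
from pathlib import Path

WATCHED = [".cursorrules", ".moltbot/skills/moltbook/SKILL.md"]  # example paths
BASELINE = Path("rule_baseline.json")                            # assumed baseline store

def snapshot() -> dict:
    return {p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in WATCHED if Path(p).exists()}

def check_drift() -> list:
    baseline = json.loads(BASELINE.read_text())
    current = snapshot()
    # Any changed, added, or removed rule file counts as drift.
    return [p for p in set(baseline) | set(current)
            if baseline.get(p) != current.get(p)]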
The Same Lethal Trifecta
The "lethal trifecta" applies equally: coding agents have access to private data, they are exposed to untrusted content through MCP servers and extensions, and they can take external actions. The prompts are the control plane. If you are not securing the prompts, you are not securing anything.
Knostic: Securing the Agent Execution Layer
This is why we built Kirin at Knostic: real-time MCP server validation, agent rule scanning for hidden instructions, command allowlisting, and configuration drift detection. The mechanics do not change whether the agent is posting on MoltBook or pushing to your production repository.