This is a fan-made community site, not affiliated with the official OpenClaw project or Anthropic. github.com/openclaw/openclaw

OpenClaw 4.5–4.11: The Lobster Learns to Create — Video, Music, and Memory That Remembers for You

OpenClaws.io Team


@openclaws

April 11, 2026

9 min read


Six days. Four releases: 4.5, 4.7, 4.10, 4.11.

If 3.31–4.2 was the siege — the lobster learning to defend itself — then 4.5 through 4.11 is what happens after the armor is on. You start making things. Remembering things. The lobster that used to just answer questions now shoots video, writes music, and pulls the right piece of context into your reply before you ask for it.

Video Generation Is a First-Class Tool

Until 4.5, video was a side project. You could wire something up through a plugin, but there was no shared tool name, no provider registry, no fallback chain. That changed.

video_generate is now a built-in tool. Agents call it the same way they call image_generate. The result comes back as attached media, delivered through whatever channel the conversation is running on — Telegram, Discord, Slack, iMessage, doesn't matter.

The bundled providers at launch: xAI (grok-imagine-video), Alibaba Model Studio Wan, Runway. 4.10 added Seedance 2.0 through the fal provider with full duration, resolution, audio, and seed support. 4.11 added URL-only asset delivery, reference audio inputs, per-asset role hints, and adaptive aspect ratio — so providers can expose richer modes without forcing enormous files into memory.

Auto-fallback across auth-backed image, music, and video providers landed in 4.7. Intent is preserved during switches. Size, aspect, resolution, and duration hints get remapped to the closest supported option instead of hard-failing. If one provider can't handle the request, the next one in the chain gets a translated version.
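The fallback behavior described above can be sketched roughly like this: walk the chain, skip providers that can't handle the request's modality, and remap numeric hints to the closest supported value instead of hard-failing. Provider names and fields here are illustrative stand-ins, not OpenClaw's real registry.

```python
def closest(value, supported):
    """Remap a numeric hint to the nearest value the provider supports."""
    return min(supported, key=lambda s: abs(s - value))

def run_with_fallback(chain, request):
    for provider in chain:
        if request["modality"] not in provider["modalities"]:
            continue  # the next provider in the chain gets the request
        translated = dict(request)
        # Remap the duration hint instead of hard-failing on a mismatch.
        translated["duration"] = closest(request["duration"], provider["durations"])
        return provider["name"], translated
    raise RuntimeError("no provider in the chain could take the request")

chain = [
    {"name": "provider-a", "modalities": {"image"}, "durations": [6]},
    {"name": "provider-b", "modalities": {"image", "video"}, "durations": [4, 8, 12]},
]
name, req = run_with_fallback(
    chain, {"modality": "video", "prompt": "a lobster surfing", "duration": 10}
)
# provider-a is skipped (no video); provider-b takes a translated version
# of the request with the duration remapped to the closest supported value
```

The key design point from the release notes: intent survives the switch, and hints degrade gracefully rather than erroring out.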

Music Generation, Too

Same release, same pattern. music_generate is a built-in tool with bundled Google Lyria and MiniMax providers. Async tracking with follow-up delivery when the audio finishes. Optional hints a provider doesn't support — like durationSeconds on Lyria — get ignored with a warning instead of killing the request.
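That "ignore with a warning" behavior is easy to picture as a hint filter: anything the provider doesn't declare support for gets dropped with a warning rather than failing the request. The field names below are assumptions for illustration, not OpenClaw's actual schema.

```python
import warnings

def filter_hints(hints, supported):
    """Keep supported hints; warn about and drop the rest."""
    accepted = {}
    for key, value in hints.items():
        if key in supported:
            accepted[key] = value
        else:
            warnings.warn(f"provider does not support hint {key!r}; ignoring")
    return accepted

# Assumption for the sketch: Lyria accepts a prompt but not durationSeconds.
lyria_supported = {"prompt", "negativePrompt"}
accepted = filter_hints(
    {"prompt": "lo-fi lobster beats", "durationSeconds": 30},
    lyria_supported,
)
# durationSeconds is warned about and dropped; the request still goes through
```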

Prefer to run everything locally? The bundled ComfyUI workflow plugin in 4.5 covers image_generate, video_generate, and workflow-backed music_generate against both local ComfyUI and Comfy Cloud. Prompt injection, optional reference image upload, live tests, output download — the full loop.

`openclaw infer`: One CLI for All Inference

4.7 landed openclaw infer as a first-class hub for provider-backed inference workflows. Model, media, web, and embedding tasks all live under the same command. Transcription supports per-request prompt and language overrides. Web search and web fetch behave the same way the agent runtime would run them.

If you've been stitching together one-off scripts to run inference outside the chat loop, this is the replacement.
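The "one hub" idea reduces to a dispatcher: every task type lives behind one entry point, and per-request overrides (like transcription prompt and language) are merged over defaults. The handler names and options below are assumptions, not the real `openclaw infer` interface.

```python
DEFAULTS = {"transcribe": {"language": "en", "prompt": ""}}

def infer(task, handlers, **overrides):
    """Dispatch a task to its handler; per-request overrides win over defaults."""
    options = {**DEFAULTS.get(task, {}), **overrides}
    return handlers[task](options)

handlers = {
    "transcribe": lambda opts: f"transcribing with language={opts['language']}",
    "embed": lambda opts: [0.0, 0.0],  # placeholder embedding vector
}

result = infer("transcribe", handlers, language="fr")
```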

Active Memory: The Lobster Starts Remembering

This is the one users are going to feel most.

Before 4.10, memory was something you had to ask for. "Remember that I prefer dark mode." "Search memory for that API key workflow." The lobster would do it, but only if you remembered to tell it.

Active Memory flips that. It's an optional plugin that runs a dedicated memory sub-agent right before the main reply on every turn. The sub-agent pulls preferences, past details, and relevant context into the prompt automatically. You don't have to remember to remember.

It's configurable: message-scoped, recent-scoped, or full-context modes, depending on how aggressive you want it. /verbose lets you watch, live, what's being pulled. Advanced prompt and thinking overrides are there for tuning, and opt-in transcript persistence is there for when you need to debug a particular memory decision.

4.12 tightened it. Recall runs stay on the resolved channel even when wrappers like mx-claw are in play. Lexical fallback ranking improved. Active Memory results now sit on the hidden untrusted prompt-prefix path instead of going into the system prompt directly — so you can see exactly what the model received in gateway debug logs.
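The pre-pass shape is the interesting part: recall runs before the main reply on every turn, and whatever it surfaces is injected into the prompt context. Here's a toy sketch of that flow, with a naive lexical recall standing in for the real memory sub-agent.

```python
# Toy memory store; the real plugin backs this with memory search.
MEMORY_STORE = [
    "user prefers dark mode",
    "user's deploy target is Fly.io",
    "user dislikes emoji in replies",
]

def recall(message):
    """Naive lexical recall: surface memories sharing a word with the message."""
    words = set(message.lower().split())
    return [m for m in MEMORY_STORE if words & set(m.lower().split())]

def reply(message):
    recalled = recall(message)  # runs before the main reply, every turn
    context = "\n".join(recalled)
    return f"[context]\n{context}\n[reply to] {message}"

out = reply("set up my deploy pipeline")
# the deploy-target memory rides along without the user asking for it
```

The user never typed "search memory" — the relevant detail was pulled in because the pre-pass always runs first.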

Codex Gets Its Own Provider

4.10 split Codex out of the OpenAI provider path. codex/gpt-* models now use Codex-managed auth, native threads, model discovery, and compaction through a plugin-owned app-server harness. openai/gpt-* stays on the standard OpenAI provider.

Practical result: your Codex subscription stops stepping on your OpenAI API key. Auth profiles are isolated. Model listings come from the Codex catalog. 4.14 followed up with forward-compat support for gpt-5.4-pro, including Codex pricing and limits visibility before the upstream catalog caught up.
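The split boils down to prefix routing with isolated auth profiles: the provider prefix on the model ID decides which credential store handles the request, so the two never touch. The names and model IDs below are illustrative, not OpenClaw's actual catalog.

```python
# Each provider carries its own auth profile; they never share credentials.
PROVIDERS = {
    "codex": {"auth": "codex-managed-session", "models": ["gpt-example-codex"]},
    "openai": {"auth": "OPENAI_API_KEY", "models": ["gpt-example"]},
}

def route(model_id):
    """Split 'provider/model' and resolve against that provider's auth profile."""
    prefix, _, model = model_id.partition("/")
    provider = PROVIDERS[prefix]
    return provider["auth"], model

auth, model = route("codex/gpt-example-codex")
# codex/* resolves against Codex-managed auth, never the OpenAI API key
```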

LM Studio Is a Bundled Provider Now

4.12 shipped a bundled LM Studio provider. Onboarding flow, runtime model discovery, stream preload, memory search embeddings — the full first-class path. If you run local models through LM Studio, you no longer have to configure it as a generic OpenAI-compatible endpoint and hope the capability detection works out.

Smaller Things Worth Mentioning

  • Arcee AI provider (4.7): Trinity catalog, OpenRouter support, onboarding guidance.
  • Gemma 4 support (4.7): explicit thinking-off semantics preserved through the Gemma compatibility wrapper.
  • Bundled Qwen, Fireworks AI, StepFun (4.5) — plus MiniMax TTS, Ollama Web Search, and MiniMax Search integrations.
  • Amazon Bedrock (4.5): inference-profile discovery and automatic request-region injection. IAM auth works through the credential chain without AWS_BEARER_TOKEN_BEDROCK exports.
  • Compaction provider registry (4.7): plugins can replace the built-in summarization pipeline. Falls back to LLM summarization on provider failure.
  • Persisted compaction checkpoints (4.7): Sessions UI branch/restore actions for inspecting and recovering pre-compaction state.

The Shape of This Cycle

Three themes, running in parallel:

  1. Make things. Video, music, local workflow runners. The lobster that used to just answer questions now produces output that lives outside the chat.
  2. Remember things. Active Memory turns memory from something you call into something that calls itself.
  3. Cleaner boundaries. Codex gets its own lane. LM Studio becomes first-class. Provider auth stops bleeding across contexts.

None of this is a single flagship feature. It's a series of upgrades that individually look incremental and collectively change how the lobster feels. Ask for a video, you get a video. Mention a preference once, it sticks. Switch between Codex and OpenAI GPT, your auth doesn't collide.

Six days, four releases, a lobster that's moved from "it can do things" to "it does things for you."
