AI News Weekly: Claude Controls Your PC, Sora Gets the Axe, and the Gemini Live Era Begins
This week has been an absolute whirlwind in the AI space. It feels like every time I blink, a new model drops or a major strategy shifts. If you’re feeling a bit overwhelmed, don’t worry—I’ve spent the week sifting through the noise to find the actual signals.
Let's dive into the biggest updates, from Anthropic’s breakneck shipping speed to OpenAI’s surprising "side quest" cancellations.
Anthropic: 74 Releases in 52 Days?
Anthropic has been on a tear. They’ve been shipping features so fast that the Product Compass mapped it out on a calendar. The most impactful update for most of us? Computer Use.
Available for paid users, this feature lets Claude control your mouse and keyboard. You can give it a task, like opening DaVinci Resolve to find a specific tool, and watch it work.
- The Reality Check: While it’s incredibly cool to be "hands-free," it is currently painfully slow. A 10-second task might take 5 minutes.
- The Pro Tip: The real power comes when you combine this with the Dispatch feature on mobile. You can text your computer from your phone while you're out, tell it to handle a task, and let it click away while you’re grabbing coffee.
Coding Quality of Life
For the devs out there, Claude Code got an "Auto Mode." No more constant permission prompts for benign terminal commands or web searches. It’s a small change that makes a massive difference in workflow.
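To make the idea concrete, here is a minimal sketch of what "skip the prompt for benign commands" could look like in principle. This is not how Claude Code actually implements Auto Mode; the allowlist, the `needs_permission` helper, and the set of risky characters are all hypothetical names and choices made up for illustration.

```python
import shlex

# Hypothetical allowlist: read-only commands assumed safe to run unprompted.
BENIGN_COMMANDS = {"ls", "pwd", "cat", "grep", "echo"}

# Characters that can chain, redirect, or substitute commands in a shell.
# Anything containing them falls back to asking the user.
RISKY_CHARS = set(";&|<>`$\n")


def needs_permission(command: str) -> bool:
    """Return True when the command should still prompt the user."""
    if any(ch in RISKY_CHARS for ch in command):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: be cautious and ask.
        return True
    if not tokens:
        return True
    # Only the program name is checked against the allowlist.
    return tokens[0] not in BENIGN_COMMANDS


if __name__ == "__main__":
    for cmd in ["ls -la", "grep foo notes.txt", "rm -rf build", "ls; rm x"]:
        print(cmd, "->", "ask" if needs_permission(cmd) else "auto-run")
```

The design choice worth noting is that the sketch fails closed: anything it cannot parse or does not recognize still triggers a prompt, so only the explicitly listed commands run without asking.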
The All-in-One Contender: GenSpark
If you’re tired of paying $20/month each for five different AI subscriptions, GenSpark is making a massive play. They’re offering an all-in-one workspace with unlimited usage of top-tier models (through 2026) for a flat $20 monthly fee. It handles everything from research and slide decks to full brand assets in one tab.
Google’s Multimodal Flex
Google isn't sitting this one out. They released Gemini 3.1 Flash Live, which is now integrated across their API, Search, and the Gemini app.
The most underrated feature here is the multimodal live conversation. You can share your screen (using OBS, for example) and have Gemini walk you through settings in real-time. It’s essentially the personal tutor we were always promised.
The "Vibecoded" Browser: Google also teased an experimental browser built with Gemini 3.1 Flash. You type a prompt (like "Taco Cat Parade"), and it generates a functional web page in real-time. It’s more of a novelty for now, but the speed of generation is a glimpse into the future of the web.
The Sound of AI: Music and Voice
This was a huge week for audio. We saw a "battle of the bands" between Google and the independent players:
| Tool | Key Update |
| --- | --- |
| Lyria 3 Pro (Google) | Now generates 3-minute tracks with structured verses/choruses. |
| Suno 5.5 | Introduced Voice Cloning. You can now train the AI on your own voice to sing your prompts. |
| Smallest.ai | A new ElevenLabs competitor designed specifically for conversational agents (it even handles "ums" and "ahs"). |
| Voxtral (Mistral) | An open-weight text-to-speech model that you can run locally. |
Rapid Fire: The OpenAI Pivot
The biggest shocker? OpenAI is killing Sora. In a move to focus on their core strengths—coding and chat models—OpenAI is shutting down the Sora app, the video generator, and the API. This move also effectively ended their massive partnership with Disney. It seems OpenAI is trading "side quests" for a dedicated focus on Agentic Commerce.
They are leaning heavily into making ChatGPT a shopping destination, letting users compare products and letting businesses list their product feeds directly in the chat.
The "Claude Mythos" Leak
Finally, a leaked (and quickly deleted) blog post from Anthropic mentioned a new tier of model called Claude Mythos.
- The Claim: It’s larger and more intelligent than Opus, with "scary-strong" capabilities in cybersecurity and coding.
- The Catch: Anthropic reportedly warns it will be extremely expensive to serve and use.
Closing Thoughts
We’re moving away from AI as just a "chatbot" and toward AI as an agent—something that moves your mouse, shops for your shoes, and codes your apps while you sleep. It's a lot to keep track of, but that's why I'm here.
What do you think? Are you ready to let Claude take over your mouse, or is it still too slow for your workflow?