AI This Week: What B2B Leaders Need to Know — May 25, 2026

This week’s biggest signal: agentic AI crossed an operational threshold. OpenAI’s Codex can now control a Mac with its screen locked, Perplexity launched an always-on Personal Computer running on a dedicated Mac mini, and Google rolled out Gemini Spark as a continuous background agent inside Workspace. Different stacks, same direction of travel.

Perplexity

What happened

Perplexity upgraded voice mode to OpenAI’s GPT Realtime 1.5 with 25%+ more reliable interactions, launched a Personal Computer that runs as always-on AI on a dedicated Mac mini, and opened Comet to enterprise organizations with silent MDM deployment. Pro Search and Deep Research can now produce presentations directly, and Comet bundles CB Insights, PitchBook, and Statista premium data inside the workflow.

What it means for your agentic build

Perplexity is collapsing three line items at once — research data, browser deployment, and BI tooling — into one bundle. Enterprise buyers should audit existing premium research subscriptions for overlap before the next renewal cycle and put Comet on a short list of managed AI browser pilots. Personal Computer signals the wider shift: dedicated hardware running an agent 24/7 will become a recognizable budget category by year-end.

OpenAI

What happened

OpenAI’s Codex can now control a Mac while the screen is locked, removing a major friction point for autonomous overnight automation. The company is reportedly preparing to file an IPO, though Sam Altman noted filing is different from being ready; Q1 ran at roughly $25B annualized revenue against $30B spend. The Pentagon began controlled evaluations of OpenAI and Google on classified workloads previously running on Anthropic Claude.

What it means for your agentic build

Locked-Mac control means agents can actually run unattended, which is the operational unlock most enterprise pilots have been waiting for. Defense procurement shifts open new RFPs but also signal renewed competitive churn at the top of the model market. CFOs should weigh the IPO-timing question carefully on multi-year commitments and require provenance signatures, such as C2PA and SynthID, in any content-generation procurement spec going forward.

Anthropic

What happened

Claude Opus 4.7 went generally available with measurable gains on the hardest software-engineering tasks. Anthropic raised Code and Opus API limits, introduced Claude Managed Agents (multiagent orchestration, outcomes, webhooks), and shipped Compliance API integrations so IT and security can govern Claude alongside the rest of the stack. Beginning June 15, programmatic Claude usage will be metered separately from chat subscriptions on dedicated monthly credits.

What it means for your agentic build

Compliance API removes one of the bigger adoption blockers in regulated industries — Claude can now be governed with familiar tooling. The June 15 metering split forces a structural change in how agents are budgeted: programmatic usage is no longer a free rider on chat seats. Build a per-agent cost model now so the CFO is not surprised, and map Claude’s compliance controls against your existing GRC stack before next quarter’s renewal conversation.
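A per-agent cost model can start as a few lines of arithmetic. The sketch below is illustrative only: the agent names, token volumes, per-million-token prices, and credit budget are all hypothetical placeholders, not Anthropic's published rates or the actual credit mechanics, so substitute figures from your own contract and usage logs.

```python
from dataclasses import dataclass


@dataclass
class AgentUsage:
    """Monthly usage profile for one agent (all figures are placeholders)."""
    name: str
    runs_per_month: int
    input_tokens_per_run: int
    output_tokens_per_run: int


def monthly_cost(usage, input_price_per_mtok, output_price_per_mtok):
    """Estimate one agent's monthly spend in dollars from token volume."""
    input_mtok = usage.runs_per_month * usage.input_tokens_per_run / 1_000_000
    output_mtok = usage.runs_per_month * usage.output_tokens_per_run / 1_000_000
    return input_mtok * input_price_per_mtok + output_mtok * output_price_per_mtok


def budget_report(agents, monthly_credits, input_price_per_mtok, output_price_per_mtok):
    """Return per-agent costs, the total, and whether the total exceeds the credits."""
    costs = {
        a.name: monthly_cost(a, input_price_per_mtok, output_price_per_mtok)
        for a in agents
    }
    total = sum(costs.values())
    return costs, total, total > monthly_credits


if __name__ == "__main__":
    # Hypothetical agents and prices, chosen only to make the arithmetic visible.
    agents = [
        AgentUsage("ticket-triage", 1000, 2000, 500),
        AgentUsage("contract-review", 200, 50_000, 4000),
    ]
    costs, total, over = budget_report(
        agents, monthly_credits=100.0,
        input_price_per_mtok=5.0, output_price_per_mtok=25.0,
    )
    for name, cost in costs.items():
        print(f"{name}: ${cost:,.2f}/month")
    print(f"total: ${total:,.2f}/month, over budget: {over}")
```

Keeping input and output tokens separate matters because output is typically priced higher, so an agent that writes long responses can cost more than its run count suggests.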

Google DeepMind

What happened

Coming out of I/O 2026, Gemini 3.5 and Gemini Spark are rolling out broadly, with Spark running continuously in the background of Workspace. Demis Hassabis framed the launch as combining Gemini’s intelligence with DeepMind’s generative media stack for a new level of world understanding, and Google called the Search overhaul the biggest in nearly thirty years. Gemini Omni now supports in-chat video remix and edit with strong prompt adherence.

What it means for your agentic build

Spark embeds agents inside Workspace seats your company already buys, removing the procurement step that has slowed adoption elsewhere. The Search overhaul will reset SEO and SEM playbooks — brief marketing and demand-gen leads this week. Gemini Omni’s video-editing capability collapses production pipelines that previously required external vendors; identify three repeatable video workflows worth running through it before next quarter.

Meta AI

What happened

At LlamaCon, Meta released Llama 4 Scout and Llama 4 Maverick — the first open-weight, natively multimodal mixture-of-experts models, with Scout supporting 256K context. A new fine-tuning and evaluation API targets Llama 3.3 8B, and Meta released Llama Guard 4, LlamaFirewall, and Llama Prompt Guard 2 for production-grade safety. The second round of Llama Impact Grants distributed $1.5M+ across ten international recipients.

What it means for your agentic build

Open-weight, natively multimodal, long-context models lower the floor for BYOM enterprises that want to own their stack. The new safety tooling reduces the amount of in-house guardrail engineering required for production deployment — re-evaluate your current safety stack against LlamaFirewall and Prompt Guard 2. The fine-tuning API on 3.3 8B makes domain customization cheaper, opening Llama as a credible base for vertical models.

xAI

What happened

Elon Musk announced xAI will be folded into a new SpaceXAI division under SpaceX, with Grok and X both moving into the same group; another roughly ten employees were laid off and the Grok team restructured. Grok 4.3 launched May 4 with built-in reasoning, a 1M-token context window, and native video input. New connectors landed for Vercel, Canva, Gamma, and S&P Global live market data.

What it means for your agentic build

Structural turmoil and continued layoffs increase vendor risk for enterprise buyers, but Grok 4.3’s native video input plus S&P Global integration make it a credible candidate for sales intelligence and market research workflows. If Grok is in your stack, document a contingency plan tied to the SpaceXAI transition. If it is not, the new connector wave is worth a focused pilot in one customer-facing intelligence use case.

DeepSeek

What happened

DeepSeek V4-Pro and V4-Flash, previewed late April, continued to dominate technical chatter this week. V4-Pro is a 1.6T-parameter MoE with 49B active params, 1M-token context, and benchmarks that beat all rival open models on math and coding while trailing only Gemini 3.1-Pro on world knowledge. Efficiency is the headline — V4-Pro uses 27% of V3.2’s compute and 10% of its memory at 1M tokens; V4-Flash drops to 10% and 7% respectively. Weights remain openly downloadable.
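To put those percentages in more familiar terms, the short sketch below converts the quoted figures into reduction factors relative to V3.2 and computes the share of V4-Pro's parameters active per token. It uses only the numbers cited above; the function and variable names are illustrative, and the results are simple ratios, not measured benchmarks.

```python
# Fractions of V3.2's cost at 1M tokens, as quoted in the text (V3.2 = 1.0).
RELATIVE_COST_AT_1M_TOKENS = {
    "V4-Pro": {"compute": 0.27, "memory": 0.10},
    "V4-Flash": {"compute": 0.10, "memory": 0.07},
}


def reduction_factor(fraction_of_baseline):
    """Convert 'uses X of the baseline' into 'baseline is N times larger'."""
    return 1.0 / fraction_of_baseline


def active_fraction(active_params, total_params):
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_params / total_params


if __name__ == "__main__":
    for model, costs in RELATIVE_COST_AT_1M_TOKENS.items():
        compute_x = reduction_factor(costs["compute"])
        memory_x = reduction_factor(costs["memory"])
        print(f"{model}: {compute_x:.1f}x less compute, {memory_x:.1f}x less memory than V3.2")

    # V4-Pro: 49B active parameters out of 1.6T total, per the text.
    share = active_fraction(49e9, 1.6e12)
    print(f"V4-Pro active parameter share: {share:.1%}")
```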

What it means for your agentic build

Near-frontier open-weight performance at dramatically lower inference cost reshapes the build-vs-buy calculus for any organization with internal infrastructure. The catch is governance: US government, defense, and many regulated buyers will face compliance scrutiny on Chinese-sourced weights. Have legal and procurement publish a written position before engineering decisions force the conversation, and sandbox V4-Flash for non-sensitive workloads to benchmark internally.

Mistral AI

What happened

Mistral’s annualized revenue hit roughly $400M, up from $20M a year ago, with a path to $1B by year-end. The Paris startup closed a near-$2B Series C at about $14B valuation, led by ASML, with Nvidia, Andreessen Horowitz, and Lightspeed joining. New products include a vibe-coding tool, a corporate chatbot, and Forge, a model purpose-built for fine-tuning on proprietary domain data. Accenture launched a joint enterprise-reinvention initiative, and CNBC’s Disruptor 50 ranked Mistral #7.

What it means for your agentic build

ASML and Nvidia anchoring the round commits Mistral to Europe’s strategic-autonomy narrative — a tailwind for any EU-data-residency or sovereign-AI RFP. Forge enables genuine domain customization without leaking proprietary data to a US frontier lab, and the Accenture channel accelerates enterprise rollout. European subsidiaries should add Mistral to short lists for sovereign-AI procurement this quarter.

Cohere and Aleph Alpha

What happened

Cohere released Command A+ as open source — a 218B-parameter MoE with 25B active params, 128K context, and twice the speed of its predecessor. The company also acquired biomedical firm Reliant AI for a forthcoming North for Pharma vertical, signed a sovereign-AI MoU with Indra Group for Spain, Canada, and Europe, and confirmed the $20B combined-group acquisition of Aleph Alpha — endorsed by the Canadian and German digital ministers under the Canada-Germany Sovereign Technology Alliance. Schwarz Group is committing $600M to the Series E.

What it means for your agentic build

The combined Cohere-Aleph Alpha entity is now the most credible Western alternative for sovereign AI, with government backing that maps cleanly to critical-infrastructure procurement. North for Pharma signals the broader move from horizontal models to industry-specific suites. If you sell into EU regulated industries (defense, energy, finance, healthcare, manufacturing, telecom), map your AI partner story against this sovereign narrative before your next renewal cycle.

This Week’s Structural Trends

Always-on agents have arrived. OpenAI Codex controlling locked Macs, Perplexity Personal Computer on a dedicated Mac mini, and Gemini Spark running continuously inside Workspace all point in the same direction: supervised AI is no longer the default. Budget categories and operational policies will follow.

Sovereign AI consolidates. Cohere’s $20B Aleph Alpha acquisition with explicit Canadian and German government endorsement, Mistral’s ASML-led raise, and the Pentagon’s evaluation of OpenAI and Google for classified workloads previously run on Claude together make geopolitics a first-class vendor-selection criterion. Procurement teams need a policy this quarter.

The open-weight frontier compresses. DeepSeek V4-Pro near Gemini 3.1-Pro on world knowledge, Cohere Command A+ open-sourced, and Llama 4 natively multimodal at long context together erode the premium on closed-source frontier models. Build-vs-buy is now a harder call, weighing data control against cutting-edge capability.

Sources

Perplexity Hub and Comet release notes; OpenAI News; Anthropic Opus 4.7 announcement and InfoWorld coverage; Google DeepMind blog and I/O 2026 recaps; Meta AI LlamaCon recap; Wikipedia and eWeek on Grok 4.3 and SpaceXAI; TechCrunch, MIT Technology Review, and Al Jazeera on DeepSeek V4; Mistral news and CNBC Disruptor 50; Cohere Command A+ release coverage and PitchBook reporting on the Aleph Alpha acquisition.
