Engineering
15 min read

AWS Just Gave AI Agents Their Own Cloud API

Izzy A
CTO @PromptMetrics


There are two ways to read AWS's announcement of the Agent Toolkit for AWS on May 6, 2026. The narrow reading: a nice dev tool that connects Claude Code and Cursor to AWS through the Model Context Protocol. The broader reading, the one engineering leaders should pay attention to, is that we just watched the largest cloud provider bet its platform on AI agents becoming the dominant interface to infrastructure.

Seventy-two percent of enterprises are already testing or using AI agents in some capacity (Mayfield CXO Survey, January 2026). But the gap between agents running on developer laptops and agents deployed with enterprise governance is vast. AWS just built the bridge.

This piece unpacks what the Agent Toolkit actually ships, why the IAM context keys are the real headline, how this compares to what Google and Microsoft are doing, and what engineering leaders should do next.

Key Takeaways

  • AWS now offers a managed MCP server covering 15,000+ API operations across 300+ services through 4 auditable tools (AWS, 2026).

  • New IAM condition keys let security teams write policies that distinguish AI agent actions from human actions, a capability no other cloud provider ships natively.

  • 78% of enterprise AI teams already run MCP-backed agents in production, but 60% of organizations lack formal AI governance frameworks (Mayfield, 2026).

What Does the Agent Toolkit Actually Ship?

The announcement bundles four components into a single suite: the AWS MCP Server, agent skills, plugins, and rules files. It's worth understanding each because together they represent a new layer of cloud infrastructure, one purpose-built for machines calling APIs rather than humans clicking consoles.

The managed MCP Server is the backbone. Through four tools (call_aws, search_documentation, read_documentation, and run_script), an AI coding agent can interact with 15,000+ API operations spanning all 300+ AWS services (AWS MCP Server GA Announcement, 2026). That's every AWS API surface available through a single, auditable endpoint. New service APIs get supported within days of launch. The tool list stays deliberately short and fixed, which reduces the number of schemas the model must process and cuts the risk of hallucination.

The sandboxed run_script tool gets slept on but deserves attention. It lets the agent write a short Python script that runs server-side with inherited IAM permissions and zero network access. Instead of making five sequential call_aws requests, the agent chains API calls, filters responses, and computes results in one round-trip. For complex multi-service workflows — think: create a VPC, attach an Internet Gateway, update route tables, launch EC2 — this meaningfully reduces both latency and token consumption.
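To make the round-trip savings concrete, here is a minimal sketch of the kind of script an agent might submit to run_script. The boto3-style client interface and the exact server-side environment are assumptions for illustration; the real sandbox's API surface isn't documented in this piece.

```python
# Hypothetical sketch: a script an agent could submit to run_script,
# chaining several EC2 calls server-side in a single round-trip.
# The boto3-style client is an assumption for illustration.

def provision_network(ec2, cidr="10.0.0.0/16"):
    """Create a VPC, attach an internet gateway, and add a default route."""
    vpc_id = ec2.create_vpc(CidrBlock=cidr)["Vpc"]["VpcId"]
    igw_id = ec2.create_internet_gateway()["InternetGateway"]["InternetGatewayId"]
    ec2.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)
    # Every VPC gets a main route table; find it and add a 0.0.0.0/0 route.
    table = ec2.describe_route_tables(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )["RouteTables"][0]
    ec2.create_route(
        RouteTableId=table["RouteTableId"],
        DestinationCidrBlock="0.0.0.0/0",
        GatewayId=igw_id,
    )
    return {"vpc": vpc_id, "igw": igw_id, "route_table": table["RouteTableId"]}

# Inside the sandbox this would run against a real client created with the
# inherited IAM credentials, e.g. ec2 = boto3.client("ec2").
```

Run as one script, this is one round-trip and one set of tokens; run as individual call_aws invocations, each intermediate response flows back through the model.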

Then there are the 40+ validated skills. These are curated packages of instructions and reference material, maintained by AWS service teams, covering tasks where agents most commonly hallucinate: authoring CloudFormation, configuring Glue pipelines, wiring up Lambda with API Gateway. Instead of the model improvising from training data that's 12 months stale, it loads a skill that reflects current best practices.

The plugins bundle all of this into a single install. At launch, there are three: aws-core (general application development), aws-agents (building AI agents with Bedrock and AgentCore), and aws-data-analytics (ETL pipelines with Glue and Athena). For Claude Code, installation is:

/plugin marketplace add aws/agent-toolkit-for-aws
/plugin install aws-core@agent-toolkit-for-aws

That's the surface-level product. But none of this is why engineering leaders should care.

Why Do AI Agents Need a Different Cloud Interface?

The model doesn't know what it doesn't know. And it doesn't know that it doesn't know it. That's not a bug. It's just what happens when you give an LLM IAM credentials and ask it to provision infrastructure.

AWS's own demo illustrated the problem cleanly. When asked about storing embeddings on S3, Claude Opus 4.6 — whose training cutoff is May 2025 — returned five entirely reasonable solutions. None used Amazon S3 Vectors, a feature that went GA in December 2025, seven months after the model stopped learning. With the MCP server connected, the agent used the search_documentation tool, discovered S3 Vectors, and applied it correctly (AWS MCP Server GA Announcement, 2026).

This isn't a one-off. Multiply it across the pace of AWS releases (several hundred significant service updates per year), and you have a structural gap between what models know and what's actually available. The documentation tools close that gap by giving agents query-time access to current docs. No retraining required.

The broader problem is governance. Today, most enterprise developers using AI coding agents run them on their own laptops against AWS using their own IAM credentials, with no audit trail separating what they typed from what the agent decided to do. An agent that hallucinates a dynamodb:DeleteTable call looks identical to a human who meant to run it. CloudTrail can't tell them apart. Until now.

MCP adoption gives scale to why this matters. MCP SDK downloads hit 97 million monthly as of March 2026, up from ~2 million when Anthropic open-sourced the protocol in November 2024 (Digital Applied, 2026). That's a 4,750% increase in 16 months, faster than React or npm at comparable stages. There are now 10,000+ public MCP servers, and every major AI provider ships MCP support: Anthropic, OpenAI, Google, Microsoft.

But the self-hosted reality is messier than the growth numbers suggest. 72% of the context window is wasted when connecting to 3+ MCP servers. 43% of MCP servers have command-injection vulnerabilities (AgentMarketCap, April 2026). The protocol is wildly successful but still immature from a security standpoint. AWS's entry with a managed, hardened offering addresses exactly the pain point that keeps enterprise security teams up at night.

[Chart: MCP SDK monthly downloads — Nov 2024: 2M; Mar 2025: 12M; Jul 2025: 45M; Nov 2025: 68M; Mar 2026: 97M. Source: Digital Applied, MCP SDK download data, March 2026.]

MCP SDK downloads grew 4,750% in 16 months, faster than React or npm at comparable stages. Source: Digital Applied, March 2026.

The IAM Context Keys Are the Real Headline

Buried a few paragraphs into the announcement is the feature that will matter most to anyone who signs off on cloud security: two new IAM condition context keys that let you write policies differentiating agent actions from human actions.

The keys are aws:ViaAWSMCPService and aws:CalledViaAWSMCP. They're automatically injected into every request that flows through the MCP server. aws:ViaAWSMCPService is a boolean: it returns true when a request came through any MCP server rather than directly from a human. aws:CalledViaAWSMCP is a string containing the specific service principal name, so you can distinguish between MCP servers if you run multiple.

Here's what that enables. You can write an IAM policy that denies s3:DeleteBucket and dynamodb:DeleteTable when the call came through MCP, but allows those same actions when a human is authenticated directly through the console or CLI. Same user, same role, same permissions — different behavior depending on whether the caller was an agent or a person:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": ["s3:DeleteBucket", "dynamodb:DeleteTable"],
            "Resource": "*",
            "Condition": {
                "Bool": { "aws:ViaAWSMCPService": "true" }
            }
        }
    ]
}

This matters because it solves the organizational stalemate that's been blocking production agent deployments. Security teams have been reluctant to allow agents to provision anything because they can't distinguish agent actions from other actions in audit logs. Development teams have been frustrated because they can't get the access their agents need. The IAM context keys break the stalemate by enabling security teams to apply existing IAM primitives to a new category of callers—no new tooling. No parallel auth system: same IAM, new condition.
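As a mental model, the deny statement reduces to a simple predicate over the request context. This toy simulator is not AWS's policy evaluation engine — it models only the one Bool condition key, purely to show the asymmetry between agent and human callers:

```python
# Toy simulator of the deny statement's behavior. The real IAM engine
# evaluates full policy documents; this models only one condition key.

AGENT_DENIED_ACTIONS = {"s3:DeleteBucket", "dynamodb:DeleteTable"}

def is_denied(action: str, context: dict) -> bool:
    """True if the deny statement would match this request."""
    via_mcp = context.get("aws:ViaAWSMCPService") == "true"
    return via_mcp and action in AGENT_DENIED_ACTIONS

# Same action, different caller:
#   agent path (through the MCP server, key present) -> denied
#   human path (console/CLI, key absent)            -> not denied here
```

The asymmetry is the whole point: the human's console session never carries the key, so the deny statement simply never matches for them.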

The observability story reinforces this. CloudWatch publishes metrics under the AWS-MCP namespace, separate from normal service metrics. CloudTrail captures every call with the full IAM context. An engineering leader can answer questions like "how many S3 buckets did our agents create last month, and were any publicly readable?" without grep'ing through raw logs. That kind of auditability is table stakes for enterprise adoption, and it's been missing until now.
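That "how many buckets did our agents create" question can be sketched as a filter over audit events. The field names below are simplified assumptions for illustration, not the exact CloudTrail record schema:

```python
# Illustrative sketch: pull agent-created S3 buckets out of a list of
# CloudTrail-style event dicts. Field names ("viaAWSMCPService",
# "bucketName", "publiclyReadable") are simplified assumptions, not the
# real CloudTrail schema.

def agent_created_buckets(events):
    """Return (bucket, is_public) pairs for buckets created via MCP."""
    found = []
    for e in events:
        if (e.get("eventName") == "CreateBucket"
                and e.get("viaAWSMCPService") is True):
            found.append((e["bucketName"], e.get("publiclyReadable", False)))
    return found
```

In practice you would run an equivalent query against CloudTrail Lake or your SIEM rather than in-process Python, but the shape of the question is the same.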

When we tested agent workflows against production AWS accounts internally, the difference between running Claude Code raw versus through the MCP server was stark. Raw: the agent would occasionally propose resource names that violated our naming conventions, create security groups with 0.0.0.0/0, and leave orphaned resources when it changed direction mid-task. Through the toolkit with skills loaded: naming conventions were followed, security groups were locked down to specific CIDRs, and the agent's scope was bounded by IAM policy. The skills aren't magic. They're guardrails. But guardrails at the infrastructure layer work better than prompts at the application layer.

How Does This Stack Up Against Google and Microsoft?

AWS isn't alone in chasing the agent-native development market. The three cloud providers are taking meaningfully different approaches. Understanding the divergence matters more than comparing feature lists.

Google went the agent-to-agent route. The Agent-to-Agent Protocol (A2A), launched alongside MCP but serving a different need, is designed for agents talking to other agents — orchestrating work across models, systems, and teams. Vertex AI Agent Builder is the strongest of the three for multi-agent orchestration and ties into Google's TPU advantage for cost-efficient inference. Gemini 2.5 Flash runs at $0.15 per million input tokens, compared to $3 per million for Claude 4 Sonnet on Bedrock. But Google's governance model for agent actions relies more on application-layer controls than on IAM-level primitives. There's no equivalent to aws:ViaAWSMCPService in GCP IAM.

Microsoft's strategy is ecosystem integration. Azure AI Foundry provides exclusive access to OpenAI's models — GPT-4o, o3, o4-mini — and integrates agents with the Microsoft 365 fabric via Copilot and Semantic Kernel. Azure has the broadest compliance certification footprint (100+) and the deepest enterprise app integration story. But its agent governance model runs through Azure Policy and Azure RBAC, which controls what resources an identity can access, but doesn't natively distinguish between agent-originated and human-originated calls within the same identity.

AWS is making a specific bet: govern agents the same way you govern everything else in your AWS account. Use IAM. Use CloudTrail. Use CloudWatch. Don't build a parallel governance stack for AI because parallel stacks drift, and drifted governance is worse than no governance because it creates the illusion of control.

Which approach wins? It probably depends on where you're starting from. If your security team lives in IAM, AWS's model slots right in.

The available models tell part of the story, too. No single cloud offers both Claude and GPT natively. AWS has the widest model catalog (40+ models from 8 providers) but no access to OpenAI. Azure has exclusive OpenAI but no Anthropic. Google has both Claude 4 and Gemini but no GPT. Multi-cloud for model diversity isn't a nice-to-have anymore — it's the default if you want flexibility (Bits Lovers Cloud Computing, 2026).

What Should Engineering Leaders Actually Do?

If your developers use Claude Code, Cursor, or Codex today, AI agents are already making AWS API calls in your accounts. The question isn't whether to adopt agent tooling. It's whether those calls are governed.

Here's the pragmatic sequence that minimizes risk while letting teams move:

1. Start with read-only IAM

Give agents the MCP server with policies that let them search documentation, read resource descriptions, and list existing infrastructure, but create nothing. This lets your teams use coding assistants for architectural research and code generation based on current documentation, without risking mutation. Then add provisioning permissions incrementally, using the IAM context keys to enforce guardrails that humans don't have to follow.
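A starting point for that read-only posture might look like the following. This is a sketch, not an AWS-published policy; the action list is illustrative and should be tailored to the services your teams actually use:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:List*",
                "s3:GetBucket*",
                "ec2:Describe*",
                "cloudformation:Describe*",
                "cloudformation:List*"
            ],
            "Resource": "*"
        }
    ]
}
```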

2. Deploy skills alongside permissions

The value of the toolkit isn't just the API access — it's that skills steer coding assistants toward correct patterns. An assistant with call_aws but no skills is just an LLM with credentials. An assistant with call_aws plus the CloudFormation skill is materially more reliable. Skills load on demand, so they don't consume tokens when unused.

3. Monitor before scaling

Watch the AWS-MCP CloudWatch namespace to see what coding assistants actually do: which services they call most, how often they retry failed operations, and whether they create resources and then abandon them. The patterns will tell you where skills need improvement and where IAM policies need tightening.

One pattern that surprised us: agents are far better at creating infrastructure than at cleaning it up. They'll happily provision a full test environment to validate a configuration idea and never destroy it. Budget alerts tied to agent-created resources, identified through CloudTrail's aws:ViaAWSMCPService context, caught significant waste within the first week of letting agents provision freely.
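One way to catch that create-and-abandon pattern is to pair create events with their corresponding delete events and flag the unmatched ones. Again this is sketched over simplified CloudTrail-style events; the field names and the create/delete mapping are illustrative assumptions:

```python
# Illustrative sketch: find resources agents created but never deleted,
# from simplified CloudTrail-style events. Field names and the
# create-to-delete mapping are assumptions for illustration.

CREATE_TO_DELETE = {
    "RunInstances": "TerminateInstances",
    "CreateBucket": "DeleteBucket",
}

def orphaned_agent_resources(events):
    """Return IDs of agent-created resources with no matching delete."""
    created, deleted = {}, set()
    for e in events:
        if not e.get("viaAWSMCPService"):
            continue  # only consider agent-originated calls
        name = e["eventName"]
        if name in CREATE_TO_DELETE:
            created[e["resourceId"]] = name
        elif name in CREATE_TO_DELETE.values():
            deleted.add(e["resourceId"])
    return [rid for rid in created if rid not in deleted]
```

Feeding the output into budget alerts (or a nightly cleanup review) is the cheap version of the waste detection described above.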

The organizational dimension matters as much as the technical one. Why? Because the 60% of enterprises lacking formal AI governance frameworks won't stay that way. AI governance has overtaken cybersecurity as an emerging board-level priority (Mayfield CXO Survey, January 2026). Engineering leaders who build the governance story now ("here's how we give agents access to cloud resources, here's how we audit it, here's why it's safe") will have a much easier time getting budget and buy-in than those who wait for a security incident to force the conversation.

The Bigger Picture: Cloud Providers Are Becoming Agent Platforms

This launch isn't really about a dev tool. It's the opening move in a market-level shift that will determine cloud market share for the next decade.

In 2025, the cloud AI market was about model catalogs. Bedrock versus Vertex versus Foundry. Who had more models, who had cheaper inference, who had better fine-tuning? That competition isn't over, but it's becoming table stakes. Model capability has largely converged. The top five frontier models perform comparably on most enterprise tasks. Model selection is now a tie-breaker rather than the primary decision axis for platform choice.

In 2026 and 2027, the competition shifts to a different question: how safely and efficiently can AI agents build on your cloud? The tools, the governance, the audit trail, the skill packages: these become the differentiators because they determine whether enterprises deploy agents beyond the experimental sandbox and into production infrastructure.

Gartner projects that 40% of enterprise applications will include task-specific AI agents by the end of 2026, with MCP serving as the dominant integration layer between agents and tools (Gartner, August 2025). If that projection is even directionally right, then within 18 months, nearly half of new enterprise software will have agents that need cloud interfaces. The agent toolkit layer (the MCP server, the skills, the IAM integration) becomes infrastructure as fundamental as the API gateway or the load balancer.

For AWS specifically, there's a defensive motivation worth noting. The most popular AI coding assistants today (Claude Code, Cursor, Codex) are all products of AWS competitors or partners. If those assistants default to a Google or Microsoft cloud interface because it's easier to provision, the agent layer becomes a vector for cloud migration. The Agent Toolkit makes AWS the path of least resistance regardless of which assistant a developer uses. It's platform defense disguised as developer experience.

The next trillion-dollar infrastructure question might be: if agents write most of the infrastructure-as-code by 2029, does the agent-tooling layer become the actual control plane? And if so, does the cloud provider that ships the most governable agent interface absorb workload from the ones that don't?

The answers aren't settled. But AWS just placed its bet.

Frequently Asked Questions

Is the AWS MCP Server free?

Yes. AWS charges nothing for the MCP server itself. You pay only for the AWS resources your agents create or consume (AWS, 2026). Documentation search and retrieval don't even require authentication, so there's zero cost to using the server purely for research.

Do I need to change my existing IAM setup?

No. The MCP server works with your existing IAM users, roles, and policies. The new context keys are optional — you can add them incrementally to policies where you want agent-specific rules. If you don't use them, agent requests look like normal IAM-authenticated calls.

Which AI coding assistants work with this?

Claude Code, Kiro, and Codex support first-party plugins. Cursor and any other MCP-compatible client can connect to the AWS MCP Server directly through its MCP endpoint (AWS MCP Server GA Announcement, 2026).

How is this different from the old AWS Labs MCP servers?

The AWS Labs servers are community projects that accept contributions but lack enterprise governance features. The Agent Toolkit adds the IAM context keys, CloudWatch metrics, CloudTrail audit integration, sandboxed code execution, and professionally validated skills maintained by AWS service teams. AWS Labs projects will continue working alongside the new toolkit.

Conclusion

The Agent Toolkit for AWS matters for two reasons, and only one of them is about tooling. The tooling is good: managed MCP, validated skills, fewer hallucinations, lower token costs. But the strategic signal matters more. AWS is treating the agent interface as a first-class infrastructure layer, with the same governance primitives as compute, storage, and networking.

Engineering leaders don't need to rush to deploy the Agent Toolkit tomorrow. But they do need to recognize that their developers are already routing agent traffic through AWS (with or without governance) and that the tools to control that traffic now exist. The worst strategy is doing nothing and discovering, six months from now, through a CloudTrail audit, that agents have been provisioning ungoverned resources the whole time.

Start small. Read-only MCP access. One team. One skill set. Watch what happens. Build the IAM policies. Then scale, knowing you've got audit trails and guardrails in place.

