The answer, of course, is that it depends. But the Cloud Native Computing Foundation (CNCF) announced some major steps forward toward this goal at the KubeCon Europe conference last week in Amsterdam.
This was the biggest KubeCon ever, with about 13,500 attendees, an 8% increase over last year, reflecting CNCF’s extraordinary success in establishing Kubernetes as the standard for container orchestration.
During the conference keynote, CNCF announced that NVIDIA has joined CNCF as a Platinum Member, contributed software to key AI-related open source projects, and committed $4M in funding for AI workload testing and certification.
CNCF categorizes AI workloads by whether they train large language models (LLMs), use LLMs (i.e., inference processing), or run AI agents.
CNCF expects the majority of inference workloads to benefit from Kubernetes hosting, much as cloud native container workloads did 10 years ago.
Agentic workload standardization is now being undertaken by the Agentic AI Foundation, another Linux Foundation sub-foundation.
Kubernetes for Inference Processing
“OpenAI and ChatGPT is probably one of the fastest growing services of all time,” said Chris Aniszczyk, CNCF CTO. “And they were able to scale that using Kubernetes for a lot of the inference-based workloads.”
OpenAI has, in fact, published two case studies on the CNCF website discussing its use of Kubernetes for key workloads.
“A lot of classic LLM training is done on customized bare metal, Slurm, and PyTorch,” Aniszczyk continued. “This is the classic HPC ecosystem. But a lot of people are using Kubernetes more and more for inference, which I think it’s extremely well suited for.”
To support inference processing standardization, Red Hat is contributing llm-d to CNCF. llm-d is the inference engine developed by Neural Magic, a company Red Hat acquired last year.
“AI model training was developed largely by data scientists building their own specialized infrastructure,” said Brian Stevens, Red Hat SVP and CTO of AI. “The in-production scaling and operation of inference, however, is now becoming a CIO problem, and the language CIOs speak is Kubernetes.”
“Standard Kubernetes orchestration wasn’t designed for the highly stateful and dynamic demands of LLM inference,” Stevens continued. “The llm-d project provides the architectural layer needed to treat LLMs like any other scalable microservice.”
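To make the statefulness point concrete, here is a minimal, purely illustrative sketch (not llm-d’s actual code; all names are invented) of why LLM inference routing differs from stateless microservice load balancing: replicas accumulate prefix (KV) caches, so a cache-aware router that sends a request to the replica already holding the longest matching prompt prefix avoids recomputing work that a round-robin balancer would throw away.

```python
# Illustrative sketch of cache-aware routing for LLM inference.
# Hypothetical names; real systems route on KV-cache state, load, and more.

def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading substring of two prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class CacheAwareRouter:
    def __init__(self, replicas):
        # Each replica remembers the prompts whose KV caches it still holds.
        self.caches = {r: [] for r in replicas}

    def route(self, prompt: str) -> str:
        # Prefer the replica with the longest cached prefix; break ties
        # toward the replica holding fewer cached prompts (lighter load).
        def score(replica):
            hits = self.caches[replica]
            best = max((shared_prefix_len(prompt, p) for p in hits), default=0)
            return (best, -len(hits))
        chosen = max(self.caches, key=score)
        self.caches[chosen].append(prompt)
        return chosen

router = CacheAwareRouter(["pod-a", "pod-b"])
first = router.route("System: you are helpful. User: hi")
# A follow-up sharing the same long prefix lands on the same pod,
# reusing its KV cache instead of recomputing it on another replica.
second = router.route("System: you are helpful. User: hi again")
```

A stateless load balancer would happily split these two requests across pods; routing on cache affinity is the kind of inference-specific scheduling logic the quote is alluding to.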
Open Source Will Solve the Hardest AI Problems
During her keynote presentation, Erin Boyd, Sr. Director at NVIDIA, confirmed NVIDIA’s support for the goal of establishing Kubernetes as the standard platform for AI applications.
“The future of AI is community driven and open,” she said. To that end, NVIDIA is donating the NVIDIA GPU driver, the KAI Scheduler, the AI Cluster Runtime (AICR), and Dynamo to CNCF and pledging $4M in GPU hardware for development and testing.
Boyd said NVIDIA has verified its configurations against the Kubernetes AI conformance test suite for inference, announced at the November 2025 KubeCon. She positioned that suite as a key driver of the AI standardization process, just as the Kubernetes conformance program was for establishing the Kubernetes standard.
Boyd noted that over the past ten years, Kubernetes evolved from simple container orchestration into the de facto programmable control plane for modern distributed infrastructure.
She sees AI workloads on GPUs following a similar path to standardization. “Because the hardest problems ahead are not just model problems, they’re infrastructure problems, scaling problems, interoperability problems, trust and transparency problems, and no single company can solve those alone.”
The open source community is “what made Kubernetes the foundation of modern infrastructure,” she added, “And it’s what will make AI the foundation of the next generation of compute.”
The Inference “Gold Rush”
During the KubeCon press conference, Jonathan Bryce, Executive Director of Cloud and Infrastructure, Linux Foundation, spoke about the “inference gold rush” underway.

Slide for press conference by CNCF
A CipherTalk report from February projects almost 20% year-over-year growth in the inference market, resulting in a total market of $225B by 2030, up from $106B in 2025.
Perhaps even more significantly, the report predicts that inference will represent 67% of all AI compute in 2026, up from 23% in 2023. As a result, valuations of inference companies are skyrocketing, with Baseten at $55B and Fireworks at $4B.
While 66% of gen AI workloads currently run on Kubernetes, he said, the Foundation’s market analysis estimates that the global AI economy could save $20–$48 billion per year by switching to open models.
In other words, the report says, without open models, consumers would spend between $350 million and $1.23 billion more than they currently do on LLM inference. The analysis provides a clear financial incentive for investing in open source for AI inference processing.
“Standards help organizations get the most out of their AI data,” Bryce added.
The Future of Agentic Workloads
CNCF identifies the third type of AI workload as agentic. To standardize AI agents, the Linux Foundation launched the Agentic AI Foundation (AAIF) in December with founding contributions from Anthropic (Model Context Protocol or MCP), Block (goose), and OpenAI (AGENTS.md).
“Agents are at a different layer of the stack,” said Aniszczyk. “They’re most likely going to be running on Kubernetes infrastructure, but how agents talk to each other, how they work with protocols such as MCP – we just consider that above the Kubernetes layer.”
The AAIF now has 170 members and sponsors the global MCP Dev Summit (upcoming April 2-3 in New York) for open source collaboration on future enhancements to MCP. Enhancement areas include identity, trust, privacy, observability, and security.
“The AAIF is also looking at things such as ecommerce, trying to figure out how to get agents to buy things,” Aniszczyk added. “All of this will most likely happen at the AAIF layer, but agents have to run on something and be operationalized, and that’s where CNCF comes in.”

Photo of exhibit floor by CNCF
Graph Context for Inference
Two database vendors offer products that store and serve graph data as context to improve the results of inference processing.
Neo4j is an established graph database vendor, and a pioneer in the space. But “AI has opened up a whole new set of use cases for graph technology,” said Stephen Chin, VP of DevRel at Neo4j.
For example, Neo4j stores the output of Microsoft’s GraphRAG solution to rapidly and accurately generate natural language summaries of entities and relationships found in the knowledge graph, Chin said.
And for another use case, Neo4j stores vectors in its databases. “Using a combination of vectors and graphs, we were able to show a dramatic improvement in the accuracy of the results,” Chin added. This knowledge graph approach overcomes performance and accuracy limitations of pure vector databases for inference processing, he said.
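The vectors-plus-graphs pattern Chin describes can be sketched in a few lines. This is a toy illustration with invented data, not Neo4j’s API: a vector search finds the semantically closest seed entities, then one hop of graph expansion pulls in related entities that pure similarity search would miss, and the combined set becomes the context handed to the LLM.

```python
# Illustrative hybrid retrieval: vector similarity plus graph expansion.
# Toy in-memory data; a real deployment would query a graph database.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy knowledge graph: node -> embedding, plus edges between related nodes.
embeddings = {
    "llm-d":      [0.9, 0.1, 0.0],
    "Kubernetes": [0.7, 0.3, 0.1],
    "Slurm":      [0.1, 0.9, 0.2],
}
edges = {
    "llm-d": ["Kubernetes"],   # e.g. llm-d RUNS_ON Kubernetes
    "Kubernetes": ["llm-d"],
    "Slurm": [],
}

def retrieve(query_vec, k=1):
    # Step 1: vector search finds the k closest seed entities.
    seeds = sorted(embeddings,
                   key=lambda n: cosine(query_vec, embeddings[n]),
                   reverse=True)[:k]
    # Step 2: graph expansion adds directly related entities, supplying
    # relationship context a pure vector lookup would not return.
    context = set(seeds)
    for s in seeds:
        context.update(edges[s])
    return context

ctx = retrieve([1.0, 0.0, 0.0])  # query vector closest to "llm-d"
```

The graph hop is what distinguishes this from a plain vector database lookup: related entities enter the context because of explicit relationships, not embedding proximity.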
SurrealDB is an open source, multi-model database that combines graph, relational, vector, time series, and key-value models in a single platform.
“Instead of using a specialist database, you can use one API and query across all different modes,” said Tobie Morgan Hitchcock, CEO and Founder. “And you can go back in time with the queries as well.”
Because the biggest challenge with gen AI is accuracy, you can “get the benefit of combining data models, reduce infrastructure cost, and dynamically generate knowledge graphs” to improve inference processing results, Hitchcock added.
SurrealDB is designed to accommodate constantly changing data and work at petabyte scale, Hitchcock said. “The knowledge graphs give you the ability to improve relevance,” he added.
Observability for AI
Observability vendors are adapting their products to AI, especially to agents, which require additional tracking and logging of the actions they take, the decisions they make, and the API calls they make (including chains of calls).
Sawmills, for example, offers a low-cost, high-scale telemetry solution for AI agents.
“The problem today is too much data,” said Ronit Belson, CEO and Co-Founder. “Think about tomorrow – and maybe we are already there – when AI agents are writing the code and there’s so much more telemetry data.”
And a second problem is the quality of the data, Belson added. “On the one hand, coding agents are going to generate much more telemetry data, and on the other hand, they don’t care about the quality of the data.”
“We look at production data and create a feedback loop that tells the coding agents how to do telemetry correctly so that the data quality is higher and more useful,” Belson added.
Chronosphere, now part of Palo Alto, is designed and built for Kubernetes observability. It is adapting, as Kubernetes is, to support inference and agentic AI workloads, said Martin Mao, Chronosphere co-founder and now SVP, GM of Observability at Palo Alto Networks.
Mao sees Chronosphere as complementary to Palo Alto’s platform. Training workloads are diminishing as the market shifts to inference and agentic workloads, he said. “In addition to observability tailored for those workloads, you also need an ID management system and security for agents. Palo Alto can now provide it all.”
Identity for Agents
Identity management is another area adapting to gen AI. Agents need identities the way APIs do, but they also need privileges and secrets.
Akeyless, for example, provides a one-stop shop for secrets management, said Shahar Inbar, Akeyless VP of Sales.
“We provide a secured identity to either a machine or to a human,” Inbar said. “And nowadays, machine identities in the organization are growing around a hundred times more quickly than human identities, especially because of agentic AI.”
Akeyless offers a solution to secure an AI agent’s connection between MCP and the database, Inbar added. The company also offers a Privileged Access Management (PAM) solution on top of it, which likewise supports agentic AI.
The Intellyx Take
The Linux Foundation is embarking on a significant effort to standardize gen AI through open source collaboration and reduce costs, focusing especially on the inference market through CNCF and the agentic market through AAIF.
Success in standardization always hinges on adoption, which is exactly what the conformance program drove for Kubernetes.
It will take time to discover whether the Linux Foundation will be as successful in establishing Kubernetes as the standard deployment environment for inference workloads and whether the AAIF can also succeed in standardizing the agentic workload level above Kubernetes.
But it seems like they are off to a good start. And they have the opportunity to leverage the extraordinary success of Kubernetes and the CNCF, as evidenced in no small part by the increasing attendance numbers at KubeCon.
Chronosphere is a former Intellyx customer. All images provided to press and analysts by CNCF.
