Agentic AI Infrastructure does not have a “One-Size-Fits-All” Solution

Dan McNamara, senior vice president and general manager of Compute and Enterprise AI at AMD, leads the company’s high-performance server and enterprise AI business across cloud, enterprise, high-performance computing (HPC), sovereign AI and partner ecosystems. McNamara recently authored a blog post entitled “Agentic AI Isn’t One Workload. It’s an End-to-End Workflow.”

In the blog, Dan argues that agentic AI requires a new infrastructure strategy: unlike traditional AI, it consists of multiple workloads with different compute requirements, which makes a one-size-fits-all CPU or GPU approach ineffective.

Dan's view is that agentic AI infrastructure should match the workflow, and that organisations need a mix of high-core and high-frequency CPUs alongside GPUs and networking, with each stage of the AI workflow optimised for the task it performs.

He notes that much of the AI infrastructure conversation starts with an AI model running on GPUs, but in practice infrastructure demands are increasingly determined by the workflow around the model. Because agentic AI executes complex, multi-step workflows that require different types of compute capacity and speed, he argues that organisations need a mix of CPUs, GPUs and networking solutions, rather than a one-size-fits-all approach, to optimise performance across every stage.

AMD: meeting current and predicted demands

The AMD EPYC portfolio has been designed around a mix of profiles. The passage below is quoted in full from the blog because it shows both the variety of AI tasks and use cases that many organisations are running across departments, and how AMD has designed a number of solutions, some due for release later this year, to fit current and near-future demands:

  • Agentic orchestration, sandbox execution, tool calls: When you need many agents simultaneously running sandbox code (e.g., Python), calling APIs or querying databases, core density can matter more than clock speed. Our 5th Gen AMD EPYC™ server CPUs offer up to 192 cores and 384 threads with simultaneous multithreading. Later this year, our next-generation EPYC processors, codenamed “Venice,” will push that to 256 cores and 512 threads.
  • Tool execution on enterprise applications: The ability to call the tools or enterprise applications makes agents useful. CPUs with a broad set of core counts combined with high performance handle the volume and variety of incoming requests. The AMD EPYC™ 9005 family of processors delivers on this balance with 8 to 192 cores and up to 640GB/s of memory bandwidth, with “Venice” extending core/thread count by 1.3x and memory bandwidth by 2.5x.
  • Reasoning with inference: To provide the intelligence agents need to get work done, they rely on inference. Large language models predominantly run on GPUs with a host CPU keeping the GPUs fully utilized. To keep accelerators busy, host-node CPUs often benefit from strong per-core performance, high-frequencies and the right balance of cores (sometimes fewer are needed than you might think), memory bandwidth, I/O and networking. The correct mix in the host-node CPU can keep GPU clusters fed with instructions so each cluster delivers as many tokens as possible. The AMD EPYC™ 9575F processor delivers on this high single-core performance with 64 cores capable of running at up to 5GHz. “Venice” will further extend EPYC CPUs’ high-frequency offerings.
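
To make the stage-to-profile idea above concrete, here is a minimal Python sketch of how a planning team might record which compute profile each workflow stage calls for. It is purely illustrative and is not taken from the AMD blog: the stage keys, the class and the function names are hypothetical, and only the product names, core counts and uplift figures come from the passage quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeProfile:
    name: str        # product named in the quoted passage
    priority: str    # what the passage says matters most for this stage
    max_cores: int   # top core count quoted for that product


# Stage keys are hypothetical labels; the figures come from the passage above.
PROFILES = {
    "orchestration": ComputeProfile(
        "5th Gen AMD EPYC", "core density", 192),
    "tool_execution": ComputeProfile(
        "AMD EPYC 9005 family", "balance of cores and bandwidth", 192),
    "inference_host": ComputeProfile(
        "AMD EPYC 9575F", "per-core frequency", 64),
}


def profile_for(stage: str) -> ComputeProfile:
    """Return the compute profile for a workflow stage, or raise if unmapped."""
    try:
        return PROFILES[stage]
    except KeyError:
        raise ValueError(f"no compute profile mapped for stage {stage!r}") from None


def projected_bandwidth_gbs(current_gbs: float, uplift: float) -> float:
    """Apply a quoted uplift factor to a current memory bandwidth figure."""
    return current_gbs * uplift


if __name__ == "__main__":
    for stage in PROFILES:
        p = profile_for(stage)
        print(f"{stage}: {p.name} ({p.priority}, up to {p.max_cores} cores)")
    # 640 GB/s with the quoted 2.5x uplift works out to 1600 GB/s.
    print(projected_bandwidth_gbs(640, 2.5))
    # 256 cores over 192 is about 1.33x, consistent with the quoted 1.3x.
    print(round(256 / 192, 2))
```

A real capacity plan would also weigh cost, power and GPU counts; the point of the sketch is only that each stage is looked up separately rather than assigned one standard configuration.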

In summary, many organisations still standardise on legacy CPU configurations, whereas AMD argues that agentic AI requires a mix of high-core and high-frequency processors tailored to different workloads.

Dan sums it up best:

“The question worth asking isn’t how many CPUs or GPUs your business needs for agentic AI. It’s whether you’re matching infrastructure to the way agentic AI works with its many stages across workloads. If you map those stages early and choose the right compute profile for each, your business will be well positioned for speed and efficiency as they scale.”

IT leaders do indeed need to rethink legacy infrastructure planning. As agentic AI adoption grows, CIOs should, and perhaps must, move beyond standardised CPU configurations and prepare enterprise infrastructure for significantly higher AI-driven workloads and automation. Hopefully the Chief Artificial Intelligence Officer has a clear idea of the projects each department is undertaking, how they will scale and the capacity required, and budgets are in place.

Trish Stevens Head of Content
Trish is the Head of Content for In the Channel Media Group. trish@newsinthechannel.com
