Jump to content
  • Sign Up
×
×
  • Create New...

AMD talks 1.2 million GPU AI supercomputer to compete with Nvidia — 30X more GPUs than world’s fastest supercomputer


Recommended Posts

  • Diamond Member



AMD talks 1.2 million GPU AI supercomputer to compete with Nvidia — 30X more GPUs than world’s fastest supercomputer

Demand for more computing power in the data center is growing at a staggering pace, and AMD has revealed that it has had serious inquiries to build single AI clusters packing a whopping 1.2 million GPUs or more.

AMD’s admission comes from a lengthy discussion

This is the hidden content, please
had with Forrest Norrod, AMD’s EVP and GM of the Datacenter Solutions Group, about the future of AMD in the data center. One of the most eye-opening responses was about the biggest AI training cluster that someone is seriously considering. 

When asked if the company has fielded inquiries for clusters as large as 1.2 million GPUs, Forrest replied that the assessment was virtually spot on.

Morgan: What’s the biggest AI training cluster that somebody is serious about – you don’t have to name names. Has somebody come to you and said with MI500, I need 1.2 million GPUs or whatever.

Forrest Norrod: It’s in that range? Yes.

Morgan: You can’t just say “it’s in that range.” What’s the biggest actual number?

Forrest Norrod: I am ***** serious, it is in that range.

Morgan: For one machine.

Forrest Norrod: Yes, I’m talking about one machine.

Morgan: It boggles the mind a little bit, you know?

1.2 million GPUs is an absurd number (mind-boggling, as Forest quips later in the interview). AI-training clusters are often built with a few thousand GPUs connected via a high-speed interconnect across several server racks or less. By contrast, creating an AI cluster with 1.2 million GPUs seems virtually impossible.

We can only imagine the pitfalls someone will need to overcome to try and build an AI cluster with over a million GPUs, but latency, power, and the inevitability of hardware failures are a few factors that immediately come to mind. 

AI workloads are extremely sensitive to latency, particularly tail latency and outliers, wherein certain data transfers take much longer than others and disrupt the workload. Additionally, today’s supercomputers have to mitigate the GPU or other hardware failures that, at their scale, occur every few hours. Those issues would become far more pronounced when scaling to 30X the size of today’s largest known clusters. And that’s before we even touch on the nuclear power plant-sized power delivery required for such an audacious goal.

Even the most powerful supercomputers in the world don’t scale to millions of GPUs. For instance, the fastest operational supercomputer right now, Frontier, “only” has 37,888 GPUs.

The goal of million-GPU clusters speaks to the seriousness of the AI race that is molding the 2020s. If it is in the realm of possibility, someone will try to do it if it means greater AI processing power. Forest didn’t say which organization is considering building a system of this scale but did mention that “very sober people” are contemplating spending tens to hundreds of billions of dollars on AI training clusters (which is why millions of GPU clusters are being considered at all).





This is the hidden content, please

#AMD #talks #million #GPU #supercomputer #compete #Nvidia #30X #GPUs #worlds #fastest #supercomputer

This is the hidden content, please

For verified travel tips and real support, visit: https://hopzone.eu/

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Unfortunately, your content contains terms that we do not allow. Please edit your content to remove the highlighted words below.
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Vote for the server

    To vote for this server you must login.

    Jim Carrey Flirting GIF

  • Recently Browsing   0 members

    • No registered users viewing this page.

Important Information

Privacy Notice: We utilize cookies to optimize your browsing experience and analyze website traffic. By consenting, you acknowledge and agree to our Cookie Policy, ensuring your privacy preferences are respected.