
Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs



Etched, a startup that builds transformer-focused chips, has announced Sohu, an application-specific integrated circuit (ASIC) that the company claims beats Nvidia’s H100 at LLM inference. A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data centers could save on both initial and operational costs if Sohu meets expectations.
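
As a quick sanity check, the "20x" in the headline follows directly from the server comparison above. The sketch below only restates Etched's claimed figures (8 Sohu chips per server, 160 H100-equivalents per server); the function names are made up for illustration, and none of this is an independent benchmark.

```python
# Figures as claimed by Etched in the article, not measured results.
SOHU_CHIPS_PER_SERVER = 8
H100_EQUIVALENT_PER_SERVER = 160


def per_chip_speedup(h100_equivalent: int, sohu_chips: int) -> float:
    """Claimed number of H100 GPUs that one Sohu chip replaces."""
    return h100_equivalent / sohu_chips


def sohu_servers_needed(h100_count: int) -> int:
    """Hypothetical helper: 8xSohu servers needed to match a given
    H100 fleet, rounding up and assuming the claimed ratio holds."""
    return -(-h100_count // H100_EQUIVALENT_PER_SERVER)  # ceiling division


if __name__ == "__main__":
    print(per_chip_speedup(H100_EQUIVALENT_PER_SERVER, SOHU_CHIPS_PER_SERVER))  # 20.0
    print(sohu_servers_needed(1000))  # 7
```

In other words, if the claim holds, a fleet of 1,000 H100 GPUs would correspond to seven 8xSohu servers. Whether real workloads see that ratio is exactly what remains to be shown.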


According to the company, current AI accelerators, whether CPUs or GPUs, are designed to work with many different AI architectures. That means the hardware must be able to support a variety of models, such as convolutional neural networks, long short-term memory networks, state space models, and so on. Because these models are built on different architectures, most current AI chips devote a large portion of their silicon to programmability rather than raw compute.

Most large language models (LLMs) use matrix multiplication for the majority of their compute tasks, and Etched estimates that Nvidia’s H100 GPUs use only 3.3% of their transistors for this key task. This means that the remaining 96.7% of the silicon is used for other tasks, which are still essential for general-purpose AI chips.

However, the transformer architecture has become very popular of late. For example, ChatGPT, arguably the most popular LLM today, is based on a transformer model. In fact, it’s in the name: GPT stands for generative pre-trained transformer. Other models like Sora, Gemini, Stable Diffusion, and DALL-E are also based on transformer models.

Etched made a huge bet on transformers a couple of years ago when it started the Sohu project. The chip bakes the transformer architecture into the hardware, allowing it to allocate more transistors to AI compute. We can liken this to processors and graphics cards: let’s say current AI chips are CPUs, which can do many different things, and the transformer model is like the graphics demands of a game title. Sure, the CPU can still process these graphics demands, but it won’t do it as fast or as efficiently as a GPU. A GPU that specializes in processing visuals makes graphics rendering faster and more efficient because its hardware is specifically designed for that task.

This is what Etched did with Sohu. Instead of making a chip that can accommodate every single AI architecture, it built one that only works with transformer models. When the company started the project in 2022, ChatGPT didn’t even exist. ChatGPT then exploded in popularity in 2023, and the company’s gamble now looks like it is about to pay off in a big way.

Nvidia is currently one of the most valuable companies in the world, posting record revenues ever since the demand for AI GPUs surged. It shipped 3.76M data center GPUs in 2023, and that figure is on track to grow this year. But Sohu’s launch could threaten Nvidia’s leadership in the AI space, especially if companies that exclusively use transformer models move to Sohu. After all, efficiency is the key to winning the AI race, and anyone who can run these models on the fastest, most affordable hardware will take the lead.

Ever since AI data centers started popping up left and right, many experts have raised concerns about the power consumption crisis this power-hungry infrastructure could cause. Meta founder Mark Zuckerberg says electricity supply will constrain AI growth, and even the U.S. government has stepped in to discuss AI power demands. All the GPUs sold last year consume more power than 1.3 million homes, but if Etched’s approach to AI computing with Sohu takes off, we can perhaps reduce AI power demands to more manageable levels, allowing the electricity grid to catch up as our computing needs grow more sustainably.




