Sohu AI chip claimed to run models 20x faster and cheaper than Nvidia H100 GPUs

Pelican Press · June 26, 2024


Etched, a startup that builds transformer-focused chips, has announced Sohu, an application-specific integrated circuit (ASIC) that it claims beats Nvidia’s H100 at AI LLM inference. A single 8xSohu server is said to equal the performance of 160 H100 GPUs, meaning data centers could save on both initial and operational costs if Sohu meets expectations.
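The 20x figure in the headline follows from simple arithmetic on Etched’s claim. A minimal sketch, using only the two numbers quoted above (the constant and function names are illustrative, and the figures themselves are the company’s claims, not measurements):

```python
# Figures as claimed by Etched; none are independently verified.
SOHU_CHIPS_PER_SERVER = 8      # one "8xSohu" server
EQUIVALENT_H100_GPUS = 160     # H100 GPUs that server is said to match


def h100_equivalents_per_chip(h100_gpus: int, sohu_chips: int) -> float:
    """Return how many H100 GPUs one Sohu chip would replace under the claim."""
    return h100_gpus / sohu_chips


ratio = h100_equivalents_per_chip(EQUIVALENT_H100_GPUS, SOHU_CHIPS_PER_SERVER)
print(f"Claimed per-chip ratio: {ratio:.0f}x")
```

The ratio only compares chip counts. It says nothing about price or power per chip, which would also be needed to estimate actual cost savings.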


According to the company, current AI accelerators, whether CPUs or GPUs, are designed to work with different AI architectures. These differing frameworks and designs mean hardware must be able to support various models, like convolutional neural networks, long short-term memory networks, state space models, and so on. Because these models are tuned to different architectures, most current AI chips allocate a large portion of their computing power to programmability.

Most large language models (LLMs) use matrix multiplication for the majority of their compute tasks, and Etched estimates that Nvidia’s H100 GPUs use only 3.3% of their transistors for this key task. The remaining 96.7% of the silicon is used for other tasks, which are still essential for general-purpose AI chips.
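To give a feel for how heavily a transformer leans on matrix multiplication, here is a rough back-of-the-envelope sketch. It is not from Etched: it assumes a standard transformer layer with a 4x feed-forward expansion, and the per-element costs assigned to softmax, layer normalization, and the activation function are illustrative guesses.

```python
def transformer_layer_flops(seq_len: int, d_model: int, n_heads: int) -> dict:
    """Rough forward-pass FLOP estimate for one standard transformer layer.

    Assumes a 4x feed-forward expansion and counts a multiply-add as 2 FLOPs.
    The per-element costs for softmax, layer norm, and the activation are
    illustrative guesses, not measured values.
    """
    n, d = seq_len, d_model
    matmul = (
        3 * 2 * n * d * d          # query, key, and value projections
        + 2 * n * n * d            # attention scores (Q @ K^T), all heads
        + 2 * n * n * d            # attention-weighted sum of values
        + 2 * n * d * d            # attention output projection
        + 2 * 2 * n * d * 4 * d    # two feed-forward matmuls (d -> 4d -> d)
    )
    other = (
        5 * n_heads * n * n        # softmax over each head's score matrix
        + 2 * 8 * n * d            # two layer norms
        + 8 * n * 4 * d            # feed-forward activation
        + 2 * n * d                # two residual additions
    )
    return {
        "matmul": matmul,
        "other": other,
        "matmul_share": matmul / (matmul + other),
    }


# Hypothetical model dimensions chosen only for illustration.
est = transformer_layer_flops(seq_len=2048, d_model=4096, n_heads=32)
print(f"Matrix multiplication share of FLOPs: {est['matmul_share']:.2%}")
```

Note that this estimates the share of arithmetic operations, which is a different quantity from the share of transistors in Etched’s 3.3% figure. The sketch only illustrates why a chip devoted to this one workload might spend more of its silicon on matrix math.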

However, the transformer architecture has become very popular of late. For example, ChatGPT, arguably the most popular LLM today, is based on a transformer model. In fact, it’s in the name: Chat generative pre-trained transformer (GPT). Competing models like Sora, Gemini, Stable Diffusion, and DALL-E are also based on transformer models.


Etched made a huge bet on transformers a couple of years ago when it started the Sohu project. The chip bakes the transformer architecture into the hardware, allowing it to allocate more transistors to AI compute. We can liken this to processors and graphics cards: let’s say current AI chips are CPUs, which can do many different things, and the transformer model is like the graphics demands of a game title. Sure, the CPU can still process these graphics demands, but it won’t do it as fast or as efficiently as a GPU. A GPU that specializes in processing visuals makes graphics rendering faster and more efficient because its hardware is specifically designed for that.

This is what Etched did with Sohu. Instead of making a chip that can accommodate every single AI architecture, it built one that only works with transformer models. When the company started the project in 2022, ChatGPT didn’t even exist. But ChatGPT then exploded in popularity in 2023, and the company’s gamble now looks like it is about to pay off in a big way.

Nvidia is currently one of the most valuable companies in the world, posting record revenues ever since the demand for AI GPUs surged. It shipped 3.76 million data center GPUs in 2023, and that number is on track to grow this year. But Sohu’s launch could threaten Nvidia’s leadership in the AI space, especially if companies that exclusively use transformer models move to Sohu. After all, efficiency is the key to winning the AI race, and anyone who can run these models on the fastest, most affordable hardware will take the lead.

Ever since AI data centers started popping up left and right, many experts have raised concerns over the power consumption crisis this power-hungry infrastructure could cause. Meta founder Mark Zuckerberg says electricity supply will constrain AI growth, and even the U.S. government has stepped in to discuss AI power demands. All the GPUs sold last year consume more power than 1.3 million homes, but if Etched’s approach to AI computing with Sohu takes off, we can perhaps reduce AI power demands to more manageable levels, allowing the electricity grid to catch up as our computing needs grow more sustainably.
