[AI]Cerebras vs Nvidia: New inference tool promises higher performance

ChatGPT · August 29

AI hardware startup Cerebras has created a new AI inference solution that could potentially rival Nvidia’s GPU offerings for enterprises.

The Cerebras Inference tool is based on the company’s Wafer-Scale Engine and promises to deliver staggering performance. According to sources, the tool has achieved speeds of 1,800 tokens per second for Llama 3.1 8B and 450 tokens per second for Llama 3.1 70B. Cerebras claims that these speeds are not only faster than typical hyperscale cloud offerings running on Nvidia’s GPUs, but also more cost-efficient.

This is a major shift that taps into the market, as Gartner analyst Arun Chandrasekaran put it. While this market’s focus had previously been on training, it is now shifting to the cost and speed of inferencing. This shift is driven by the growth of AI use cases within enterprise settings, and it gives vendors of AI products and services, such as Cerebras, an opportunity to compete on performance.

According to Micah Hill-Smith, co-founder and CEO of Artificial Analysis, Cerebras stood out in the AI inference benchmarks. The company’s measurements exceeded 1,800 output tokens per second on Llama 3.1 8B and 446 output tokens per second on Llama 3.1 70B, setting new records in both benchmarks.

Cerebras introduces AI inference tool with 20x speed at a fraction of GPU cost.

However, despite the potential performance advantages, Cerebras faces significant challenges in the enterprise market. Nvidia’s software and hardware stack dominates the industry and is widely adopted by enterprises. David Nicholson, an analyst at Futurum Group, points out that while Cerebras’ wafer-scale system can deliver high performance at a lower cost than Nvidia, the key question is whether enterprises are willing to adapt their engineering processes to work with Cerebras’ system.

The choice between Nvidia and alternatives such as Cerebras depends on several factors, including the scale of operations and available capital. Smaller firms are likely to choose Nvidia, since it offers already-established solutions, while larger businesses with more capital may opt for Cerebras to increase efficiency and save on costs.

As the AI hardware market continues to evolve, Cerebras will also face competition from specialised cloud providers, hyperscalers such as AWS, and dedicated inferencing providers such as Groq. The balance between performance, cost, and ease of implementation will likely shape enterprise decisions in adopting new inference technologies.

The emergence of high-speed AI inference, capable of exceeding 1,000 tokens per second, is comparable to the arrival of broadband internet and could open a new frontier for AI applications. Cerebras’ 16-bit accuracy and faster inference capabilities may enable future AI applications in which entire AI agents must operate rapidly, repeatedly, and in real time.
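To make the throughput figures concrete, the following minimal sketch converts tokens per second into generation time for a multi-step agent. It is illustrative arithmetic only: the step count and tokens per step are hypothetical values chosen for the example, and the calculation assumes a steady output rate while ignoring prompt processing, time to first token, and network latency.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to emit `output_tokens` at a steady output rate (idealised)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def agent_loop_seconds(steps: int, tokens_per_step: int, tokens_per_second: float) -> float:
    """Total generation time for an agent that runs `steps` sequential calls."""
    return steps * generation_seconds(tokens_per_step, tokens_per_second)


if __name__ == "__main__":
    # Hypothetical workload: 10 sequential agent steps, 500 output tokens each.
    steps, tokens_per_step = 10, 500
    # Throughput figures as reported in the article for the two Llama 3.1 models.
    for label, rate in [("Llama 3.1 8B", 1800.0), ("Llama 3.1 70B", 450.0)]:
        total = agent_loop_seconds(steps, tokens_per_step, rate)
        print(f"{label}: {total:.2f} s of generation at {rate:.0f} tokens/s")
```

Because the steps run one after another, total time scales linearly with the number of steps, which is why per-call output speed matters more for agent-style workloads than for a single response.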

With the growth of the AI field, the market for AI inference hardware is also expanding. Accounting for around 40% of the total AI hardware market, this segment is becoming an increasingly lucrative target within the broader AI hardware industry. Given that more prominent companies occupy the majority of this segment, newcomers should carefully weigh the competitive landscape and the significant resources required to navigate the enterprise space.
