
Mistral AI’s latest model, Mistral Large 2 (ML2), allegedly competes with large models from industry leaders like OpenAI, Meta, and Anthropic, despite being a fraction of their size.

The timing of this release is noteworthy, arriving the same week as Meta’s launch of its behemoth 405-billion-parameter Llama 3.1 model. Both ML2 and Llama 3.1 boast impressive capabilities, including a 128,000-token context window for enhanced “memory” and support for multiple languages.

Mistral AI has long differentiated itself through its focus on language diversity, and ML2 continues this tradition. The model supports “dozens” of languages and more than 80 coding languages, making it a versatile tool for developers and businesses worldwide.

According to Mistral’s benchmarks, ML2 performs competitively against top-tier models like OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Meta’s Llama 3.1 405B across various language, coding, and mathematics tests.

In the widely recognised Massive Multitask Language Understanding (MMLU) benchmark, ML2 achieved a score of 84 percent. While slightly behind its competitors (GPT-4o at 88.7%, Claude 3.5 Sonnet at 88.3%, and Llama 3.1 405B at 88.6%), it’s worth noting that human domain experts are estimated to score around 89.8% on this test.


Efficiency: A key advantage

What sets ML2 apart is its ability to achieve high performance with significantly fewer resources than its rivals. At 123 billion parameters, ML2 is less than a third the size of Meta’s largest model and approximately one-fourteenth the size of GPT-4. This efficiency has major implications for deployment and commercial applications.

At full 16-bit precision, ML2 requires about 246GB of memory. While this is still too large for a single GPU, it can be easily deployed on a server with four to eight GPUs without resorting to quantisation – a feat not necessarily achievable with larger models like GPT-4 or Llama 3.1 405B.
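The 246GB figure follows directly from the parameter count, since 16-bit precision means two bytes per parameter. A minimal sketch of the arithmetic (weights only, in decimal GB; activations and KV cache add more on top):

```python
# Back-of-the-envelope weight memory for ML2 vs Llama 3.1 405B.
# (The article's 246GB figure is 123B parameters at 2 bytes each.)

def model_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Weight memory in decimal GB; ignores activations and KV cache."""
    return num_params * bytes_per_param / 1e9

ML2_PARAMS = 123e9
LLAMA_PARAMS = 405e9

for bits in (16, 8, 4):
    print(f"ML2 at {bits}-bit: ~{model_memory_gb(ML2_PARAMS, bits / 8):.0f} GB")
print(f"Llama 3.1 405B at 16-bit: ~{model_memory_gb(LLAMA_PARAMS, 2):.0f} GB")
# At 16-bit, ML2 needs ~246 GB versus ~810 GB for the 405B model,
# which is why ML2 fits on a 4-8 GPU server without quantisation.
```

The same helper shows why quantising to 8-bit or 4-bit is the usual escape hatch for models that don’t fit at full precision.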

Mistral emphasises that ML2’s smaller footprint translates to higher throughput, as LLM performance is largely dictated by memory bandwidth. In practical terms, this means ML2 can generate responses faster than larger models on the same hardware.
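The bandwidth argument can be made concrete with a simple roofline-style estimate: generating one token requires streaming essentially all the weights through memory once, so tokens per second are bounded by aggregate bandwidth divided by weight size. This is a hedged sketch; the bandwidth constant and the assumption of ideal 8-GPU scaling are illustrative, not measured figures:

```python
# Crude roofline bound for decode throughput: producing one token streams
# every weight through memory once, so tokens/s <= bandwidth / weight bytes.

def decode_tokens_per_sec(weight_bytes: float, bandwidth_bytes_per_sec: float) -> float:
    """Upper bound on single-stream decode speed for a memory-bound model."""
    return bandwidth_bytes_per_sec / weight_bytes

GPU_BW = 3.35e12          # bytes/s per accelerator: an assumed, illustrative figure
ML2_BYTES = 123e9 * 2     # 123B params at 16-bit
LLAMA_BYTES = 405e9 * 2   # 405B params at 16-bit

# Ideal scaling across 8 GPUs, purely illustrative:
print(f"ML2 bound:  ~{decode_tokens_per_sec(ML2_BYTES, 8 * GPU_BW):.0f} tok/s")
print(f"405B bound: ~{decode_tokens_per_sec(LLAMA_BYTES, 8 * GPU_BW):.0f} tok/s")
# On identical hardware, the smaller model's bound is ~3.3x higher (810/246).
```

Real deployments fall short of this bound (batching, interconnect, and compute all intrude), but the ratio between the two models is the point: a smaller footprint buys throughput on the same hardware.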

Addressing key challenges

Mistral has prioritised combating hallucinations – a common issue where AI models generate convincing but inaccurate information. The company claims ML2 has been fine-tuned to be more “cautious and discerning” in its responses and better at recognising when it lacks sufficient information to answer a query.

Additionally, ML2 is designed to excel at following complex instructions, especially in longer conversations. This improvement in prompt-following capabilities could make the model more versatile and user-friendly across various applications.

In a nod to practical business concerns, Mistral has optimised ML2 to generate concise responses where appropriate. While verbose outputs can lead to higher benchmark scores, they often result in increased compute time and operational costs – a consideration that could make ML2 more attractive for commercial use.
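The cost argument is plain token arithmetic, since output cost and generation time both scale linearly with tokens produced. A hypothetical sketch (the per-token price and token counts below are made-up illustrative values, not Mistral’s actual pricing):

```python
# Response length translates directly into generation time and spend:
# output cost scales linearly with the number of tokens produced.

def completion_cost(output_tokens: int, price_per_1m_tokens: float) -> float:
    """Dollar cost of one response at a given per-million-token output price."""
    return output_tokens / 1e6 * price_per_1m_tokens

PRICE = 9.0  # assumed $/1M output tokens, an illustrative figure only

verbose = completion_cost(600, PRICE)   # padded, benchmark-friendly answer
concise = completion_cost(150, PRICE)   # same information, tighter phrasing
print(f"verbose: ${verbose:.4f}  concise: ${concise:.4f}")
# Cutting output length 4x cuts both cost and generation time roughly 4x.
```

At single-query scale the difference is negligible, but multiplied across millions of production requests it becomes the kind of operational saving the article alludes to.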

Licensing and availability

While ML2 is freely available on popular repositories like Hugging Face, its licensing terms are more restrictive than those of some of Mistral’s previous models.

Unlike the open-source Apache 2 license used for the Mistral-NeMo-12B model, ML2 is released under the Mistral Research License. This allows for non-commercial and research use but requires a separate commercial license for business applications.

As the AI race heats up, Mistral’s ML2 represents a significant step forward in balancing power, efficiency, and practicality. Whether it can truly challenge the dominance of tech giants remains to be seen, but its release is certainly an exciting addition to the field of large language models.

