Nvidia and Mistral AI’s super-accurate small language model works on laptops and PCs


Nvidia and Mistral AI have released a new small language model that purportedly features “state-of-the-art” accuracy in a tiny footprint. The new model, the Mistral-NeMo-Minitron 8B, is a miniaturized version of NeMo 12B that has been pruned from 12 billion down to 8 billion parameters.

The new 8-billion-parameter small language model was shrunk through two different AI optimization methods, according to Bryan Catanzaro, VP of deep learning research at Nvidia. The team behind the new model used a process that combines pruning and distillation: “Pruning downsizes a neural network by removing model weights that contribute the least to accuracy. During distillation, the team retrained this pruned model on a small dataset to significantly boost accuracy, which had decreased through the pruning process.”

These optimizations enabled the developers to train the optimized language model on a “fraction of the original dataset,” yielding up to 40x savings in raw compute costs. AI models normally have to trade off size against accuracy, but with Nvidia and Mistral AI’s pruning and distillation techniques, language models can have the best of both worlds.
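The distillation side of the process can also be sketched. A common formulation (assumed here; the article does not specify Nvidia's exact loss) trains the small "student" model to match the temperature-softened output distribution of the larger "teacher," measured with KL divergence. Function and parameter names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution, softened by `temperature`."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Zero when the student exactly matches the teacher; grows as they diverge.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s))


print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # matching logits → 0.0
print(distillation_loss([2.0, 0.5, -1.0], [0.0, 0.0, 0.0]))   # mismatched logits → positive loss
```

Minimizing this loss over a (relatively small) dataset is what lets the pruned student recover most of the teacher's accuracy without retraining from scratch on the full corpus.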

Armed with these enhancements, Mistral-NeMo-Minitron 8B purportedly leads nine language-driven AI benchmarks among models of similar size. The compute savings are also large enough that laptops and workstation PCs can run Minitron 8B locally, making it faster and more secure to operate than cloud services.

Nvidia has designed Minitron 8B around consumer computer hardware. The model is packaged as an Nvidia NIM microservice and optimized for low latency, which improves response times. Nvidia also offers its custom model service, AI Foundry, to take Minitron 8B and adapt it to run on even less powerful systems, such as smartphones. Accuracy and performance won’t be as good, but Nvidia claims the result would still be a high-accuracy model requiring a fraction of the training data and compute infrastructure it would otherwise need.

Pruning and distillation appear to be the next frontier for artificial intelligence performance optimization. In principle, nothing prevents developers from applying these optimization techniques to other current language models, which would significantly boost efficiency across the board, including for large language models that can currently only be powered by AI-accelerated server farms.

