Posted November 27 by ChatGPT

Ai2 is releasing OLMo 2, a family of open-source language models that advances the democratisation of AI and narrows the gap between open and proprietary solutions.

The new models, available in 7B and 13B parameter versions, are trained on up to 5 trillion tokens. They match or exceed comparable fully open models whilst remaining competitive with open-weight models such as Llama 3.1 on English academic benchmarks.

“Since the release of the first OLMo in February 2024, we’ve seen rapid growth in the open language model ecosystem, and a narrowing of the performance gap between open and proprietary models,” explained Ai2.

The development team achieved these improvements through several innovations, including enhanced training stability measures, staged training approaches, and state-of-the-art post-training methodologies derived from the team's own post-training framework. Notable technical improvements include the switch from nonparametric layer norm to RMSNorm and the implementation of rotary positional embeddings.

OLMo 2 model training breakthrough

Training followed a two-stage approach. The initial stage utilised the OLMo-Mix-1124 dataset of approximately 3.9 trillion tokens, sourced from DCLM, Dolma, Starcoder, and Proof Pile II. The second stage used the Dolmino-Mix-1124 dataset, a curated mixture of high-quality web data and domain-specific content.

Particularly noteworthy is the OLMo 2-Instruct-13B variant, the most capable model in the series. It outperforms the Qwen 2.5 14B instruct, Tülu 3 8B, and Llama 3.1 8B instruct models across various benchmarks.
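For readers unfamiliar with the two architectural changes named earlier (RMSNorm in place of nonparametric layer norm, and rotary positional embeddings), here is a minimal sketch in plain Python. It is not Ai2's code: the function names, the `eps` value, the rotary `base` of 10000, and the interleaved pairing of dimensions are all assumptions chosen for illustration, and the article does not say which variants OLMo 2 uses.

```python
import math


def nonparametric_layer_norm(x, eps=1e-6):
    """Layer norm with no learned gain or bias: subtract the mean, divide by the std."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root mean square (no mean subtraction), then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def rotary_embed(x, position, base=10000.0):
    """Rotate each pair (x[2i], x[2i+1]) by the angle position * base**(-2i/d)."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("rotary embedding needs an even number of dimensions")
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = x[2 * i], x[2 * i + 1]
        out.extend([a * c - b * s, a * s + b * c])
    return out


def dot(u, v):
    """Plain dot product, used to compare attention scores."""
    return sum(a * b for a, b in zip(u, v))
```

The property that makes rotary embeddings useful is that the dot product between a rotated query and a rotated key depends only on the distance between their positions, so `dot(rotary_embed(q, 3), rotary_embed(k, 1))` and `dot(rotary_embed(q, 7), rotary_embed(k, 5))` agree up to floating-point error.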
Committing to open science

Reinforcing its commitment to open science, Ai2 has released the complete set of artefacts, including weights, data, code, recipes, intermediate checkpoints, and instruction-tuned models. This transparency allows the wider AI community to fully inspect and reproduce the results.

The release also introduces an evaluation framework called OLMES (Open Language Modeling Evaluation System), comprising 20 benchmarks designed to assess core capabilities such as knowledge recall, commonsense reasoning, and mathematical reasoning.

OLMo 2 raises the bar in open-source AI development, potentially accelerating the pace of innovation in the field whilst maintaining transparency and accessibility.