
Galileo, a leading developer of generative AI for enterprise applications, has released its latest Hallucination Index.

The evaluation framework, which focuses on Retrieval Augmented Generation (RAG), assessed 22 prominent Gen AI LLMs from major players including OpenAI and Anthropic. This year’s index expanded significantly, adding 11 new models to reflect the rapid growth in both open- and closed-source LLMs over the past eight months.

Vikram Chatterji, CEO and Co-founder of Galileo, said: “In today’s rapidly evolving AI landscape, developers and enterprises face a critical challenge: how to harness the power of generative AI while balancing cost, accuracy, and reliability. Current benchmarks are often based on academic use-cases, rather than real-world applications.”

The index employed Galileo’s proprietary evaluation metric, context adherence, to check for output inaccuracies across various input lengths, ranging from 1,000 to 100,000 tokens. This approach aims to help enterprises make informed decisions about balancing price and performance in their AI implementations.
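As a rough illustration of how scores gathered across input lengths might be summarized per model, here is a minimal sketch. It does not reproduce Galileo's context adherence metric, which is proprietary. The bucket boundaries, the `bucket_for` and `summarize` helpers, and the 0-to-1 score scale are all assumptions made for this example; the article only states that inputs ranged from 1,000 to 100,000 tokens.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical bucket upper bounds in tokens. The article gives only the
# overall 1,000 to 100,000 token range, not where the buckets are split.
BUCKETS = [("short", 5_000), ("medium", 25_000), ("long", 100_000)]


def bucket_for(tokens: int) -> str:
    """Return the name of the first bucket whose upper bound covers `tokens`."""
    for name, upper in BUCKETS:
        if tokens <= upper:
            return name
    raise ValueError(f"{tokens} tokens exceeds the largest bucket")


def summarize(results):
    """Average scores per (model, bucket).

    `results` is an iterable of (model, context_tokens, score) tuples,
    where score is an assumed adherence value between 0 and 1.
    """
    grouped = defaultdict(list)
    for model, tokens, score in results:
        grouped[(model, bucket_for(tokens))].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


if __name__ == "__main__":
    # Made-up scores for a placeholder model, purely for demonstration.
    sample = [
        ("model-a", 1_000, 1.0),
        ("model-a", 4_000, 0.5),
        ("model-a", 50_000, 0.8),
    ]
    for (model, bucket), avg in sorted(summarize(sample).items()):
        print(model, bucket, avg)
```

Grouping by context length in this way makes it easier to see whether a model that does well on short inputs holds up on long ones, which is the comparison the index reports.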

Key findings from the index include:

  • Anthropic’s Claude 3.5 Sonnet emerged as the best overall performing model, consistently scoring near-perfect across short, medium, and long context scenarios.
  • Google’s Gemini 1.5 Flash ranked as the best performing model in terms of cost-effectiveness, delivering strong performance across all tasks.
  • Alibaba’s Qwen2-72B-Instruct stood out as the top open-source model, particularly excelling in short and medium context scenarios.

The index also highlighted several trends in the LLM landscape:

  • Open-source models are rapidly closing the gap with their closed-source counterparts, offering improved hallucination performance at lower costs.
  • Current RAG LLMs demonstrate significant improvements in handling extended context lengths without sacrificing quality or accuracy.
  • Smaller models sometimes outperform larger ones, suggesting that efficient design can be more crucial than scale.
  • The emergence of strong performers from outside the US, such as models from Mistral and Alibaba’s Qwen2-72B-Instruct, indicates growing global competition in LLM development.

While closed-source models like Claude 3.5 Sonnet and Gemini 1.5 Flash maintain their lead due to proprietary training data, the index reveals that the landscape is evolving rapidly.

Google’s performance was particularly noteworthy, with its open-source Gemma-7b model performing poorly while its closed-source Gemini 1.5 Flash consistently ranked near the top.

As the AI industry continues to grapple with hallucinations as a major hurdle to production-ready Gen AI products, Galileo’s Hallucination Index provides valuable insights for enterprises looking to adopt the right model for their specific needs and budget constraints.

