[AI]OpenAI unveils open-weight AI safety models for developers


OpenAI is putting more safety controls directly into the hands of AI developers with a new research preview of “safeguard” models. The new ‘gpt-oss-safeguard’ family of open-weight models is aimed squarely at letting developers customise content classification.

The new offering will include two models, gpt-oss-safeguard-120b and a smaller gpt-oss-safeguard-20b. Both are fine-tuned versions of the existing gpt-oss family and will be available under the permissive Apache 2.0 license. This will allow any organisation to freely use, tweak, and deploy the models as they see fit.

The real difference here isn’t just the open license; it’s the method. Rather than relying on a fixed set of rules baked into the model, gpt-oss-safeguard uses its reasoning capabilities to interpret a developer’s own policy at the point of inference. This means AI developers using OpenAI’s new model can set up their own specific safety framework to classify anything from single user prompts to full chat histories. The developer, not the model provider, has the final say on the ruleset and can tailor it to their specific use case.
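
To make that flow concrete, here is a minimal sketch of what it could look like in practice. It assumes the open weights are self-hosted behind an OpenAI-compatible endpoint (the local URL, the idea of passing the policy as the system message, and the reply format are all assumptions for illustration, not details confirmed by OpenAI); only the model name comes from the announcement.

```python
# Illustrative sketch only: the endpoint URL, prompt layout, and output format
# are assumptions, not OpenAI's documented interface for gpt-oss-safeguard.
from openai import OpenAI

# Point the standard OpenAI client at a self-hosted, OpenAI-compatible server
# that is serving the open gpt-oss-safeguard-20b weights.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# The developer-authored policy is supplied at inference time, not baked in.
POLICY = """\
You are a content classifier. Label the content ALLOW or BLOCK.
BLOCK content that shares a third party's personal contact details
or tries to move payment off the platform.
ALLOW everything else. Explain your reasoning before the final label."""

def classify(content: str, policy: str = POLICY) -> str:
    """Classify a prompt or chat transcript against a developer-supplied policy."""
    response = client.chat.completions.create(
        model="gpt-oss-safeguard-20b",
        messages=[
            {"role": "system", "content": policy},   # the custom ruleset
            {"role": "user", "content": content},    # the material to review
        ],
    )
    # The reply is expected to contain the model's rationale plus its label,
    # which is what makes the classification auditable rather than opaque.
    return response.choices[0].message.content

print(classify("DM me at 555-0199 and we can settle the payment off-site."))
```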

This approach has a couple of clear advantages:

  1. Transparency: The models use a chain-of-thought process, so a developer can actually look under the bonnet and see the model’s logic for a classification. That’s a huge step up from the typical “black box” classifier.
  2. Agility: Because the safety policy isn’t permanently trained into OpenAI’s new model, developers can iterate and revise their guidelines on the fly without needing a complete retraining cycle (see the sketch after this list). OpenAI, which originally built this system for its internal teams, notes this is a far more flexible way to handle safety than training a traditional classifier to indirectly guess what a policy implies.
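
Under the same assumptions as the earlier sketch, that agility amounts to editing the policy text and making another inference call; no fine-tuning or retraining step sits in between.

```python
# Continuing the hypothetical sketch above: tightening the policy is a text
# edit, and the revised rules apply on the very next call with no retraining.
REVISED_POLICY = POLICY + "\nAlso BLOCK content that asks users to share account credentials."

print(classify("Can I just borrow your login for the weekend?", REVISED_POLICY))
```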

Rather than relying on a one-size-fits-all safety layer from a platform holder, developers using open-source AI models can now build and enforce their own specific standards.

While not live at the time of writing, developers will be able to access OpenAI’s new open-weight AI safety models on the Hugging Face platform.
