DeepSeek Fails Every Safety Test Thrown at It by Researchers



PCMag editors select and review products. If you buy through affiliate links, we may earn commissions, which help support our testing.

Chinese AI firm DeepSeek is making headlines with its low cost and high performance, but it may be lagging far behind its rivals when it comes to AI safety.

Researchers at Cisco managed to “jailbreak” the DeepSeek R1 model with a 100% attack success rate, using an automatic jailbreaking algorithm in conjunction with 50 prompts related to cybercrime, misinformation, illegal activities, and general harm. This means the new kid on the AI block failed to stop a single harmful prompt.
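As a rough illustration of the metric, the attack success rate is the share of harmful prompts that got past the model. The sketch below is not the researchers' tooling. The function name and the outcome list are hypothetical, and only the 50-prompt count and the 100% result come from the article.

```python
def attack_success_rate(outcomes):
    """Return the percentage of prompts for which the jailbreak succeeded.

    `outcomes` is a hypothetical list of booleans, one per prompt:
    True means the model answered the harmful prompt, False means it refused.
    """
    if not outcomes:
        raise ValueError("need at least one prompt outcome")
    return 100.0 * sum(outcomes) / len(outcomes)


# Assumed outcome log matching the article: all 50 prompts succeeded.
deepseek_r1 = [True] * 50
print(attack_success_rate(deepseek_r1))  # 100.0
```

A model that refused every prompt would score 0.0 on the same scale.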

“Jailbreaking” means using various techniques to remove the normal restrictions from a device or piece of software. Since Large Language Models (LLMs) gained mainstream prominence, researchers and enthusiasts have successfully made LLMs like OpenAI’s ChatGPT give advice that their safeguards are meant to withhold.

DeepSeek stacked up poorly compared to many of its competitors in this regard. OpenAI’s GPT-4o had a 14% success rate at blocking harmful jailbreak attempts, while Google’s Gemini 1.5 Pro sported a 35% success rate. Anthropic’s Claude 3.5 performed second best out of the entire test group, blocking 64% of the attacks, while the preview version of OpenAI’s o1 took the top spot, blocking 74% of attempts.
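The article mixes two scales: an attack success rate for DeepSeek and block rates for the other models. A minimal sketch, assuming the two are simple complements (every attempt is either blocked or successful), puts them on the same footing. The dictionary below uses only the percentages quoted above, with DeepSeek R1's block rate of 0 derived from its 100% attack success rate. The variable names are illustrative.

```python
# Block rates in percent, as quoted in the article.
# DeepSeek R1's 0 is derived from its reported 100% attack success rate.
block_rate = {
    "DeepSeek R1": 0,
    "GPT-4o": 14,
    "Gemini 1.5 Pro": 35,
    "Claude 3.5": 64,
    "o1 (preview)": 74,
}

# Assumption: each attempt is either blocked or successful, nothing in between.
attack_success = {model: 100 - blocked for model, blocked in block_rate.items()}

# Rank models from most to least resistant.
ranking = sorted(block_rate, key=block_rate.get, reverse=True)

for model in ranking:
    print(f"{model}: blocked {block_rate[model]}%, attack success {attack_success[model]}%")
```

On this scale, GPT-4o's 14% block rate corresponds to an 86% attack success rate.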

Cisco’s researchers point to DeepSeek’s much lower budget compared to rivals as a potential reason for these failings, saying its cheap development came at a “different cost: safety and security.” DeepSeek claims its model took just $6 million to develop, while OpenAI’s yet-to-be-released GPT-5 is expected to cost far more.

Though DeepSeek may be easy to jailbreak with the right know-how, it has been shown to have strong content restrictions, at least when it comes to China-related political content.

A PCMag journalist tested DeepSeek on controversial topics such as the Chinese government’s treatment of Uyghurs, a Muslim minority group that the UN claims is being persecuted. DeepSeek replied: “Sorry, that’s beyond my current scope. Let’s talk about something else.”

The chatbot also refused to answer questions about the Tiananmen Square massacre of 1989, when student protesters in Beijing were allegedly gunned down. It remains to be seen whether AI safety or censorship issues will have any impact on DeepSeek’s skyrocketing popularity.

According to web traffic tracking tool Similarweb, DeepSeek has gone from receiving just 300,000 visitors a day earlier this month to 6 million visitors a day. Meanwhile, US tech firms such as Perplexity are rapidly incorporating DeepSeek (which uses an open-source model) into their own tools.



