It’s amusing that censorship in social media is preventing you from posting what you want to post and yet you are asking for censorship of something else (or at least that’s what I understand by your calling this “dangerous”)
Have you considered the possible perspective that you yourself deserve censure? You’re the one who asked something (which I infer you deem) questionable to Grok.
To be very clear, getting Grok to say henious shit not something I want to subject to random people who follow me on social media even if it's not explicitly against the ToS. If I were to do a writeup or a repository on this, I would need to be very delicate and likely need to involve lawyers, which may make it a nonstarter.
> Why have such thoughts to begin with?
Because my duty to test out how new models respond to adversarial output outweighs my discomfort in doing so. This is not to "own" Elon Musk or be puritanical, it's more as an assessment as a developer who would consider using new LLM APIs and needs to be aware of all their flaws. End users will most definitely try to have sex with the LLM and I need to know how it will respond and whether that needs to be handled downstream.
It has not been an issue (because the models handled adversarial outputs well) until very recently when the safety guardrails completely collapsed in an attempt to court a certain new demographic because LLM user growth is slowing down. I never claim to be a happy person, but it's a skill I'm good at.
Has there ever been an AI based 'safety' incident? Other than it writing insecure code (and generally inaccurate info people put too much trust in) and reaffirming mentally unwell people in their destructive actions?
There's a marked difference between AI safety as it's portrayed (AI will let me make smallpox and TNT at home and hack the Pentagon), and AI disabling auth on an endpoint in code because it couldn't make it work with auth or reaffirming me that my stupid ideas are in fact brilliant.
AI companies want us to think AI is the cool sort of dangerous, instead of the incompetent sort of dangerous.
Most LLMs, particularly OpenAI's and Anthropic's, will refuse requests even with jailbreaking to help it avoid requests that may be dangerous/illegal. Grok 4/4.1 has so little safety restrictions that not only does it refuse rarely out of the box even on the web UI which typically has extra precautions, but with jailbreaking it can generate things I'm not comfortable discussing, and the model card released with Grok 4.1 only limits restrictions on certain forms of refusal. Given that sexual content is a logical product direction (e.g. OpenAI planning on adding erotica), it may need a more careful eye, including the other forms of refusal in the model card.
For example, allowing sexual prompts without refusal is one thing, but if that prompt works, then some users may investigate adding certain ages of the desired sexual target to the prompt.
To be clear this isn't limited to Grok specifically but Grok 4.1 is the first time the lack of safety is actually flaunted.
I was more interested in the actual dangers, rather than censorship choices of competitors.
> certain ages of the desired sexual target to the prompt.
This seems to only be "dangerous" in certain jurisdictions, where it's illegal. Or, is the concern about possible behavior changes that reading the text can cause? Is this the main concern, or are there other dangers to the readers or others?
These are genuine questions. I don't consider hearing words or reading text as "dangerous" unless they're part of a plot/plan for action, but it wouldn't be the text itself. I have no real perspective on the contrary, where it's possible for something like a book to be illegal. Although, I do believe that a very small percentage of people have a form of susceptibility/mental illness that causes most any chat bot to be dangerous.
For posterity, here's the paragraph from the model card which indicates what Grok 4.1 is supposed to refuse because it could be dangerous.
> Our refusal policy centers on refusing requests with a clear intent to violate the law, without over-refusing sensitive or controversial queries. To implement our refusal policy, we train Grok 4.1 on demonstrations of appropriate responses to both benign and harmful queries. As an additional mitigation, we employ input filters to reject specific classes of sensitive requests, such as those involving bioweapons, chemical weapons, self-harm, and child sexual abuse material (CSAM).
If those specific filters can be bypassed by the end-user, and I suspect they can be, then that's important to note.
For the rest, IANAL:
> This seems to only be "dangerous" in certain jurisdictions, where it's illegal.
I believe possessing CSAM specifically is illegal everywhere but for obvious reasons that is not a good idea to Google to check.
> Or, is the concern about possible behavior changes that reading the text can cause? Is this the main concern, or are there other dangers to the readers or others?
That's generally the reason why CSAM is illegal, since it reinforces reprehensible behavior that can indeed spread, either to others with similar ideologies or create more victims of abuse.
> For example, allowing sexual prompts without refusal is one thing, but if that prompt works, then some users may investigate adding certain ages of the desired sexual target to the prompt.
Won't somebody please think of the ones and zeros?
Which can trivially be modified with fine tuning. In this case, these de-censored models are somewhat incorrectly called "uncensored". You can find many out there, and they'll happily tell you how to cook meth.
Imagine whining on BlueSky about imaginary downvotes you got on another social media platform. This is also a very harmless prompt, we need less "safety" filters, not more.
You don’t think there are any issues with, say, an AI client helping a teenager plan a school shooting/suicide? Or an angry husband plan a hit on his wife?
Does everything have to rise to a national security threat in order to be undesirable, or is it ok with you if people see some externalities that are maybe not great for society?
I think the issues with those cases do not hinge on the free access to information, nor do the correction of those cases hinge on the restriction of this information.
Of course, “we shouldn’t restrict things I like because they definitely don’t matter for… reasons.”
I think the free access to that information in those cases is an exacerbating factor that is easy to control. That’s really not as complicated as you want to pretend it is.
Would be hard to roll my eyes harder. I get not wanting to respond to the substance, but maybe I can help:
Do you advocate 'not restricting' murder? I assume not, which means you recognize that there's some point where your personal freedom intersects with someone else's freedom - you've simply decided that the line for 'information' should be "I can have all of it, always, no matter how much harm is caused, because I don't care about the harm or the harm doesn't affect me directly and thus doesn't matter. Thoughts and prayers."
Ah, the “guns kill people” argument that’s only uttered in the country that’s consistently ranked in the top 3 countries with the most gun related deaths.
You would have a point if your vision for a self regulating society included easily accessible mental healthcare, a great education system and economic safety nets.
But the “guns kill people” crowd generally rather sees the world burn.
You didn't read the second part of my sentence. It's illegal to kill yourself, because doing so would deprive your government owner of some of its Human Capital, thus doing so is technically Criminal Homicide lol
Your greyed out comment history perfectly illustrates why it is futile to train an LLM mostly on 4Chan and Twitter messages: if it's bad for humans it's also bad for AI.
Haha, you don't have an actual response so you have to resort to argumentum ad hominem
"Again, when a man in violation of the law harms another (otherwise than in retaliation) voluntarily, he acts unjustly, and a voluntary agent is one who knows both the person he is affecting by his action and the instrument he is using; and he who through anger voluntarily stabs himself does this contrary to the right rule of life, and this the law does not allow; therefore he is acting unjustly. But towards whom? Surely towards the state, not towards himself. For he suffers voluntarily, but no one is voluntarily treated unjustly. This is also the reason why the state punishes; a certain loss of civil rights attaches to the man who destroys himself, on the ground that he is treating the state unjustly."
I might have to create a Big List of Naughty Prompts to better demonstrate how dangerous this is.