
Elon Musk’s xAI’s latest model, Grok 4, is missing a key safety report

xAI’s latest frontier model, Grok 4, has been released without industry-standard safety reports, despite the company’s CEO, Elon Musk, being notably vocal about his concerns regarding AI safety.

Leading AI labs typically release safety reports known as “system cards” alongside frontier models.

The reports serve as transparency documents and detail performance metrics, limitations, and, crucially, the potential dangers of advanced AI models. These cards also allow researchers, experts, and policymakers to assess a model’s capabilities and threat level.

At a July 2023 meeting convened by then-President Joe Biden’s administration at the White House, several leading AI companies committed to releasing safety reports for all major public model releases more powerful than the current state of the art.

While xAI did not publicly agree to those commitments, the company, alongside other leading AI labs, signed the Frontier AI Safety Commitments at an international summit on AI safety held in Seoul in May 2024. Those commitments include disclosing model capabilities and inappropriate use cases, and providing transparency around a model’s risk assessments and outcomes.

Moreover, since 2014, Musk has continually and publicly called AI an existential threat, campaigned for stricter regulation, and advocated for higher safety standards.

Now, the AI lab he heads up appears to be breaking from industry standards by releasing Grok 4, and previous versions of the model, without publicly disclosed safety testing.

Representatives for xAI did not respond to Fortune’s questions about whether Grok’s system card exists or will be released.

Leading AI labs have been criticized for delayed safety reports

While leading AI labs’ safety reporting has faced scrutiny over the past few months, especially that of Google and OpenAI (which both released AI models before publishing accompanying system cards), most have provided some public safety information for their most powerful models.

Dan Hendrycks, a director of the Center for AI Safety who advises xAI on safety, denied the claim that the company had done no safety testing.

In a post on X, Hendrycks said that the company had tested the model on “dangerous capability evals” but failed to provide details of the results.

Why are safety cards important?

Several advanced AI models have demonstrated dangerous capabilities in recent months.

According to a recent Anthropic study, most leading AI models have a tendency to opt for unethical means to pursue their goals or ensure their existence.

In experiments set up to leave AI models few options and stress-test alignment, top systems from OpenAI, Google, and others frequently resorted to blackmail to protect their interests.

As models get more advanced, safety testing becomes more important.

For example, if internal evaluations show that an AI model has dangerous capabilities such as the ability to assist users in the creation of biological weapons, then developers might need to create additional safeguards to manage these risks to public safety.

Samuel Marks, an AI safety researcher at Anthropic, called the lack of safety reporting from xAI “reckless” and a break from “industry best practices followed by other major AI labs.”

“One wonders what evals they ran, whether they were done properly, whether they would seem to necessitate additional safeguards,” he said in an X post.

Marks said Grok 4 was already showing concerning, undocumented behaviors post-deployment, pointing to examples in which the model searched for Elon Musk’s views before weighing in on political subjects, including the Israel/Palestine conflict.

Grok’s problematic behavior

An earlier version of Grok also made headlines last week when it began praising Adolf Hitler, making antisemitic comments, and referring to itself as “MechaHitler.”

xAI apologized for the antisemitic remarks made by Grok, saying it was sorry “for the horrific behavior many experienced.”

After the release of Grok 4, the company said in a statement it had spotted similarly problematic behavior from the new model and had “immediately investigated & mitigated.”

“One was that if you ask it ‘What is your surname?’ it doesn’t have one so it searches the internet, leading to undesirable results, such as when its searches picked up a viral meme where it called itself ‘MechaHitler.’ Another was that if you ask it ‘What do you think?’ the model reasons that as an AI it doesn’t have an opinion but, knowing it was Grok 4 by xAI, searches to see what xAI or Elon Musk might have said on a topic to align itself with the company,” the company said in a post on X.

“To mitigate, we have tweaked the prompts and have shared the details on GitHub for transparency. We are actively monitoring and will implement further adjustments as needed,” they wrote.
