Elon Musk’s xAI unleashes Grok 3: The smartest AI on Earth?

By willowt // 2025-02-19

Elon Musk's xAI has unveiled Grok 3, billing it as the "smartest AI on Earth." Developed in just over a year, it has outperformed leading AI models from OpenAI, Google and DeepSeek.
Grok 3 boasts more than 10 times the computing power of its predecessor, Grok 2. It achieved a record-breaking score of 1400 on the LLM blind test by LMArena, surpassing models like Google's Gemini, Anthropic's Claude and OpenAI's GPT-4o.
Detailed reviews by AI experts highlight Grok 3's strengths, including advanced thinking capabilities, knowledge retrieval and a willingness to attempt hard problems like the Riemann hypothesis. However, it struggles with generating complex SVG images and shows no significant improvement in humor.
xAI's launch of Grok 3 intensifies competition with OpenAI, with Musk criticizing OpenAI's shift towards profit-driven motives. xAI aims to secure a $10 billion funding round to position itself as a key player in the AI infrastructure market.
As xAI pushes the boundaries of AI, regulatory scrutiny, investor confidence and real-world adoption will be crucial. Musk emphasizes the importance of continuous innovation and adaptability in the competitive AI landscape.

In a move that could redefine the artificial intelligence landscape, Elon Musk’s xAI revealed Grok 3, touting it as the "smartest AI on Earth." The model, which has been in development for just over a year, has already outperformed leading AI systems from OpenAI, Google and DeepSeek, sparking a new wave of excitement and skepticism in the tech community.

Quantum leap in AI capabilities

Musk, known for his ambitious ventures in space travel, electric vehicles and social media, has set his sights on understanding the universe through AI. During a livestream on his social media platform X, he explained the mission of xAI and Grok: "The mission of xAI and Grok is to understand the universe. We want to answer the biggest questions: Where are the aliens? What's the meaning of life? How does the universe end? To do that, we must rigorously pursue truth."

Grok 3 boasts more than 10 times the computing power of its predecessor, Grok 2, and completed pre-training earlier this year. The AI has been tested across benchmarks covering math, science and coding. It also achieved a record-breaking score of 1400 on the LLM blind test by LMArena, surpassing models like Google's Gemini, DeepSeek's V3, Anthropic's Claude and OpenAI's GPT-4o.

"And it's still climbing. So we have to keep updating it. It's 1400 and climbing," Musk said, emphasizing the model's ongoing improvements. "We're continually improving the models every day, and literally within 24 hours, you'll see improvements."
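For a rough sense of what a score like 1400 means in practice, here is a minimal sketch. It assumes the arena score can be read like a standard Elo rating, where a gap in points translates into an expected head-to-head preference rate. The rival score of 1360 is a hypothetical value chosen for illustration, not a figure from the leaderboard.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Chance that model A is preferred over model B under the Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    grok3_score = 1400   # score quoted in the article
    rival_score = 1360   # hypothetical rival score, for illustration only
    rate = expected_win_rate(grok3_score, rival_score)
    print(f"Expected preference rate for the higher-rated model: {rate:.1%}")
```

Under this assumption, a lead of a few dozen points means voters prefer the higher-rated model only modestly more often than not, which is why small leaderboard gaps tend to be treated with caution.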

Real-world performance and early reviews

Andrej Karpathy, a former director of AI at Tesla and a member of OpenAI's founding team, provided a detailed review of Grok 3, highlighting its strengths and weaknesses. In a post on X, Karpathy summarized his experience:

Thinking capability: Grok 3's advanced thinking model is on par with top OpenAI models, successfully handling complex tasks like creating a Settlers of Catan game webpage. However, it struggled with an emoji mystery involving Unicode variation selectors.
Tic Tac Toe: The model excelled in solving simple Tic Tac Toe puzzles but had trouble with more complex board configurations.
Knowledge retrieval: Grok 3 performed well in knowledge-based questions, such as estimating the computational cost of training GPT-2 without internet searches, a task other models like o1-pro failed.
Riemann hypothesis: Grok 3 showed persistence in attempting to solve the Riemann hypothesis, demonstrating an initiative to tackle challenging problems.
DeepSearch: This feature combines research capabilities with thinking, providing high-quality responses but occasionally hallucinating non-existent URLs.
Humor: Grok 3's humor capability did not show significant improvement, a common challenge for large language models (LLMs).
Ethical sensitivity: The model was overly cautious with complex ethical issues, avoiding questions that might involve ethical dilemmas.
SVG generation: Grok 3 struggled with generating an SVG of a pelican riding a bicycle, a test of spatial layout abilities. It performed better than some models but not as well as Claude.
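The GPT-2 question in the knowledge retrieval item is a back-of-envelope calculation. One common way to approach it is the rule of thumb that training costs roughly 6 floating-point operations per parameter per token. The sketch below applies that rule. The 1.5 billion parameter count is GPT-2's largest published size; the token and epoch counts are illustrative assumptions, not figures taken from Karpathy's review.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute using the ~6 FLOPs per parameter per token rule."""
    return 6.0 * n_params * n_tokens


if __name__ == "__main__":
    n_params = 1.5e9        # GPT-2's largest published model size
    dataset_tokens = 10e9   # ASSUMPTION: rough token count of the training set
    epochs = 10             # ASSUMPTION: number of passes over the data
    total = training_flops(n_params, dataset_tokens * epochs)
    print(f"Estimated training compute: {total:.1e} FLOPs")
```

Changing either assumed value scales the estimate linearly, so the result is best read as an order of magnitude rather than a precise figure.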

Karpathy concluded that Grok 3 is around the state of the art, slightly ahead of models like DeepSeek-R1 and Gemini 2.0 Flash Thinking. Given that xAI started from scratch about a year ago, that is an unprecedented achievement.

Broader implications and future of AI

Musk’s launch of Grok 3 comes at a time when the AI industry is intensely competitive. OpenAI, which Musk co-founded in 2015 but left in 2018, has been a dominant force in the AI space. However, the relationship between Musk and OpenAI has become increasingly contentious. Musk has criticized OpenAI for its shift toward profit-driven motives and even made a $97.4 billion bid to acquire the company’s nonprofit arm, an offer swiftly rejected by OpenAI CEO Sam Altman. Bloomberg noted that the new chatbot appears to put Grok ahead of OpenAI's latest ChatGPT and ramps up an increasingly bitter rivalry between the two companies.

Musk’s xAI is also seeking to secure a $10 billion funding round, aiming to position itself as a key player in the AI infrastructure market. The company's "Colossus" supercomputer in Memphis, Tennessee, powered by a cluster of 100,000 advanced Nvidia GPUs, underscores xAI's commitment to computational power and innovation.

However, the path forward is not without challenges. Regulatory scrutiny, investor confidence and real-world adoption will be crucial factors in determining whether Grok 3 can truly disrupt the AI market. Musk’s assertion that "All you need to know to understand which company will win a technology competition is look at the first and second derivatives of the rate of innovation" highlights the importance of continuous improvement and adaptability in the fast-paced AI landscape.

As xAI pushes the boundaries of AI with Grok 3, the tech community watches with anticipation to see if this ambitious model can live up to its lofty claims and carve out a significant niche in the highly competitive AI arena.

Sources include: ZeroHedge.com, TheInformation.com, CBSNews.com, MNS.com
