SecureITWorld (1)

      OpenAI o3 Model Performance Disparity: Third-party Tracker Shows Lower Scores


      OpenAI's o3 model reportedly falls short of the benchmark score the company initially claimed. According to a recent report from Epoch AI, the model scored only about 10% on FrontierMath problems, against the 25% OpenAI originally claimed.

      The gap has unsettled users and developers, as OpenAI's latest AI model struggles to match its advertised benchmark scores. As a result, questions have arisen about the firm's transparency and model development practices.

      OpenAI launched the o3 model on April 16, 2025, alongside o4-mini, promoting its reasoning and multimodal understanding. According to OpenAI, o3 is its “most powerful reasoning model.” The new models can also generate results in ChatGPT using Python code execution, web browsing, and image processing.

      OpenAI announced the o3 model in December 2024, claiming it could solve the increasingly difficult mathematical problems in FrontierMath. According to that claim, the model could solve more than a fourth of the problems in the set, a score of about 25%. By comparison, the other models available at the time reportedly reached only about 2% on the same evaluation.

      However, Epoch AI, a research organization, found that the OpenAI o3 model scored only around 10% in its benchmark assessment. The firm remarked, “OpenAI has released o3, their highly anticipated reasoning model, along with o4-mini, a smaller and cheaper model that succeeds o3-mini. We evaluated the new models on our suite of math and science benchmarks. Results in thread!”

      Epoch AI - OpenAI Report

      What Concerns Does This Raise?

      The discrepancy between OpenAI's claim and Epoch AI's recent test results has raised transparency concerns among the company's user base. Many see it as an unnecessary PR exercise of the kind they say OpenAI engaged in during the launch of GPT-2. Epoch AI's test results indicate that the o3 model did not match the originally stated score.

      The situation also points to gaps in the regulatory frameworks for AI development and raises questions about accountability for claims made about AI models. Such incidents can significantly erode user trust, and individuals may come to see such models as unreliable. Nevertheless, Epoch AI believes that its testing environment might differ significantly from OpenAI's. So, OpenAI may not have made false assertions about the performance of its latest AI model.

      Attain knowledge of the latest technologies and trends with SecureITWorld!



        Copyright © 2025 SecureITWorld. All rights reserved.
