OpenAI says ChatGPT matches human experts in 44 professions

OpenAI on Tuesday claimed that its ChatGPT artificial intelligence can rival human experts across 44 professions.

WASHINGTON: The announcement was backed by OpenAI’s newly developed GDPval benchmark, its most ambitious attempt yet to measure AI’s real-world economic value. The test spans 1,320 authentic work tasks across nine major industries that drive U.S. GDP, with evaluations conducted by professionals averaging 14 years of experience. Occupations tested included software development, law, nursing, financial advising, and social work.

According to the results, Anthropic’s Claude Opus 4.1 achieved a 47.6% win-or-tie rate against human professionals, outperforming OpenAI’s own GPT-5 high, which scored 38.8%. Still, the findings mark dramatic progress: GPT-4o, released in spring 2024, scored only 13.7%. OpenAI researchers claim that their models can complete such tasks 100 times faster and more cheaply than human experts.

Alongside the benchmark, OpenAI rolled out “Instant Checkout,” a feature that enables U.S. ChatGPT users to purchase items directly within conversations. Initially supporting single-item Etsy purchases, the tool will soon expand to over one million Shopify merchants, including Glossier, Skims, and Spanx. The commerce system runs on OpenAI’s Agentic Commerce Protocol, developed with Stripe and open-sourced for broader adoption.

The developments land amid fierce competition in the AI sector. Anthropic recently announced its Claude Sonnet 4.5 model, capable of running autonomously for up to 30 hours on complex coding tasks, while Microsoft is pushing its Copilot Merchant Program to embed shopping inside its own AI services.

Still, experts caution that AI’s role in the workplace remains limited. The GDPval report itself notes that “most jobs are more than just a collection of tasks that can be written down,” pointing out that real-world work rarely comes with neatly defined prompts.

Meanwhile, a recent MIT study found that fewer than one in ten AI pilot projects delivered measurable revenue gains, while Harvard and Stanford researchers warned of “workslop,” AI-generated content that lacks real substance.

Summary

OpenAI's GDPval benchmark suggests that leading AI models are approaching parity with human experts on tasks drawn from 44 professions, indicating AI's growing role in the economy. Although Anthropic's Claude Opus 4.1 scored higher than GPT-5, OpenAI says its models complete such tasks faster and more cheaply than human experts. The 'Instant Checkout' feature extends ChatGPT into shopping as competition in the AI sector intensifies.

Business Plus Review
www.businessplusreview.com