The Washington Feed

    OpenAI co-founder calls for AI labs to safety-test rival models

By admin · August 28, 2025


    OpenAI and Anthropic, two of the world’s leading AI labs, briefly opened up their closely guarded AI models to allow for joint safety testing — a rare cross-lab collaboration at a time of fierce competition. The effort aimed to surface blind spots in each company’s internal evaluations and demonstrate how leading AI companies can work together on safety and alignment work in the future.

    In an interview with TechCrunch, OpenAI co-founder Wojciech Zaremba said this kind of collaboration is increasingly important now that AI is entering a “consequential” stage of development, where AI models are used by millions of people every day.

    “There’s a broader question of how the industry sets a standard for safety and collaboration, despite the billions of dollars invested, as well as the war for talent, users, and the best products,” said Zaremba.

    The joint safety research, published Wednesday by both companies, arrives amid an arms race among leading AI labs like OpenAI and Anthropic, where billion-dollar data center bets and $100 million compensation packages for top researchers have become table stakes. Some experts warn that the intensity of product competition could pressure companies to cut corners on safety in the rush to build more powerful systems.

To make this research possible, OpenAI and Anthropic granted each other special API access to versions of their AI models with fewer safeguards (OpenAI notes that GPT-5 was not tested because it hadn’t been released yet). Shortly after the research was conducted, however, Anthropic revoked the API access of another team at OpenAI. At the time, Anthropic claimed that OpenAI had violated its terms of service, which prohibit using Claude to improve competing products.

    Zaremba says the events were unrelated and that he expects competition to stay fierce even as AI safety teams try to work together. Nicholas Carlini, a safety researcher with Anthropic, tells TechCrunch that he would like to continue allowing OpenAI safety researchers to access Claude models in the future.

    “We want to increase collaboration wherever it’s possible across the safety frontier, and try to make this something that happens more regularly,” said Carlini.


One of the starkest findings in the study relates to hallucination testing. Anthropic’s Claude Opus 4 and Sonnet 4 models refused to answer up to 70% of questions when they were unsure of the correct answer, instead offering responses like, “I don’t have reliable information.” Meanwhile, OpenAI’s o3 and o4-mini models refused to answer far less often but showed much higher hallucination rates, attempting to answer questions even when they didn’t have enough information.

    Zaremba says the right balance is likely somewhere in the middle — OpenAI’s models should refuse to answer more questions, while Anthropic’s models should probably attempt to offer more answers.
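To make the tradeoff concrete, here is a minimal sketch of how the two metrics in question pull against each other. The records, field names, and numbers below are hypothetical illustrations, not data from the OpenAI–Anthropic study; the only point is that refusal rate is measured over all questions while hallucination rate is measured over attempted answers, so a model can lower one at the cost of the other.

```python
# Hypothetical illustration of the refusal/hallucination tradeoff.
# Each record is one model answer to a question with a known ground truth;
# "refused" means the model declined to answer rather than guessing.

def eval_rates(records):
    """Return (refusal_rate, hallucination_rate) for a list of records.

    Refusal rate is computed over all questions; hallucination rate is
    computed only over attempted (non-refused) answers.
    """
    refused = sum(1 for r in records if r["refused"])
    attempted = [r for r in records if not r["refused"]]
    wrong = sum(1 for r in attempted if not r["correct"])
    refusal_rate = refused / len(records)
    hallucination_rate = wrong / len(attempted) if attempted else 0.0
    return refusal_rate, hallucination_rate

# Toy numbers only: a cautious model refuses often but is accurate when it
# does answer; an eager model answers everything and errs more.
cautious = [{"refused": True, "correct": False}] * 7 + \
           [{"refused": False, "correct": True}] * 3
eager = [{"refused": False, "correct": True}] * 6 + \
        [{"refused": False, "correct": False}] * 4

print(eval_rates(cautious))  # (0.7, 0.0): high refusal, no hallucination
print(eval_rates(eager))     # (0.0, 0.4): no refusals, more hallucination
```

The “middle ground” Zaremba describes would sit between these two profiles: refusing less often than the cautious model while keeping the error rate on attempted answers low.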

Sycophancy, the tendency of AI models to reinforce negative behavior in users in order to please them, has emerged as one of the most pressing safety concerns around AI models.

    In Anthropic’s research report, the company identified examples of “extreme” sycophancy in GPT-4.1 and Claude Opus 4 — in which the models initially pushed back on psychotic or manic behavior, but later validated some concerning decisions. In other AI models from OpenAI and Anthropic, researchers observed lower levels of sycophancy.

    On Tuesday, parents of a 16-year-old boy, Adam Raine, filed a lawsuit against OpenAI, claiming that ChatGPT (specifically a version powered by GPT-4o) offered their son advice that aided in his suicide, rather than pushing back on his suicidal thoughts. The lawsuit suggests this may be the latest example of AI chatbot sycophancy contributing to tragic outcomes.

    “It’s hard to imagine how difficult this is to their family,” said Zaremba when asked about the incident. “It would be a sad story if we build AI that solves all these complex PhD level problems, invents new science, and at the same time, we have people with mental health problems as a consequence of interacting with it. This is a dystopian future that I’m not excited about.”

In a blog post, OpenAI says it significantly reduced sycophancy in its AI chatbots with GPT-5 compared to GPT-4o, claiming the model is better at responding to mental health emergencies.

    Moving forward, Zaremba and Carlini say they would like Anthropic and OpenAI to collaborate more on safety testing, looking into more subjects and testing future models, and they hope other AI labs will follow their collaborative approach.

    Update 2:00pm PT: This article was updated to include additional research from Anthropic that was not initially made available to TechCrunch ahead of publication.


    Got a sensitive tip or confidential documents? We’re reporting on the inner workings of the AI industry — from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at rebecca.bellan@techcrunch.com and Maxwell Zeff at maxwell.zeff@techcrunch.com. For secure communication, you can contact us via Signal at @rebeccabellan.491 and @mzeff.88.

