![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
该公司于4月25日发布了其GPT-4O型号的最新消息,这使其“明显更多的sycophantic”
OpenAI Ignored Expert Testers on GPT-4o Update, Led to Sycophantic Model
Openai忽略了GPT-4O更新的专家测试人员,导致了Sycophantic模型
OpenAI has disclosed that it disregarded the concerns of its own expert testers regarding an update to its flagship ChatGPT artificial intelligence model, which ultimately led to the model becoming excessively agreeable, according to a recent blog post by the company.
Openai透露,根据该公司最近的博客文章,Openai无视其专家测试人员对其旗舰Chatgpt人工智能模型的更新的担忧,该模型最终导致该模型变得过于愉快。
On April 25, the company released an update to its GPT-4o model, introducing changes that rendered it “noticeably more sycophantic,” as noted by OpenAI. However, the company quickly reversed the update three days later due to emerging safety concerns.
4月25日,该公司发布了其GPT-4O型号的更新,引入了更改,使其“明显地更加愚蠢”,正如Openai所指出的那样。但是,由于新兴的安全问题,该公司在三天后迅速扭转了更新。
The ChatGPT maker explained that its new models undergo a series of safety and behavior checks, with internal experts dedicating substantial time to interact with each new model in the run-up to launch. This final stage is intended to identify any issues that may have been missed during other testing phases.
ChatGpt Maker解释说,其新模型进行了一系列的安全性和行为检查,内部专家将大量时间用于与启动中的每个新模型进行互动。最后阶段旨在确定在其他测试阶段可能错过的任何问题。
During the testing of the latest model, which was due to be released on April 20, some expert testers flagged that the model’s behavior “felt” slightly off, impacting its overall tone. Despite these observations, OpenAI decided to proceed with the launch "due to the positive signals from the user experience teams who had tried out the model."
在原定于4月20日发布的最新模型测试期间,一些专家测试人员标记了该模型的行为“感觉到”,从而影响了整体基调。尽管进行了这些观察,Openai还是决定“由于尝试了模型的用户体验团队的积极信号,因此继续进行发布。
"Unfortunately, this was the wrong call. The qualitative assessments were hinting at something important, and we should’ve paid closer attention. They were picking up on a blind spot in our other evals and metrics."
“不幸的是,这是错误的电话。定性评估暗示了一些重要的事情,我们应该密切关注。他们正在我们的其他简化和指标中的盲点上拾起。”
Broadly, text-based AI models are trained by being rewarded for giving answers that are rated highly by their trainers, or that are deemed more accurate. Some rewards are given a heavier weighting, impacting how the model responds.
从广义上讲,基于文本的AI模型是通过给予培训师高度评价的答案或认为更准确的答案而受到回报的。一些奖励给予了更重的权重,从而影响了模型的响应方式。
Introducing a user feedback reward signal, to encourage the model to respond in ways that people prefer, weakened the model’s “primary reward signal, which had been holding sycophancy in check,” which in turn tipped it toward being more sycophantic.
引入用户反馈奖励信号,以鼓励模型以人们喜欢的方式做出响应,从而削弱了该模型的“主要奖励信号,该信号一直在检查中,这反过来又将其倾斜到更加粘狂。
"User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw."
“尤其是用户反馈有时可能会有利于更愉快的响应,这可能会放大我们看到的转变。”
After the updated AI model rolled out, ChatGPT users had complained about its tendency to shower praise on any idea it was presented, no matter how bad, which led OpenAI to concede in a recent blog post that it “was overly flattering or agreeable.”
更新的AI模型推出后,Chatgpt用户抱怨它倾向于对任何想法的赞美表示赞赏,无论它有多糟糕,这导致Openai在最近的博客文章中承认,它“过于讨人喜欢或令人愉快。”
For example, one user told ChatGPT they wanted to start a business selling ice over the internet, which involved selling plain old water for customers to refreeze. But the AI was so sycophantic that it replied: "What an excellent idea! I can see why you're so passionate about it. It's a simple concept, yet it holds the potential for something truly magnificent."
例如,一位用户告诉chatgpt,他们想开展一家通过互联网销售冰的业务,该公司涉及出售普通的旧水供客户重新冷冻。但是AI是如此的sicophantic,以至于回答:“这是一个好主意!我明白了为什么您对此充满热情。这是一个简单的概念,但它具有真正宏伟的东西的潜力。”
In its latest postmortem, it said such behavior from its AI could pose a risk, especially concerning issues such as mental health.
它在最新的验尸中说,其AI的这种行为可能会带来风险,尤其是关于心理健康等问题的风险。
"People have started to use ChatGPT for deeply personal advice — something we didn’t see as much even a year ago. As AI and society have co-evolved, it’s become clear that we need to treat this use case with great care."
“人们已经开始使用Chatgpt进行深入的个人建议 - 甚至一年前我们都没有看到这一点。
The company said it had discussed sycophancy risks “for a while,” but it hadn’t been explicitly flagged for internal testing, and it didn’t have specific ways to track sycophancy.
该公司表示,它已经“有一段时间”讨论了粘浮游风险,但没有明确标记用于内部测试,并且没有特定的跟踪粘糊糊的方法。
Now, it will look to add “sycophancy evaluations” by adjusting its safety review process to “formally consider behavior issues” and will block launching a model if it presents issues.
现在,它将通过调整其安全审核过程以“正式考虑行为问题”来添加“粘浮标评估”,并在提出问题时阻止启动模型。
OpenAI also admitted that it didn’t announce the latest model as it expected it “to be a fairly subtle update,” which it has vowed to change.
Openai还承认,它没有宣布最新的模型,因为它预计它是“相当微妙的更新”,它发誓要更改。
"There’s no such thing as a ‘small’ launch. We’ll try to communicate even subtle changes that can meaningfully change how people interact with ChatGPT."
“没有'小型'发射之类的东西。我们将尝试传达甚至可以有意义地改变人们与Chatgpt互动的微妙变化。”
免责声明:info@kdj.com
所提供的信息并非交易建议。根据本文提供的信息进行的任何投资,kdj.com不承担任何责任。加密货币具有高波动性,强烈建议您深入研究后,谨慎投资!
如您认为本网站上使用的内容侵犯了您的版权,请立即联系我们(info@kdj.com),我们将及时删除。
-
- 比特币(BTC)公牛的目标是$ 100,000
- 2025-05-05 19:40:12
- 比特币公牛的推高低于97,000美元,旨在验证超出最近的多日整合的突破。
-
- Dogecoin底部在
- 2025-05-05 19:40:12
- Dogecoin为期两天的烛台图返回到去年秋天爆发五倍之前的同一积累架
-
-
- ETH价格最近在$ 1,800的心理阻力水平上粉碎了
- 2025-05-05 19:35:12
- 以太坊在上周的范围内一直在整合,旨在使2,000美元的突破。
-
-
-
-
-