![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
![]() |
|
該公司於4月25日發布了其GPT-4O型號的最新消息,這使其“明顯更多的sycophantic”
OpenAI Ignored Expert Testers on GPT-4o Update, Led to Sycophantic Model
Openai忽略了GPT-4O更新的專家測試人員,導致了Sycophantic模型
OpenAI has disclosed that it disregarded the concerns of its own expert testers regarding an update to its flagship ChatGPT artificial intelligence model, which ultimately led to the model becoming excessively agreeable, according to a recent blog post by the company.
Openai透露,根據該公司最近的博客文章,Openai無視其專家測試人員對其旗艦Chatgpt人工智能模型的更新的擔憂,該模型最終導致該模型變得過於愉快。
On April 25, the company released an update to its GPT-4o model, introducing changes that rendered it “noticeably more sycophantic,” as noted by OpenAI. However, the company quickly reversed the update three days later due to emerging safety concerns.
4月25日,該公司發布了其GPT-4O型號的更新,引入了更改,使其“明顯地更加愚蠢”,正如Openai所指出的那樣。但是,由於新興的安全問題,該公司在三天后迅速扭轉了更新。
The ChatGPT maker explained that its new models undergo a series of safety and behavior checks, with internal experts dedicating substantial time to interact with each new model in the run-up to launch. This final stage is intended to identify any issues that may have been missed during other testing phases.
ChatGpt Maker解釋說,其新模型進行了一系列的安全性和行為檢查,內部專家將大量時間用於與啟動中的每個新模型進行互動。最後階段旨在確定在其他測試階段可能錯過的任何問題。
During the testing of the latest model, which was due to be released on April 20, some expert testers flagged that the model’s behavior “felt” slightly off, impacting its overall tone. Despite these observations, OpenAI decided to proceed with the launch "due to the positive signals from the user experience teams who had tried out the model."
在原定於4月20日發布的最新模型測試期間,一些專家測試人員標記了該模型的行為“感覺到”,從而影響了整體基調。儘管進行了這些觀察,Openai還是決定“由於嘗試了模型的用戶體驗團隊的積極信號,因此繼續進行發布。
"Unfortunately, this was the wrong call. The qualitative assessments were hinting at something important, and we should’ve paid closer attention. They were picking up on a blind spot in our other evals and metrics."
“不幸的是,這是錯誤的電話。定性評估暗示了一些重要的事情,我們應該密切關注。他們正在我們的其他簡化和指標中的盲點上拾起。”
Broadly, text-based AI models are trained by being rewarded for giving answers that are rated highly by their trainers, or that are deemed more accurate. Some rewards are given a heavier weighting, impacting how the model responds.
從廣義上講,基於文本的AI模型是通過給予培訓師高度評價的答案或認為更準確的答案而受到回報的。一些獎勵給予了更重的權重,從而影響了模型的響應方式。
Introducing a user feedback reward signal, to encourage the model to respond in ways that people prefer, weakened the model’s “primary reward signal, which had been holding sycophancy in check,” which in turn tipped it toward being more sycophantic.
引入用戶反饋獎勵信號,以鼓勵模型以人們喜歡的方式做出響應,從而削弱了該模型的“主要獎勵信號,該信號一直在檢查中,這反過來又將其傾斜到更加粘狂。
"User feedback in particular can sometimes favor more agreeable responses, likely amplifying the shift we saw."
“尤其是用戶反饋有時可能會有利於更愉快的響應,這可能會放大我們看到的轉變。”
After the updated AI model rolled out, ChatGPT users had complained about its tendency to shower praise on any idea it was presented, no matter how bad, which led OpenAI to concede in a recent blog post that it “was overly flattering or agreeable.”
更新的AI模型推出後,Chatgpt用戶抱怨它傾向於對任何想法的讚美表示讚賞,無論它有多糟糕,這導致Openai在最近的博客文章中承認,它“過於討人喜歡或令人愉快。”
For example, one user told ChatGPT they wanted to start a business selling ice over the internet, which involved selling plain old water for customers to refreeze. But the AI was so sycophantic that it replied: "What an excellent idea! I can see why you're so passionate about it. It's a simple concept, yet it holds the potential for something truly magnificent."
例如,一位用戶告訴chatgpt,他們想開展一家通過互聯網銷售冰的業務,該公司涉及出售普通的舊水供客戶重新冷凍。但是AI是如此的sicophantic,以至於回答:“這是一個好主意!我明白了為什麼您對此充滿熱情。這是一個簡單的概念,但它具有真正宏偉的東西的潛力。”
In its latest postmortem, it said such behavior from its AI could pose a risk, especially concerning issues such as mental health.
它在最新的驗屍中說,其AI的這種行為可能會帶來風險,尤其是關於心理健康等問題的風險。
"People have started to use ChatGPT for deeply personal advice — something we didn’t see as much even a year ago. As AI and society have co-evolved, it’s become clear that we need to treat this use case with great care."
“人們已經開始使用Chatgpt進行深入的個人建議 - 甚至一年前我們都沒有看到這一點。
The company said it had discussed sycophancy risks “for a while,” but it hadn’t been explicitly flagged for internal testing, and it didn’t have specific ways to track sycophancy.
該公司表示,它已經“有一段時間”討論了粘浮游風險,但沒有明確標記用於內部測試,並且沒有特定的跟踪粘糊糊的方法。
Now, it will look to add “sycophancy evaluations” by adjusting its safety review process to “formally consider behavior issues” and will block launching a model if it presents issues.
現在,它將通過調整其安全審查過程以“正式考慮行為問題”,並在提出問題時阻止啟動模型來添加“粘浮標評估”。
OpenAI also admitted that it didn’t announce the latest model as it expected it “to be a fairly subtle update,” which it has vowed to change.
Openai還承認,它沒有宣布最新的模型,因為它預計它是“相當微妙的更新”,它發誓要更改。
"There’s no such thing as a ‘small’ launch. We’ll try to communicate even subtle changes that can meaningfully change how people interact with ChatGPT."
“沒有'小型'發射之類的東西。我們將嘗試傳達甚至可以有意義地改變人們與Chatgpt互動的微妙變化。”
免責聲明:info@kdj.com
所提供的資訊並非交易建議。 kDJ.com對任何基於本文提供的資訊進行的投資不承擔任何責任。加密貨幣波動性較大,建議您充分研究後謹慎投資!
如果您認為本網站使用的內容侵犯了您的版權,請立即聯絡我們(info@kdj.com),我們將及時刪除。
-
-
-
-
-
-
- Vitalik Buterin解釋了L2證明系統的重要性
- 2025-05-05 19:20:12
- 以太坊聯合創始人Vitalik Buterin最近闡明了第2層網絡2期階段的主要安全問題。
-
-
- 想像一下,進入數字景觀,在這裡,土地,藝術甚至衣服是您真正擁有的獨特令牌
- 2025-05-05 19:15:12
- NFT Metavers的核心融合了兩個有力的思想:無殺死的令牌(NFTS)和沈浸式虛擬世界。
-
- 2025年5月購買的最好的模因硬幣可能不是您期望的
- 2025-05-05 19:10:12
- Dogecoin最近的反彈重新點燃了模因硬幣市場,投資者現在正在掃描下一次突破。