Google is moving closer to its goal of a “universal AI assistant” that can understand context, plan and take action.
Today at Google I/O, the tech giant announced enhancements to its Gemini 2.5 Flash — it’s now better across nearly every dimension, including benchmarks for reasoning, code and long context — and 2.5 Pro, including an experimental enhanced reasoning mode, ‘Deep Think,’ that allows Pro to consider multiple hypotheses before responding.
“This is our ultimate goal for the Gemini app: An AI that’s personal, proactive and powerful,” Demis Hassabis, CEO of Google DeepMind, said in a press pre-brief.
‘Deep Think’ scores impressively on top benchmarks
Google announced Gemini 2.5 Pro — what it considers its most intelligent model yet, with a one-million-token context window — in March, and released its “I/O” coding edition earlier this month (with Hassabis calling it “the best coding model we’ve ever built!”).
“We’ve been really impressed by what people have created, from turning sketches into interactive apps to simulating entire cities,” said Hassabis.
He noted that, based on Google’s experience with AlphaGo, AI model responses improve when they’re given more time to think. This led DeepMind scientists to develop Deep Think, which uses Google’s latest cutting-edge research in thinking and reasoning, including parallel techniques.
Deep Think has shown impressive scores on the hardest math and coding benchmarks, including the 2025 USA Mathematical Olympiad (USAMO). It also leads on LiveCodeBench, a difficult benchmark for competition-level coding, and scores 84.0% on MMMU, which tests multimodal understanding and reasoning.
Hassabis added, “We’re taking a bit of extra time to conduct more frontier safety evaluations and get further input from safety experts.” (For now, Deep Think is available only to trusted testers via the API, who will provide feedback before the capability is made widely available.)
Overall, the new 2.5 Pro leads the popular coding leaderboard WebDev Arena with an Elo score of 1420 (intermediate to proficient). Elo ratings measure the relative skill level of players in two-player games like chess. The model also leads across all categories of the LMArena leaderboard, which evaluates AI based on human preference.
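To give a sense of what a rating gap means, here is a minimal sketch of the standard Elo expected-score formula from chess. It assumes the leaderboard follows the classic formula with a 400-point scale, which the article does not confirm, and the opponent ratings below are hypothetical values chosen for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (roughly, win probability) of A against B
    under the classic Elo formula with a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# 1420 is the score reported in the article; the opponent ratings
# are hypothetical and only illustrate how the gap maps to odds.
reported_rating = 1420
for opponent_rating in (1420, 1320, 1020):
    expected = elo_expected_score(reported_rating, opponent_rating)
    print(f"vs {opponent_rating}: expected score {expected:.3f}")
```

Under this formula, equal ratings give an expected score of 0.5, and a 400-point advantage corresponds to roughly 10-to-1 odds.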
Since its launch, “we’ve been really impressed by what [users have] created, from turning sketches into interactive apps to simulating entire cities,” said Hassabis.
Important updates to Gemini 2.5 Pro, Flash
Also today, Google announced an enhanced 2.5 Flash, considered its workhorse model designed for speed, efficiency and low cost. 2.5 Flash has been improved across the board in benchmarks for reasoning, multimodality, code and long context — Hassabis noted that it’s “second only” to 2.5 Pro on the LMArena leaderboard. The model is also more efficient, using 20 to 30% fewer tokens.
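As a rough illustration of the efficiency claim, the sketch below works out what a 20 to 30% reduction would mean for a fixed workload. The baseline of one million tokens is a hypothetical figure, not one taken from the article, and the helper name is invented for this example.

```python
def reduced_token_count(baseline_tokens: int, percent_fewer: int) -> int:
    """Tokens used after an integer-percentage reduction from a baseline."""
    return baseline_tokens * (100 - percent_fewer) // 100


baseline = 1_000_000  # hypothetical workload, not a figure from the article
low_saving = reduced_token_count(baseline, 20)   # 20% fewer tokens
high_saving = reduced_token_count(baseline, 30)  # 30% fewer tokens
print(f"{high_saving:,} to {low_saving:,} tokens instead of {baseline:,}")
```

For that hypothetical baseline, the same work would take between 700,000 and 800,000 tokens.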
Google is making final adjustments to 2.5 Flash based on developer feedback; it is now available for preview in Google AI Studio, Vertex AI and in the Gemini app. It will be generally available for production in early June.
Google is bringing additional capabilities to both Gemini 2.5 Pro and 2.5 Flash, including native audio output to create more natural conversational experiences, text-to-speech to support multiple speakers, thought summaries and thinking budgets.
With native audio output (in preview), users can steer Gemini’s tone, accent and style of speaking (think: directing the model to be melodramatic or maudlin when telling a story). Like Project Mariner, the model is also equipped with tool use, allowing it to search on users’ behalf.
Other experimental early voice features include affective dialogue, which gives the model the ability to detect emotion in user voice and respond accordingly; proactive audio that allows it to tune out background conversations; and thinking in the Live API to support more complex tasks.
New multiple-speaker features in both Pro and Flash support more than 24 languages, and the models can quickly switch from one dialect to another. “Text-to-speech is expressive and can capture subtle nuances, such as whispers,” Koray Kavukcuoglu, CTO of Google DeepMind, and Tulsee Doshi, senior director for product management at Google DeepMind, wrote in a blog post published today.
Further, 2.5 Pro and Flash now include thought summaries in the Gemini API and Vertex AI. These “take the model’s raw thoughts and organize them into a clear format with headers, key details, and information about model actions, like when they use tools,” Kavukcuoglu and Doshi explain. The goal is to provide a more structured, streamlined format for the model’s thinking