Google is testing a new experimental mode for Gemini 2.5 Pro that adds deeper reasoning capabilities and native audio output
May 21, 2025 at 02:27 am
The new mode, called "Deep Think," is designed to help the model evaluate multiple hypotheses before answering a prompt. According to Google, it’s based on new research methods and is currently being tested with a limited group of Gemini API users.
With Deep Think, Google is bringing both deeper reasoning capabilities and native audio output to Gemini 2.5 Pro.
The technology behind Deep Think is based on new research methods at Google AI, and the company claims that Gemini 2.5 Pro with Deep Think outperforms OpenAI's o3 model on several benchmarks.
These include the USAMO 2025 math test, the LiveCodeBench programming benchmark, and MMMU, a test for multimodal reasoning.
Gemini 2.5 Flash, optimized for speed and efficiency, has also been updated with improved performance on reasoning, multimodal tasks, and code generation.
The latest version of Flash can perform the same tasks using 20 to 30 percent fewer tokens.
Both Gemini 2.5 Pro and Flash now support native text-to-speech with multiple speaker profiles. The voice output can capture subtle effects like whispers and emotional tone, and supports more than 24 languages.
Developers can control accent, tone, and speaking style through the Live API.
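As a rough illustration, here is a minimal sketch of single-speaker speech generation with the Python `google-genai` SDK. The preview model name and the `Kore` voice profile are assumptions drawn from Google's preview documentation and may change:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Assumed preview TTS model name; check the current model list before use.
response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",
    contents="Say in a soft whisper: the launch is tomorrow.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)

# The response carries raw PCM audio as inline data on the first part.
audio = response.candidates[0].content.parts[0].inline_data.data
with open("output.pcm", "wb") as f:
    f.write(audio)
```

Note how the speaking style is steered through the prompt itself ("say in a soft whisper: ..."), while the voice profile is selected in the speech config.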
Two new features—"Affective Dialogue" and "Proactive Audio"—aim to make voice interactions feel more natural.
Affective Dialogue allows the model to detect emotion in a user's voice and respond accordingly, whether in a neutral, empathetic, or cheerful tone.
Proactive Audio helps filter out background conversations, so the AI only responds when it's directly addressed. The goal is to reduce accidental interactions and make voice control more reliable.
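Below is a sketch of what enabling both features over the Live API might look like with the Python `google-genai` SDK. The field names (`enable_affective_dialog`, `ProactivityConfig`) and the native-audio model name are assumptions based on the preview API and may differ in the released version:

```python
import asyncio
from google import genai
from google.genai import types

# The preview Live API features are exposed under the v1alpha API version.
client = genai.Client(api_key="YOUR_API_KEY", http_options={"api_version": "v1alpha"})

config = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    enable_affective_dialog=True,  # adapt tone to emotion detected in the user's voice
    proactivity=types.ProactivityConfig(
        proactive_audio=True  # stay silent unless the model is directly addressed
    ),
)

async def main():
    # Assumed preview model name for native audio dialogue.
    async with client.aio.live.connect(
        model="gemini-2.5-flash-preview-native-audio-dialog", config=config
    ) as session:
        await session.send_client_content(
            turns=types.Content(role="user", parts=[types.Part(text="Hello there!")])
        )
        async for message in session.receive():
            if message.data:
                pass  # stream the returned audio chunks to a player here

asyncio.run(main())
```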
Google is also bringing features from Project Mariner into the Gemini API and Vertex AI, allowing the model to control computer applications like a web browser.
For developers, Gemini now includes "thought summaries", a structured view of the model’s internal reasoning and the actions it takes.
To manage performance, developers can configure "thinking budgets" that cap the number of tokens the model spends on reasoning, or disable thinking entirely.
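Both controls live in the generation config. Here is a minimal sketch with the Python `google-genai` SDK, assuming the preview `ThinkingConfig` fields (`thinking_budget`, `include_thoughts`):

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What is the sum of the first 50 prime numbers?",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=1024,   # cap tokens spent on reasoning; 0 disables it on Flash
            include_thoughts=True,  # return thought summaries alongside the answer
        )
    ),
)

# Thought summaries arrive as response parts flagged with `thought=True`.
for part in response.candidates[0].content.parts:
    if part.thought:
        print("Summary:", part.text)
    else:
        print("Answer:", part.text)
```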
The Gemini API also now supports Anthropic's Model Context Protocol (MCP), which could make it easier to integrate with open-source tools.
Google is exploring hosted MCP servers to support agent-based application development.
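In practice, the SDK can hand the model's tool calls off to a running MCP session. A sketch using the Python `google-genai` and `mcp` packages, where the stdio server command is a hypothetical placeholder:

```python
import asyncio
from google import genai
from google.genai import types
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

client = genai.Client(api_key="YOUR_API_KEY")

# Hypothetical local MCP server; substitute any server that exposes tools over stdio.
server_params = StdioServerParameters(command="npx", args=["-y", "my-example-mcp-server"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Passing the live MCP session as a tool lets the SDK route
            # the model's tool calls to the server automatically.
            response = await client.aio.models.generate_content(
                model="gemini-2.5-flash",
                contents="Use the available tools to answer my question.",
                config=types.GenerateContentConfig(tools=[session]),
            )
            print(response.text)

asyncio.run(main())
```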