Apple and NVIDIA Collaborate on Faster Text Generation with Large Language Models

2024/12/19 05:33

In a blog post today, Apple engineers shared new details on a collaboration with NVIDIA to implement faster text generation performance with large language models (LLMs).

Earlier this year, Apple published and open-sourced its Recurrent Drafter (ReDrafter) technique, a new method for generating text with LLMs that’s significantly faster and “achieves state of the art performance.” It combines two techniques: beam search (to explore multiple possibilities) and dynamic tree attention (to efficiently handle choices).
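
To make the beam search half of that description concrete, here is a minimal Python sketch. It is not Apple's ReDrafter code: the toy next-token table `TOY_MODEL` and the function `beam_search` are hypothetical names invented for illustration, and dynamic tree attention is not shown.

```python
import math

# Hypothetical toy "model": maps the last token to next-token probabilities.
# A real LLM would condition on the whole sequence, not just the last token.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.5, "b": 0.5},
}


def beam_search(start, steps, beam_width):
    """Keep the beam_width highest-scoring sequences at every step."""
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            # Extend each surviving sequence with every possible next token.
            for tok, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(prob)))
        # Sort by score and prune to the best beam_width candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


best_tokens, best_score = beam_search("<s>", steps=2, beam_width=2)[0]
print(best_tokens, round(math.exp(best_score), 2))
```

With a beam width of 1 this reduces to greedy decoding; wider beams explore more alternatives at the cost of more model evaluations per step.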

While its research demonstrated strong results, Apple also collaborated with NVIDIA to apply ReDrafter in production. As part of this collaboration, ReDrafter was integrated into NVIDIA TensorRT-LLM, a tool that helps run LLMs faster on NVIDIA GPUs.

Here are the results, in Apple's words:

To enable the integration of ReDrafter, NVIDIA added new operators or exposed existing ones, which considerably improved TensorRT-LLM’s capability to accommodate sophisticated models and decoding methods. ML developers using NVIDIA GPUs can now easily benefit from ReDrafter’s accelerated token generation for their production LLM applications with TensorRT-LLM.

In benchmarking a tens-of-billions parameter production model on NVIDIA GPUs, using the NVIDIA TensorRT-LLM inference acceleration framework with ReDrafter, we have seen 2.7x speed-up in generated tokens per second for greedy decoding. These benchmark results indicate this tech could significantly reduce latency users may experience, while also using fewer GPUs and consuming less power.

“LLMs are increasingly being used to power production applications, and improving inference efficiency can both impact computational costs and reduce latency for users,” Apple’s machine learning researchers conclude. “With ReDrafter’s novel approach to speculative decoding integrated into the NVIDIA TensorRT-LLM framework, developers can now benefit from faster token generation on NVIDIA GPUs for their production LLM applications.”
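
The speculative decoding idea mentioned in that quote can be illustrated with a toy draft-and-verify loop under greedy decoding. This is only a sketch under stated assumptions, not ReDrafter or the TensorRT-LLM API: `target_next`, `draft_next`, `plain_greedy`, and `speculative_greedy` are hypothetical stand-ins, and the "models" are simple integer rules rather than neural networks.

```python
def target_next(tokens):
    """Hypothetical 'large' model: greedy next token (counts upward mod 10)."""
    return (tokens[-1] + 1) % 10


def draft_next(tokens):
    """Hypothetical cheap draft model: agrees with the target except after a 4."""
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


def plain_greedy(prompt, num_new):
    """Baseline: one target-model pass per generated token."""
    tokens = list(prompt)
    for _ in range(num_new):
        tokens.append(target_next(tokens))
    return tokens, num_new


def speculative_greedy(prompt, num_new, draft_len=4):
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < num_new:
        # 1. The draft model cheaply proposes draft_len tokens.
        draft = []
        for _ in range(draft_len):
            draft.append(draft_next(tokens + draft))
        # 2. The target model checks every drafted position. In a real system
        #    this is one batched forward pass, so it is counted once here.
        target_passes += 1
        accepted = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            if tok != expected:
                accepted.append(expected)  # keep the target's correction
                break
            accepted.append(tok)
        tokens.extend(accepted)
    return tokens[: len(prompt) + num_new], target_passes


baseline, baseline_passes = plain_greedy([0], 12)
fast, fast_passes = speculative_greedy([0], 12)
print(fast == baseline, baseline_passes, fast_passes)
```

The output sequence matches plain greedy decoding because every drafted token is checked against the target model; any saving comes from needing fewer target-model passes when the draft is usually right. The pass counts in this toy say nothing about the 2.7x figure reported above.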

You can learn more about this work on Apple’s website and in a blog post on NVIDIA’s website.

Original source: 9to5mac
