
Purpose-Built AI with CXL: Unifying KVCache Reuse and GPU Memory Expansion to Solve One of AI's Most Persistent Infrastructure Challenges

2025/05/19 21:15

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, infrastructure must evolve as well.

PEAK:AIO, a company that provides software-first infrastructure for next-generation AI data solutions, announced the launch of its 1U Token Memory Feature. This feature is designed to unify KVCache reuse and GPU memory expansion using CXL, addressing one of AI's most persistent infrastructure challenges.

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, there is a pressing need for infrastructure to evolve at an equal pace. However, vendors have been retrofitting legacy storage stacks or overextending NVMe to delay the inevitable as transformer models grow in size and context. This approach saturates the GPU and leads to performance degradation.

"Whether you are deploying agents that think across sessions or scaling toward million-token context windows, where memory demands can exceed 500GB per fully loaded model, this appliance makes it possible by treating token history as memory, not storage. It is time for memory to scale like compute has," said Eyal Lemberger, Chief AI Strategist and Co-Founder of PEAK:AIO.

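The 500GB figure quoted above can be sanity-checked with simple KV-cache arithmetic. The model shape below (80 layers, 8 grouped-query KV heads of dimension 128, FP16) is an illustrative assumption, not a PEAK:AIO specification:

```python
# Back-of-the-envelope KV-cache sizing. Per token, a transformer stores one
# key vector and one value vector per layer, so:
#   bytes/token = 2 (K and V) * layers * kv_heads * head_dim * dtype_bytes

def kv_cache_bytes(seq_len, layers, kv_heads, head_dim, dtype_bytes=2):
    """Approximate KV-cache footprint of one sequence (FP16 by default)."""
    return 2 * layers * kv_heads * head_dim * dtype_bytes * seq_len

# Illustrative 70B-class shape: 80 layers, 8 grouped-query KV heads, dim 128.
per_sequence = kv_cache_bytes(1_000_000, layers=80, kv_heads=8, head_dim=128)
print(f"{per_sequence / 1e9:.2f} GB")  # → 327.68 GB
```

A batch of just two such million-token sequences already exceeds the 500GB cited above, and a model without grouped-query attention would multiply the figure by the full head count.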
In contrast to passive NVMe-based storage, PEAK:AIO's architecture is designed with direct alignment to NVIDIA's KVCache reuse and memory reclaim models, providing plug-in support for teams building on TensorRT-LLM or Triton. This support accelerates inference with minimal integration effort. Furthermore, by harnessing true CXL memory-class performance, it delivers what others cannot: token memory that behaves like RAM, not files.

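The contrast between file-backed and memory-class token storage can be sketched as a two-tier cache: hot KV blocks stay in a fast tier while cold blocks are demoted to a larger memory pool rather than serialized to files. Everything here (class name, tier sizes, block IDs) is a hypothetical illustration, not PEAK:AIO's actual architecture:

```python
# Hypothetical sketch of memory-class token tiering: evicted KV blocks land in
# a larger memory pool (standing in for CXL-attached memory) instead of files,
# so reuse is a pointer move, not deserialization.
from collections import OrderedDict

class TieredKVCache:
    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # fast tier, LRU-ordered (stands in for HBM)
        self.cold = {}             # capacity tier (stands in for CXL memory)
        self.hot_capacity = hot_capacity

    def put(self, block_id, kv_block):
        self.hot[block_id] = kv_block
        self.hot.move_to_end(block_id)
        while len(self.hot) > self.hot_capacity:
            evicted_id, evicted = self.hot.popitem(last=False)
            self.cold[evicted_id] = evicted   # demote, still memory-resident

    def get(self, block_id):
        if block_id in self.hot:
            self.hot.move_to_end(block_id)
            return self.hot[block_id]
        kv = self.cold.pop(block_id)          # promote on reuse, no file I/O
        self.put(block_id, kv)
        return kv

cache = TieredKVCache(hot_capacity=2)
for i in range(4):
    cache.put(i, f"kv-{i}")
assert cache.get(0) == "kv-0"  # block 0 was demoted, yet reuse is immediate
```

A file-backed design would pay serialization and read latency on that final `get`; keeping demoted blocks in a memory tier is what makes per-token reuse feasible at microsecond scale.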
"While others are bending file systems to act like memory, we built infrastructure that behaves like memory, because that is what modern AI needs. At scale, it is not about saving files; it is about keeping every token accessible in microseconds. That is a memory problem, and we solved it by embracing the latest silicon layer," Lemberger explained.

The fully software-defined solution utilizes standard, off-the-shelf servers and is expected to enter production by Q3. For early access, technical consultation, or to learn more about how PEAK:AIO can support any level of AI infrastructure needs, please contact sales at sales@peakaio.com or visit https://peakaio.com.

"The big vendors are stacking NVMe to fake memory. We went the other way, leveraging CXL to unlock actual memory semantics at rack scale. This is the token memory fabric modern AI has been waiting for," added Mark Klarzynski, Co-Founder and Chief Strategy Officer at PEAK:AIO.

About PEAK:AIO

PEAK:AIO is a software-first infrastructure company delivering next-generation AI data solutions. Trusted across global healthcare, pharmaceutical, and enterprise AI deployments, PEAK:AIO powers real-time, low-latency inference and training with memory-class performance, GPUDirect RDMA acceleration, and zero-maintenance deployment models. Learn more at https://peakaio.com

Disclaimer: info@kdj.com

The information provided is not trading advice. kdj.com assumes no liability for any investments made based on the information in this article. Cryptocurrencies are highly volatile; please research thoroughly and invest with caution!

If you believe content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will remove it promptly.
