Trending cryptocurrencies

Coin          Price               24h change
bitcoin       $104624.958266      +1.23%
ethereum      $2400.526310        -3.31%
tether        $1.000143           -0.01%
xrp           $2.375789           +0.61%
bnb           $641.909362         -0.09%
solana        $166.682831         -0.28%
usd-coin      $0.999864            0.00%
dogecoin      $0.222645           +2.78%
cardano       $0.737120           -0.79%
tron          $0.263106           -3.66%
sui           $3.791619           +0.32%
chainlink     $15.304523          -0.64%
avalanche     $22.181122          -0.39%
stellar       $0.284427           -0.95%
hyperliquid   $26.205797          -0.73%

Cryptocurrency News

A Dedicated AI Appliance with CXL: Unifying KVCache Reuse and GPU Memory Expansion to Address One of AI's Most Persistent Infrastructure Challenges

2025/05/19 21:15

As AI workloads evolve beyond static prompts into dynamic context streams, model-creation pipelines, and long-running agents, the underlying infrastructure must evolve with them.

PEAK:AIO, a company that provides software-first infrastructure for next-generation AI data solutions, announced the launch of its 1U Token Memory Feature. This feature is designed to unify KVCache reuse and GPU memory expansion using CXL, addressing one of AI's most persistent infrastructure challenges.

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, there is a pressing need for infrastructure to evolve at an equal pace. However, vendors have been retrofitting legacy storage stacks or overextending NVMe to delay the inevitable as transformer models grow in size and context. This approach saturates the GPU and leads to performance degradation.
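To make that memory pressure concrete, the sketch below estimates how a decoder-only transformer's KV cache grows with context length. The layer count, KV-head count, head dimension, and FP16 precision are illustrative assumptions for a large open model, not figures published by PEAK:AIO or NVIDIA.

```python
# Illustrative KV-cache sizing for a decoder-only transformer.
# All model dimensions below are assumptions for a large Llama-style model,
# not numbers published by PEAK:AIO.

BYTES_PER_ELEMENT = 2   # FP16/BF16 keys and values
NUM_LAYERS = 80         # assumed decoder layers
NUM_KV_HEADS = 8        # assumed grouped-query KV heads
HEAD_DIM = 128          # assumed per-head dimension

def kv_cache_bytes(context_tokens: int) -> int:
    """Bytes needed to keep keys and values for every token in context."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_ELEMENT
    return per_token * context_tokens

if __name__ == "__main__":
    for ctx in (8_192, 128_000, 1_000_000):
        gib = kv_cache_bytes(ctx) / 2**30
        print(f"{ctx:>9,} tokens -> {gib:7.1f} GiB of KV cache")
    # At million-token contexts the cache alone runs into the hundreds of
    # gigabytes, well past the HBM of a single data-center GPU, before
    # weights or activations are even counted.
```

Under these assumed dimensions the cache costs roughly 0.3 MB per token, so at long horizons it is context growth, not model weights, that dominates memory consumption.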

"Whether you are deploying agents that think across sessions or scaling toward million-token context windows, where memory demands can exceed 500GB per fully loaded model, this appliance makes it possible by treating token history as memory, not storage. It is time for memory to scale like compute has," said Eyal Lemberger, Chief AI Strategist and Co-Founder of PEAK:AIO.

In contrast to passive NVMe-based storage, PEAK:AIO's architecture is designed with direct alignment to NVIDIA's KVCache reuse and memory reclaim models, providing plug-in support for teams building on TensorRT-LLM or Triton. This support accelerates inference with minimal integration effort. Furthermore, by harnessing true CXL memory-class performance, it delivers what others cannot: token memory that behaves like RAM, not files.
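As a way to picture what "memory, not files" means for KVCache reuse, here is a hypothetical two-tier token-memory store: hot KV blocks stay in GPU HBM while colder blocks are demoted to a CXL-attached capacity tier and promoted back when they are reused. The class and method names are invented for illustration only; this is not the PEAK:AIO appliance API, nor the TensorRT-LLM or Triton interface.

```python
# Hypothetical sketch of a two-tier token-memory store (illustration only;
# not the PEAK:AIO product API). Hot KV blocks live in an LRU-ordered HBM
# tier; evicted blocks are demoted to a CXL capacity tier instead of disk.
from collections import OrderedDict

class TieredKVCache:
    def __init__(self, hbm_capacity_blocks: int):
        self.hbm_capacity = hbm_capacity_blocks
        self.hbm = OrderedDict()  # block_id -> KV bytes, most recent last
        self.cxl = {}             # block_id -> KV bytes in the capacity tier

    def put(self, block_id: str, kv_block: bytes) -> None:
        """Store a KV block in HBM, demoting least-recently-used blocks to CXL."""
        self.hbm[block_id] = kv_block
        self.hbm.move_to_end(block_id)
        while len(self.hbm) > self.hbm_capacity:
            victim, data = self.hbm.popitem(last=False)
            self.cxl[victim] = data  # keep the token state instead of dropping it

    def get(self, block_id: str) -> bytes:
        """Fetch a KV block for reuse, promoting it back to HBM if it was demoted."""
        if block_id in self.hbm:
            self.hbm.move_to_end(block_id)
            return self.hbm[block_id]
        data = self.cxl.pop(block_id)  # KeyError if the block was never cached
        self.put(block_id, data)       # promote on reuse
        return data
```

In a real deployment the demotion and promotion paths would be DMA transfers between HBM and CXL-attached memory rather than Python dictionary moves, but the reuse pattern is the point: previously computed token state is fetched back at memory-access latency instead of being recomputed or reloaded through a file system.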

"While others are bending file systems to act like memory, we built infrastructure that behaves like memory, because that is what modern AI needs. At scale, it is not about saving files; it is about keeping every token accessible in microseconds. That is a memory problem, and we solved it at embracing the latest silicon layer," Lemberger explained.

The fully software-defined solution utilizes standard, off-the-shelf servers and is expected to enter production by Q3. For early access, technical consultation, or to learn more about how PEAK:AIO can support any level of AI infrastructure needs, please contact sales at sales@peakaio.com or visit https://peakaio.com.

"The big vendors are stacking NVMe to fake memory. We went the other way, leveraging CXL to unlock actual memory semantics at rack scale. This is the token memory fabric modern AI has been waiting for," added Mark Klarzynski, Co-Founder and Chief Strategy Officer at PEAK:AIO.

About PEAK:AIO

PEAK:AIO is a software-first infrastructure company delivering next-generation AI data solutions. Trusted across global healthcare, pharmaceutical, and enterprise AI deployments, PEAK:AIO powers real-time, low-latency inference and training with memory-class performance, GPUDirect RDMA acceleration, and zero-maintenance deployment models. Learn more at https://peakaio.com

Disclaimer: info@kdj.com

The information provided is not trading advice. kdj.com assumes no liability for any investments made on the basis of the information in this article. Cryptocurrencies are highly volatile; please research thoroughly and invest with caution.

If you believe content used on this website infringes your copyright, please contact us immediately (info@kdj.com) and we will remove it promptly.
