PEAK:AIO Unifies KVCache Reuse and GPU Memory Expansion Using CXL to Address One of AI's Most Persistent Infrastructure Challenges

2025/05/19 21:15

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, infrastructure must evolve as well.

PEAK:AIO, a company that provides software-first infrastructure for next-generation AI data solutions, announced the launch of its 1U Token Memory Feature. This feature is designed to unify KVCache reuse and GPU memory expansion using CXL, addressing one of AI's most persistent infrastructure challenges.

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, there is a pressing need for infrastructure to evolve at an equal pace. However, vendors have been retrofitting legacy storage stacks or overextending NVMe to delay the inevitable as transformer models grow in size and context. This approach saturates the GPU and leads to performance degradation.
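To give a rough sense of why growing context strains GPU memory, the sketch below applies the standard key-value cache sizing arithmetic for a transformer: two tensors (keys and values) per layer, each holding one vector per KV head per token. The model shape used here (80 layers, 8 KV heads, head dimension 128, 16-bit values) and the one-million-token context are illustrative assumptions, not figures from PEAK:AIO.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes needed to hold the keys and values of one sequence."""
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * context_tokens


# Hypothetical model shape, chosen only for illustration.
size = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, context_tokens=1_000_000)
print(f"{size / 1e9:.1f} GB")  # prints "327.7 GB"
```

Under these assumptions a single long sequence needs several hundred gigabytes for its cache alone, before model weights are counted, which is more than the memory of any single current GPU.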

"Whether you are deploying agents that think across sessions or scaling toward million-token context windows, where memory demands can exceed 500GB per fully loaded model, this appliance makes it possible by treating token history as memory, not storage. It is time for memory to scale like compute has," said Eyal Lemberger, Chief AI Strategist and Co-Founder of PEAK:AIO.

In contrast to passive NVMe-based storage, PEAK:AIO's architecture is designed with direct alignment to NVIDIA's KVCache reuse and memory reclaim models, providing plug-in support for teams building on TensorRT-LLM or Triton. This support accelerates inference with minimal integration effort. Furthermore, by harnessing true CXL memory-class performance, it delivers what others cannot: token memory that behaves like RAM, not files.
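As a conceptual illustration of what KVCache reuse means in general, the toy sketch below keeps previously computed state keyed by token prefix, so a new request only has to process the tokens that follow its longest cached prefix. The class and function names are hypothetical and the stored "state" is a placeholder string; this is not the PEAK:AIO, TensorRT-LLM, or Triton API.

```python
from typing import Dict, List, Tuple


class PrefixKVCache:
    """Toy in-memory store mapping token prefixes to placeholder KV state."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, ...], str] = {}

    def put(self, tokens: List[int], kv_state: str) -> None:
        # Remember the state computed for exactly this token sequence.
        self._store[tuple(tokens)] = kv_state

    def longest_prefix(self, tokens: List[int]) -> int:
        """Return the length of the longest cached prefix of `tokens` (0 if none)."""
        for end in range(len(tokens), 0, -1):
            if tuple(tokens[:end]) in self._store:
                return end
        return 0


def tokens_to_compute(cache: PrefixKVCache, tokens: List[int]) -> List[int]:
    """Tokens that still need processing after reusing the cached prefix."""
    reused = cache.longest_prefix(tokens)
    return tokens[reused:]


cache = PrefixKVCache()
cache.put([1, 2, 3], "kv-state-for-1-2-3")
print(tokens_to_compute(cache, [1, 2, 3, 4, 5]))  # prints [4, 5]
```

The point of the sketch is only that reuse turns repeated context into a lookup; how quickly that lookup is served depends on where the cached state lives, which is the layer the announcement is concerned with.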

"While others are bending file systems to act like memory, we built infrastructure that behaves like memory, because that is what modern AI needs. At scale, it is not about saving files; it is about keeping every token accessible in microseconds. That is a memory problem, and we solved it at embracing the latest silicon layer," Lemberger explained.

The fully software-defined solution utilizes standard, off-the-shelf servers and is expected to enter production by Q3. For early access, technical consultation, or to learn more about how PEAK:AIO can support any level of AI infrastructure needs, please contact sales at sales@peakaio.com or visit https://peakaio.com.

"The big vendors are stacking NVMe to fake memory. We went the other way, leveraging CXL to unlock actual memory semantics at rack scale. This is the token memory fabric modern AI has been waiting for," added Mark Klarzynski, Co-Founder and Chief Strategy Officer at PEAK:AIO.

About PEAK:AIO

PEAK:AIO is a software-first infrastructure company delivering next-generation AI data solutions. Trusted across global healthcare, pharmaceutical, and enterprise AI deployments, PEAK:AIO powers real-time, low-latency inference and training with memory-class performance, GPUDirect RDMA acceleration, and zero-maintenance deployment models. Learn more at https://peakaio.com
