PEAK:AIO Unifies KVCache Reuse and GPU Memory Expansion Using CXL to Address One of AI's Most Persistent Infrastructure Challenges

2025/05/19 21:15

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, infrastructure must evolve as well.

PEAK:AIO, a company that provides software-first infrastructure for next-generation AI data solutions, announced the launch of its 1U Token Memory Feature. This feature is designed to unify KVCache reuse and GPU memory expansion using CXL, addressing one of AI's most persistent infrastructure challenges.

As AI workloads evolve beyond static prompts into dynamic context streams, model creation pipelines, and long-running agents, infrastructure needs to evolve at the same pace. However, as transformer models grow in size and context, vendors have been retrofitting legacy storage stacks or overextending NVMe to delay the inevitable. This approach saturates the GPU and leads to performance degradation.

"Whether you are deploying agents that think across sessions or scaling toward million-token context windows, where memory demands can exceed 500GB per fully loaded model, this appliance makes it possible by treating token history as memory, not storage. It is time for memory to scale like compute has," said Eyal Lemberger, Chief AI Strategist and Co-Founder of PEAK:AIO.
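For a sense of scale, the size of a transformer's KV cache can be estimated with simple arithmetic. The sketch below is an illustration only, not PEAK:AIO's sizing method: the model shape (80 layers, 8 key/value heads, head dimension 128, 16-bit values) and the function name `kv_cache_bytes` are hypothetical and do not come from the announcement.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size in bytes.

    Each token stores one key vector and one value vector (the factor 2)
    for every layer and every key/value head.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return tokens * per_token


# Hypothetical model shape, assumed purely for illustration.
size = kv_cache_bytes(tokens=1_000_000, layers=80, kv_heads=8, head_dim=128)
print(f"{size / 1e9:.1f} GB")  # prints "327.7 GB"
```

Under these assumed parameters, a single million-token context needs roughly 328 GB for the cache alone, so two such contexts held at once would already pass the 500GB mark quoted above.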

In contrast to passive NVMe-based storage, PEAK:AIO's architecture is designed with direct alignment to NVIDIA's KVCache reuse and memory reclaim models, providing plug-in support for teams building on TensorRT-LLM or Triton. This support accelerates inference with minimal integration effort. Furthermore, by harnessing true CXL memory-class performance, it delivers what others cannot: token memory that behaves like RAM, not files.
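The announcement does not describe the mechanism in detail, but the general idea of KV cache reuse backed by a larger memory tier can be sketched as a toy two-tier cache. Everything here is an assumption made for illustration: the class `TieredKVCache`, its method names, and the use of plain Python containers to stand in for GPU memory and CXL-attached memory. It is not NVIDIA's or PEAK:AIO's API.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier cache: a small fast tier that spills into a larger one."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.fast = OrderedDict()  # stands in for GPU memory, kept in LRU order
        self.extended = {}         # stands in for CXL-attached memory

    def put(self, prefix, kv_block):
        """Store the KV block computed for a token prefix."""
        key = tuple(prefix)
        self.fast[key] = kv_block
        self.fast.move_to_end(key)
        # Evict least recently used blocks to the extended tier
        # instead of discarding them.
        while len(self.fast) > self.fast_capacity:
            old_key, old_block = self.fast.popitem(last=False)
            self.extended[old_key] = old_block

    def get(self, prefix):
        """Return a cached KV block, or None if it must be recomputed."""
        key = tuple(prefix)
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.extended:
            block = self.extended.pop(key)
            self.put(prefix, block)  # promote back into the fast tier
            return block
        return None


cache = TieredKVCache(fast_capacity=1)
cache.put([1, 2, 3], "kv-A")
cache.put([4, 5], "kv-B")            # "kv-A" spills to the extended tier
reused = cache.get([1, 2, 3])        # found in the extended tier, no recompute
missing = cache.get([9, 9])          # never stored, so the caller recomputes
```

The point of the sketch is that an evicted block is still reachable through a memory-style lookup, so a returning session can reuse its token history rather than recompute it.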

"While others are bending file systems to act like memory, we built infrastructure that behaves like memory, because that is what modern AI needs. At scale, it is not about saving files; it is about keeping every token accessible in microseconds. That is a memory problem, and we solved it by embracing the latest silicon layer," Lemberger explained.

The fully software-defined solution utilizes standard, off-the-shelf servers and is expected to enter production by Q3. For early access, technical consultation, or to learn more about how PEAK:AIO can support any level of AI infrastructure needs, please contact sales at sales@peakaio.com or visit https://peakaio.com.

"The big vendors are stacking NVMe to fake memory. We went the other way, leveraging CXL to unlock actual memory semantics at rack scale. This is the token memory fabric modern AI has been waiting for," added Mark Klarzynski, Co-Founder and Chief Strategy Officer at PEAK:AIO.

About PEAK:AIO

PEAK:AIO is a software-first infrastructure company delivering next-generation AI data solutions. Trusted across global healthcare, pharmaceutical, and enterprise AI deployments, PEAK:AIO powers real-time, low-latency inference and training with memory-class performance, GPUDirect RDMA acceleration, and zero-maintenance deployment models. Learn more at https://peakaio.com
