Learning the Concepts Behind Words, Not Just Predicting the Next Token

2025/06/12 13:32

Efforts like CoCoMix (Jihoon et al., 2025) by Meta have pursued conceptual learning, that is, learning the concepts behind words rather than just predicting the next token.

In the dynamic sphere of artificial intelligence, a persistent pursuit has been the development of language models capable not only of syntactic analysis but also of semantic comprehension, enabling them to engage in conversations on a conceptual level. This capability, often termed "conceptual learning," stands in contrast to the shallower analysis that focuses on predicting the next token in a sequence.

While efforts like CoCoMix (Jihoon et al., 2025)¹ by Meta have brought us closer to this goal, introducing models that are remarkably steerable and interpretable, another core question arises. Even a conceptually brilliant model could struggle with nuanced or factual recall challenges after training, during actual deployment.

Imagine asking a seemingly simple question like, “Earlier in our 2-million-token conversation, where did we discuss Pinocchio’s famously growing nose?” No matter how conceptually capable the LLM is, it cannot answer this simple question if the answer lies outside its context window.

But this is precisely the kind of adaptability that humans effortlessly display. We can engage in a conversation about 19th-century Impressionist art, quickly recall a story from earlier in the day, and then seamlessly transition to discussing the best route to avoid traffic. A human guide could quickly glance at a map and suggest a clever alley shortcut, something a GPS system would struggle with despite knowing the shortest path.

This ability to integrate new information and experiences into an ongoing narrative, adjusting plans and adapting to unexpected events, is crucial for meaningful communication and interaction with the world around us.

Now, a team of researchers at Google, in collaboration with researchers from Stanford University and the University of California, Irvine, has taken a significant step toward equipping large language models with this kind of adaptable “memory” at the moment it matters most: during inference. Their findings are published in the journal Patterns.

Their research builds on the groundbreaking work that introduced the Transformer architecture (Vaswani et al., 2017)², which quickly became ubiquitous in the modern AI landscape.

Starting from the breakout success of Transformers and the surprising results of applying attention across domains, including vision tasks (Dosovitskiy et al., 2020)³, time series forecasting (Zerveas et al., 2021)⁴, and natural language processing (Rogers et al., 2021)⁵, the researchers went deeper.

As the reliance on large models deepened and compute budgets expanded, even this “do it all” architecture began to show its limits, and so began the push to stretch its capabilities even further.

The bottleneck was attention’s ‘everyone-talks-to-everyone’ approach: brilliantly effective, but with a cost that grows quadratically with sequence length. Imagine a room of a million people, where each person must remember every conversation with everyone else. This restricted Transformers to a narrow “working memory,” and they struggled with the “long-term recall” needed for understanding vast documents, as early information simply faded away.
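
The quadratic cost can be made concrete with a minimal sketch. It assumes the simplest possible form of self-attention, in which the raw token vectors act as both queries and keys (no learned projections, no value mixing); the function name and the toy input are illustrative only. The point is that n tokens always produce an n x n table of scores.

```python
import math


def self_attention_weights(tokens):
    """Return the n x n attention-weight matrix for n token vectors.

    Assumption for illustration: queries and keys are the raw token
    vectors themselves, with no learned projection matrices.
    """
    dim = len(tokens[0])
    weights = []
    for query in tokens:
        # Every token is scored against every token: n scores per query.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        # Numerically stable softmax over the n scores.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_weights = self_attention_weights(toy_tokens)
print(len(toy_weights), "x", len(toy_weights[0]), "attention weights")
```

Doubling the number of tokens quadruples the number of scores, which is why very long contexts become expensive under this scheme.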

Moreover, vanilla transformers faced another fundamental hurdle—a lack of adaptability after training. While they excelled at applying their vast pre-trained knowledge to predict the next token, a process of sophisticated reasoning and prediction, this was not the same as true learning.

Think of Google Maps: it quickly finds the shortest path, but then wants you to drive through barricades because of ongoing construction, while a human guide would immediately suggest a simple alley shortcut. In the same way, transformers struggled to integrate new information into their existing knowledge.

This inability to “learn on the fly” from the data they are currently processing, adjusting their strategies and memories, represents a critical limitation for tasks requiring continuous adaptation or memory of novel experiences beyond the training set.

Instead of focusing narrowly on one limitation, the researchers took a broader perspective: how do intelligent systems, like the human brain, manage memory and adapt to new situations? It’s not about having one massive, ever-accessible memory; it’s a more flexible setup, where different components coordinate to handle different kinds of information and experiences.

The Titans architecture (Behrouz et al., 2025)⁶, named for the mythological beings known for their wisdom and adaptability, embraces this idea: it is built not around a single, monolithic attention block but around a cooperative team of specialized memory systems.

Each memory module in Titans plays a crucial role in understanding and responding to the task at hand. The persistent memory module (PM) stores a set of parameters that are prepended to the input sequence. These parameters are learned during training, do not depend on the input, and act like a “Holy Grail” for the model to adhere to.
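
The prepending step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `PersistentMemory`, the slot count, and the random initialisation are all assumptions, and the values that would normally be learned during training are stood in for by random placeholders.

```python
import random


class PersistentMemory:
    """Hypothetical container for input-independent memory vectors."""

    def __init__(self, num_slots, dim, seed=0):
        rng = random.Random(seed)
        # Placeholder values: in a trained model these would be learned
        # parameters, fixed at inference time.
        self.slots = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)]
            for _ in range(num_slots)
        ]

    def prepend(self, sequence):
        """Return a new sequence with the memory slots placed in front."""
        dim = len(self.slots[0])
        if any(len(token) != dim for token in sequence):
            raise ValueError("token dimension must match memory dimension")
        return [list(s) for s in self.slots] + [list(t) for t in sequence]


memory = PersistentMemory(num_slots=2, dim=3)
extended = memory.prepend([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(len(extended))  # 2 memory slots + 2 input tokens
```

Because the same slots are prepended whatever the input is, every downstream layer sees them as a constant prefix alongside the actual tokens.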

The researchers chose to implement the long-term memory module (LMM) using a simple multi-layer perceptron (MLP) network. As its input it takes y_t, the output at time step t of the standard self-attention module, which serves as the short-term memory (STM).
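
As a rough sketch of what an MLP-based memory looks like, the code below runs a single forward pass on one vector y_t. Everything beyond "an MLP that takes y_t as input" is an assumption made for illustration: the class name `MLPMemory`, the single hidden layer, the tanh activation, the layer sizes, and the random weights. It shows only the forward pass and does not attempt to reproduce how the memory is trained or updated.

```python
import math
import random


class MLPMemory:
    """Hypothetical two-layer MLP standing in for the long-term memory."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # Randomly initialised weights; biases start at zero.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)] for _ in range(dim)]
        self.b2 = [0.0] * dim

    def forward(self, y_t):
        """Map one attention output vector y_t to a memory read-out."""
        hidden = [
            math.tanh(sum(w * v for w, v in zip(row, y_t)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.w2, self.b2)
        ]


lmm = MLPMemory(dim=4, hidden=8)
y_t = [0.1, -0.2, 0.3, 0.0]  # stand-in for the attention output at step t
print(len(lmm.forward(y_t)))
```

The output has the same dimension as the input here, which is a convenience of this sketch rather than something the text specifies.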
