Brown University researchers develop AI model that can generate movement in robots and animated figures

2025/05/09 03:08

The model, called MotionGlot, lets users simply type an action, such as “walk forward a few steps and take a right,” and it generates an accurate representation of that motion to command a robot or animated avatar.

Researchers at Brown University have developed an artificial intelligence model that can generate movement in robots and animated figures in much the same way that AI models like ChatGPT generate text.

The model, called MotionGlot, enables users to simply type an action — “walk forward a few steps and take a right”— and the model can generate accurate representations of that motion to command a robot or animated avatar.

The model’s key advance, according to the researchers, is its ability to “translate” motion across robot and figure types, from humanoids to quadrupeds and beyond. That enables the generation of motion for a wide range of robotic embodiments and in all kinds of spatial configurations and contexts.

“We’re treating motion as simply another language,” said Sudarshan Harithas, a Ph.D. student in computer science at Brown, who led the work. “And just as we can translate languages — from English to Chinese, for example — we can now translate language-based commands to corresponding actions across multiple embodiments. That enables a broad set of new applications.”

The research, which was supported by the Office of Naval Research, will be presented later this month at the 2025 International Conference on Robotics and Automation in Atlanta. The work was co-authored by Harithas and his advisor, Srinath Sridhar, an assistant professor of computer science at Brown.

Large language models like ChatGPT generate text through a process called “next token prediction,” which breaks language down into a series of tokens, or small chunks, like individual words or characters. Given a single token or a string of tokens, the language model makes a prediction about what the next token might be. These models have been incredibly successful in generating text, and researchers have begun using similar approaches for motion. The idea is to break down the components of motion, such as the discrete positions of the legs during walking, into tokens. Once the motion is tokenized, fluid movements can be generated through next token prediction.
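As a rough illustration of that idea, and not MotionGlot's actual implementation, the Python sketch below quantizes a single made-up joint angle into discrete tokens and then predicts each next token from the tokens before it. Everything in it is an assumption chosen for illustration: the names `tokenize`, `detokenize`, `train` and `generate`, the vocabulary size `N_BINS`, the context length `ORDER`, and the sine-wave "walking" signal. A simple count table stands in for the neural network a real model would use.

```python
import math
from collections import Counter, defaultdict

N_BINS = 8          # hypothetical vocabulary size (number of motion tokens)
LO, HI = -1.0, 1.0  # assumed range of the toy joint angle
ORDER = 4           # how many previous tokens the predictor looks at

def tokenize(values):
    """Map each continuous value to the index of its bin (a discrete token)."""
    width = (HI - LO) / N_BINS
    return [min(N_BINS - 1, max(0, int((v - LO) / width))) for v in values]

def detokenize(tokens):
    """Map each token back to the center of its bin."""
    width = (HI - LO) / N_BINS
    return [LO + (t + 0.5) * width for t in tokens]

def train(tokens, order=ORDER):
    """Count which token follows each context of `order` tokens."""
    counts = defaultdict(Counter)
    for i in range(order, len(tokens)):
        counts[tuple(tokens[i - order:i])][tokens[i]] += 1
    return counts

def generate(counts, seed, n_steps):
    """Greedy next-token prediction: append the most frequent follower."""
    out = list(seed)
    order = len(seed)
    for _ in range(n_steps):
        context = tuple(out[-order:])
        if context not in counts:
            break  # unseen context: the toy model has nothing to predict
        out.append(counts[context].most_common(1)[0][0])
    return out

# Toy "walking" signal: one joint angle swinging back and forth (10 cycles).
signal = [math.sin(2 * math.pi * (t + 0.5) / 20) for t in range(200)]
tokens = tokenize(signal)

model = train(tokens)
motion_tokens = generate(model, seed=tokens[:ORDER], n_steps=40)
motion = detokenize(motion_tokens)  # back to (coarse) joint angles
```

In this toy setup the context length matters: with too short a context, the table cannot tell how long the joint has been held at the top or bottom of its swing, and greedy generation can stall there. A learned model with a long context does not face the same limitation in this form, which is part of why next-token approaches can yield smooth sequences.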

One challenge with this approach is that motions for one body type can look very different for another. For example, when a person is walking a dog down the street, the person and the dog are both doing something called “walking,” but their actual motions are very different. One is upright on two legs; the other is on all fours. According to Harithas, MotionGlot can translate the meaning of walking from one embodiment to another. So a user commanding a figure to “walk forward in a straight line” will get the correct motion output whether they happen to be commanding a humanoid figure or a robot dog.

To train their model, the researchers used two datasets, each containing hours of annotated motion data. QUAD-LOCO features dog-like quadruped robots performing a variety of actions along with rich text describing those movements. A similar dataset called QUES-CAP contains real human movement, along with detailed captions and annotations appropriate to each movement.

Using that training data, the model reliably generates appropriate actions from text prompts, even actions it has never specifically seen before. In testing, the model was able to recreate specific instructions, like “a robot walks backwards, turns left and walks forward,” as well as more abstract prompts like “a robot walks happily.” It can even use motion to answer questions. When asked “Can you show me movement in cardio activity?” the model generates a person jogging.

“These models work best when they’re trained on lots and lots of data,” Sridhar said. “If we could collect large-scale data, the model can be easily scaled up.”

The model’s current functionality and the adaptability across embodiments make for promising applications in human-robot collaboration, gaming and virtual reality, and digital animation and video production, the researchers say. They plan to make the model and its source code publicly available so other researchers can use it and expand on it.
