OORT's Diverse Tools Kaggle data set climbs to the first page in multiple categories

May 14, 2025 at 08:18 pm

An artificial intelligence training image data set developed by decentralized AI solution provider OORT has seen considerable success on Google's platform Kaggle.

OORT’s Diverse Tools data set was listed on Kaggle in early April and has since climbed to the first page in multiple categories. Kaggle is a Google-owned online platform for data science and machine learning competitions, learning and collaboration.

Ramkumar Subramaniam, core contributor at crypto AI project OpenLedger, recognized that a front-page Kaggle ranking is a strong social signal, indicating that the data set is engaging the right communities of data scientists, machine learning engineers and practitioners.

Max Li, founder and CEO of OORT, said that the firm observed promising engagement metrics that validate the early demand and relevance of its training data gathered through a decentralized model.

"We're grateful for the positive response from the Kaggle community," said Li. "This achievement reflects the hard work and dedication of our team in developing high-quality, diverse, and accessible AI training data."

OORT plans to release multiple data sets in the coming months. Among those are an in-car voice commands data set, one for smart home voice commands and another for deepfake videos meant to improve AI-powered media verification.

The data set in question was independently verified to have reached the first page in Kaggle’s General AI, Retail & Shopping, Manufacturing, and Engineering categories earlier this month. By the time of publication, it had lost those positions following a possibly unrelated data set update on May 6 and another on May 14.

While recognizing the achievement, Subramaniam cautioned that it is not a definitive indicator of real-world adoption or enterprise-grade quality.

What sets OORT’s data set apart is not just the ranking, but the provenance and incentive layer behind the data set.

"In a world where image scarcity and poisoning techniques are increasing, verifiable and community-sourced/incentivized data sets become more valuable than ever," said Subramaniam. "Such projects can become not just alternatives, but pillars of AI alignment and provenance in the data economy."

Data published by AI research firm Epoch AI estimates that human-generated text AI training data will be exhausted in 2028. The pressure is high enough that investors are now mediating deals granting rights to copyrighted materials to AI companies.

Reports concerning increasingly scarce AI training data and how it may limit growth in the space have been circulating for years. While synthetic (AI-generated) data is increasingly used with at least some degree of success, human data is still largely viewed as the better alternative: higher-quality data that leads to better AI models.

When it comes to images for AI training specifically, things are becoming increasingly complicated with artists purposely sabotaging training efforts to protect their images from being used for AI training without permission.

One such project, Nightshade, allows users to "poison" their images and severely degrade model performance.

"We're entering an era where high-quality image data will become increasingly scarce, and in this situation, verifiable and community-sourced/incentivized data sets like OORT's are more valuable than ever," said Subramaniam.

In this case, the OORT data set is a collection of diverse images from various domains, including food, fashion, architecture, technology, and art, which are released under a CC BY 4.0 license and collected via a tokenized crowdsourcing campaign.

The project aims to provide a balanced and comprehensive data set that can be used to train image recognition models for various tasks, such as object detection, image segmentation, and image generation.

The initiative was funded through a token offering in early 2021, and saw participation from members of the blockchain community, who provided image contributions in exchange for OORT tokens.

The project's Devotees collected and formatted the images, which were released on Kaggle in early April. The data set reached the first page in multiple categories within a month of release.

The OORT data set has also been recognized by leading AI and blockchain publications and websites, further highlighting its significance and innovation.

This content is not financial advice and does not necessarily represent the views of CCNR and should not be viewed as an endorsement.

Original source: tradingview
