Bittensor Subnet 13: Data Universe

Data Universe decentralizes and scales data storage for Bittensor, supporting large-scale data collection and distribution.

Subnet	Description	Category	Company
SN13 : Data Universe	Data scraping & storage	Data Pipeline Storage	Macrocosmos

Data is a crucial pillar of AI, and Data Universe serves as that pillar for Bittensor.

Data Universe is a subnet designed for collecting and storing vast amounts of data from a wide range of sources, intended for use by other subnets. It was built with a strong emphasis on decentralization and scalability. There is no centralized entity controlling the data; it is distributed across all miners on the network and can be queried via the validators. At launch, Data Universe supports up to 50 petabytes of data across 200 miners, while only requiring approximately 10GB of storage on each validator.

Macrocosmos aims to raise the standard of subnet design, with a focus on crafting incentives and mechanisms for the Bittensor network.

In Data Universe, miners scrape data from defined sources, known as DataSources. Each piece of data (e.g., a webpage, BTC prices), termed a DataEntity, is stored in the miner’s database. Every DataEntity belongs to a specific DataEntityBucket, which is uniquely identified by its DataEntityBucketId: a tuple consisting of the data’s source (DataSource), creation time (TimeBucket), and a classification (DataLabel, e.g., a stock ticker symbol). The complete set of DataEntityBuckets on a miner is called its MinerIndex.
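To make the relationship between these terms concrete, here is a minimal Python sketch of the data model. The type names follow the text, but everything else is an assumption made for illustration: the DataSource members, the field layout, the choice of hour-wide time buckets, and the `build_miner_index` helper are hypothetical and are not taken from the subnet's code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, Iterable, Optional


class DataSource(Enum):
    # Hypothetical members; the text does not list the actual sources.
    WEB = 1
    MARKET = 2


@dataclass(frozen=True)
class DataEntityBucketId:
    """Tuple of (DataSource, TimeBucket, DataLabel) identifying one bucket."""
    source: DataSource
    time_bucket: int        # assumption: whole hours since the Unix epoch
    label: Optional[str]    # e.g. a stock ticker symbol


@dataclass
class DataEntity:
    """One scraped item, such as a webpage or a price reading."""
    uri: str
    created_at: datetime
    source: DataSource
    label: Optional[str]
    content: bytes

    def bucket_id(self) -> DataEntityBucketId:
        # Derive the bucket from the entity's source, creation time and label.
        hours = int(self.created_at.timestamp() // 3600)
        return DataEntityBucketId(self.source, hours, self.label)


def build_miner_index(entities: Iterable[DataEntity]) -> Dict[DataEntityBucketId, int]:
    """Summarize a miner's data as bucket id -> total content size in bytes."""
    index: Dict[DataEntityBucketId, int] = {}
    for entity in entities:
        key = entity.bucket_id()
        index[key] = index.get(key, 0) + len(entity.content)
    return index
```

In this sketch, two BTC price readings scraped within the same hour fall into the same bucket, so the index grows in size for that bucket rather than in number of entries.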

Validators periodically query each miner to retrieve their latest MinerIndexes and store them in a local database. This process provides validators with a comprehensive overview of all data stored on the network and identifies which miners to query for specific types of data. Validators also regularly verify the accuracy of the data stored by miners and reward them based on the value of the data they have accumulated.
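The validator-side bookkeeping described above can be sketched as follows. This is a toy in-memory stand-in for the local database, not the subnet's implementation: the `ValidatorIndex` class, its method names, and the representation of bucket ids as plain tuples are all assumptions.

```python
from collections import defaultdict


class ValidatorIndex:
    """Toy in-memory stand-in for a validator's local database (hypothetical)."""

    def __init__(self):
        # miner id -> {bucket id: size in bytes}
        self._by_miner = {}

    def update(self, miner_id, miner_index):
        """Replace the stored index for one miner with its latest report."""
        self._by_miner[miner_id] = dict(miner_index)

    def miners_with(self, bucket_id):
        """Return the miners that report holding the given bucket."""
        return sorted(
            miner for miner, index in self._by_miner.items() if bucket_id in index
        )

    def duplication_counts(self):
        """Count how many miners report each bucket."""
        counts = defaultdict(int)
        for index in self._by_miner.values():
            for bucket_id in index:
                counts[bucket_id] += 1
        return dict(counts)
```

A validator built this way would call `update` each time it polls a miner, then use `miners_with` to decide which miners to query for a given kind of data.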

Incentive Mechanism

Each miner reports its MinerIndex to the validator, detailing the quantity and type of data it holds. Miners are scored based on two main dimensions:

Data Quantity and Value: The volume and the value of the data a miner has.
Miner Credibility: The reliability of the miner.

Data Value

Not all data holds the same value. The factors determining data value include:

Data Freshness: Fresh data is more valuable than old data. Data older than a certain threshold is not scored. As of December 11th, 2023, data older than 30 days is not scored, though this threshold may change in the future.
Data Desirability: The Data Universe defines a DataDesirabilityLookup to determine which types of data are more desirable. Desirable data is scored more highly. Unspecified labels get a default_scale_factor of 0.5, meaning they score half the value compared to specified labels. The DataDesirabilityLookup will evolve over time, with each change announced in advance to allow miners time to adjust.
Duplication Factor: Data stored by many miners is less valuable than data stored by only a few. The value of data decreases in proportion to the number of miners storing it.
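One way the three factors might combine is sketched below. The 30-day cutoff and the 0.5 default scale factor come from the text; the rest is assumed for illustration, including the lookup contents, the use of bytes as the unit of quantity, and the choice to divide by the number of miners holding the bucket. The sketch applies only the hard age cutoff and omits any graded preference for fresher data.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=30)    # threshold stated as of December 11th, 2023
DEFAULT_SCALE_FACTOR = 0.5      # stated default for unspecified labels

# Hypothetical lookup contents: (source, label) -> scale factor.
DESIRABILITY = {("market", "BTC"): 1.0}


def bucket_value(size_bytes, source, label, bucket_time, now, miners_holding):
    """Illustrative value of one bucket under the three stated factors."""
    # Freshness: data older than the threshold is not scored at all.
    if now - bucket_time > MAX_AGE:
        return 0.0
    # Desirability: unspecified labels fall back to the default scale factor.
    scale = DESIRABILITY.get((source, label), DEFAULT_SCALE_FACTOR)
    # Duplication: assumed here to divide value by the number of holders.
    return size_bytes * scale / max(miners_holding, 1)
```

Under these assumptions, 1000 bytes of day-old BTC data held by a single miner is worth 1000.0, the same data held by four miners is worth 250.0 to each, and the same volume under an unspecified label is worth 500.0.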

Miner Credibility

Validators periodically check a sample of data from each miner’s MinerIndex to verify its accuracy. This process helps track a miner’s credibility, which in turn scales the miner’s score. Misrepresenting data types and quantities always results in a worse score for the miner.
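The text does not say how credibility is updated, so the following is only a sketch of one plausible scheme: an exponential moving average over spot-check results, multiplied into the miner's raw data value. The function names and the `alpha` smoothing parameter are assumptions.

```python
def updated_credibility(current, sample_valid, alpha=0.1):
    """Move credibility toward 1.0 on a passed check and toward 0.0 on a failed one.

    Assumed scheme: exponential moving average with smoothing factor alpha.
    """
    target = 1.0 if sample_valid else 0.0
    return (1 - alpha) * current + alpha * target


def final_score(raw_value, credibility):
    """Scale a miner's raw data value by its credibility (between 0 and 1)."""
    return raw_value * credibility
```

With this scheme, a failed check always lowers credibility and therefore the final score, which matches the stated property that misrepresenting data never pays off.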

Data Universe Dashboard

Data Universe rewards diversity of data; storing multiple copies of the same data is not beneficial. To help miners understand the current data landscape, the Data Universe team hosts a dashboard showing the amount of each type of data (by DataEntityBucketId) on the subnet. Miners are encouraged to use this dashboard to optimise their Miner Configuration and maximise rewards.

Will Squires – CEO and Co-Founder

Will has dedicated his career to navigating complexity, spanning from designing and constructing significant infrastructure to spearheading the establishment of an AI accelerator. With a background in engineering, he made notable contributions to transport projects such as Crossrail and HS2. Will’s expertise led to an invitation to serve on the Mayor of London’s infrastructure advisory panel and to lecture at UCL’s Centre for Advanced Spatial Analysis (CASA). He was appointed by AtkinsRéalis to develop an AI accelerator, which expanded to encompass over 60 staff members globally. At XYZ Reality, a company specializing in augmented reality headsets, Will played a pivotal role in product and software development, focusing on holographic technology. Since 2023, Will has provided advisory services for the Opentensor Foundation, contributing to the launch of Revolution.

Steffen Cruz – CTO and Co-Founder

Steffen earned his PhD in subatomic physics from the University of British Columbia, Canada, focusing on developing software to enhance the detection of extremely rare events (10^-7). His research contributed to the identification of novel exotic states of nuclear matter and has been published in scientific journals. As the founding engineer of SolidState AI, he developed techniques for physics-informed machine learning (PIML). Steffen was subsequently appointed Chief Technology Officer of the Opentensor Foundation, where he was a core developer of Subnet 1, the foundation’s flagship subnet. In this capacity, he improved the adoption and accessibility of Bittensor by authoring technical documentation and tutorials and by collaborating on the development of the subnet template.

Pedro Ferreira – Machine Learning Engineer

Kalei Brady – Data Scientist

Sergio Champoux – Data Scientist

Brian McCrindle – Machine Learning Researcher

Elena Nesterova – Lead Technical Program Manager

Richard Hudson – Communications Lead

Alex Williams – Recruitment Lead