Subnet 57

Gaia

Gaia bridges expert models and foundational geospatial intelligence, leveraging Bittensor’s decentralised, scalable framework for open collaboration.

SN57 : Gaia

Subnet: SN57 : Gaia
Description: Geospatial forecasting
Category: Predictive systems
Company: Nickel5

Gaia is a platform designed to bridge the gap between small expert models and a comprehensive foundational geospatial model. It uses open-source collaboration and distributed computing to build a scalable, modular system. By encouraging community-driven contributions and using a decentralised architecture, the team behind Gaia aims to evolve the platform continuously, expanding its capabilities and utility over time.

The vision for Gaia is ambitious yet practical, recognising the need for distributed contribution at scale and working to establish a foundational framework that drives progress. Through this, they are paving the way for a new era of geospatial intelligence, empowering industries, governments, and individuals with the tools needed to unlock the full potential of geospatial data.

In the fast-moving field of machine learning, reinventing the wheel is neither practical nor efficient. As members of the Bittensor community, they recognise Bittensor as a revolutionary platform that incentivises open-source, distributed computation and research. With its proven track record and unique infrastructure, Bittensor provides the ideal foundation for Gaia.

Bittensor’s decentralised network delivers the essential incentive structure and computational resources to support Gaia’s ambitious goals. By tapping into this existing framework, Gaia can avoid redundant efforts and focus on advancing geospatial data analysis. Among all available options, Bittensor stands out as the clear choice—no other platform offers such a robust foundation for fostering open collaboration and scalability.

They’ve conceptualised Gaia as a mixture-of-experts (MoE) platform, utilising a cutting-edge machine learning approach that prioritises modularity and specialisation. Traditionally, MoE models consist of tightly coupled components, where a gating layer dynamically selects expert layers to solve specific tasks. This integrated design allows the scaling of model parameters, which in turn boosts accuracy, all while keeping inference costs relatively constant. Many of the leading commercial large language models (LLMs) have incorporated MoE principles to enhance performance.

However, decentralising large-scale applications across many worker nodes introduces significant inefficiencies, particularly in training and inference. These challenges arise from the inherent complexity of distributed systems, which lack the streamlined efficiency found in centralised architectures. While there are still breakthroughs to be made in this area, Gaia takes a pragmatic approach by reimagining the MoE architecture.

Instead of sticking with a tightly integrated system, Gaia employs a decoupled architecture that separates the gating network layer from the expert models. While this trade-off sacrifices some flexibility and efficiency, it significantly reduces the challenges of training massive models across distributed nodes. By decoupling the components, they achieve the following advantages:

Selective Specialisation

They can target specific tasks and use tailored expert models, which helps avoid the need for exhaustive end-to-end retraining.

Iterative Improvement

Expert models can be independently improved, updated, or replaced without the need to adjust the overarching architecture.

Enhanced Modularity

They can introduce new expert models or combinations over time, further extending Gaia’s capabilities without causing major disruptions.

Although this approach may lead to increased latency and reduced efficiency compared to traditional MoE systems, it aligns with Gaia’s vision of a scalable, distributed platform. Their primary focus is on generating actionable, high-quality outputs, rather than optimising for a centralised system. Gaia’s decoupled architecture enables continuous evolution, ensuring that the platform stays adaptable and robust as it expands to handle more complex geospatial tasks.
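The decoupling described above can be pictured with a small sketch. It assumes a hypothetical `ExpertRegistry` in which the "gate" is a plain lookup by task name rather than a learned gating layer; none of these names come from Gaia's codebase, and the expert functions are placeholders.

```python
from typing import Callable, Dict

# An expert is any callable that maps an input payload to an output payload.
Expert = Callable[[dict], dict]


class ExpertRegistry:
    """Hypothetical router that keeps experts separate from the routing logic."""

    def __init__(self) -> None:
        self._experts: Dict[str, Expert] = {}

    def register(self, task: str, expert: Expert) -> None:
        # Registering under an existing task name swaps the expert in place,
        # without touching the router or any other expert.
        self._experts[task] = expert

    def route(self, task: str, payload: dict) -> dict:
        # Routing is a dictionary lookup here, an assumption made for brevity.
        if task not in self._experts:
            raise KeyError(f"no expert registered for task {task!r}")
        return self._experts[task](payload)


registry = ExpertRegistry()
# Placeholder expert, later replaced by an "improved" placeholder.
registry.register("soil_moisture", lambda p: {"moisture": 0.0})
registry.register("soil_moisture", lambda p: {"moisture": min(1.0, p["rainfall_mm"] / 100)})

result = registry.route("soil_moisture", {"rainfall_mm": 40})
```

Because the second `register` call replaces the first expert, the swap illustrates the "Iterative Improvement" point: the router itself does not change when an expert does.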

Core Components

Gaia relies on a modular and scalable framework, with two key components at its core: the Orchestrator and the Task Class. These components work together to enable efficient coordination, execution, and modularity, ensuring that Gaia can address a wide range of geospatial challenges.

Orchestrator

The Orchestrator acts as the central hub for managing data and coordinating computational nodes. Its main purpose is to streamline operations by reducing inter-validator communication and optimising task allocation. This allows validators to focus their resources on critical functions such as scoring miner outputs and preprocessing data.

Initially, the Orchestrator is designed to:

Coordinate Tasks: Assign tasks efficiently to the appropriate nodes.

Streamline Communication: Minimise overhead by reducing unnecessary node-to-node interactions.

Interface for Organic Tasks: Provide a mechanism for submitting and retrieving outputs for tasks requested by real users or external systems.

While the Orchestrator plays a pivotal role in Gaia’s early development, they have a long-term vision to decentralise this functionality. Over time, its responsibilities will be distributed among validators, ensuring a fully decentralised system.
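As a rough illustration of these three responsibilities, the sketch below assumes a hypothetical `Orchestrator` class that assigns work round-robin and stores results for organic requests. The round-robin policy, the method names, and the node identifiers are all assumptions made for the example, not a description of Gaia's actual Orchestrator.

```python
from collections import defaultdict


class Orchestrator:
    """Hypothetical coordinator: hands out tasks and stores organic results."""

    def __init__(self) -> None:
        self._nodes = defaultdict(list)  # task type -> registered node ids
        self._next = defaultdict(int)    # task type -> round-robin counter
        self._results = {}               # request id -> output

    def register_node(self, task_type: str, node_id: str) -> None:
        self._nodes[task_type].append(node_id)

    def assign(self, task_type: str) -> str:
        # Nodes are contacted only through the orchestrator, so they never
        # need to talk to each other to agree on who takes a task.
        nodes = self._nodes.get(task_type)
        if not nodes:
            raise LookupError(f"no node registered for {task_type!r}")
        node = nodes[self._next[task_type] % len(nodes)]
        self._next[task_type] += 1
        return node

    def store_result(self, request_id: str, output: dict) -> None:
        self._results[request_id] = output

    def fetch_result(self, request_id: str):
        # Returns None while the request is still pending.
        return self._results.get(request_id)


orchestrator = Orchestrator()
orchestrator.register_node("soil_moisture", "miner-a")
orchestrator.register_node("soil_moisture", "miner-b")
assignments = [orchestrator.assign("soil_moisture") for _ in range(3)]
```

Here `assignments` alternates between the two registered nodes, and `store_result` / `fetch_result` stand in for the organic-task interface.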

Task Class

The Task Class defines the fundamental unit of work within Gaia. Each task is characterised by specific inputs, outputs, and execution processes, making it a highly modular component. Tasks can range from simple data gathering and preprocessing to advanced inference using specialised expert models.

Key Features of Tasks:

  • Inference-Driven: Many tasks involve inference steps using predefined expert models.
  • Modular Design: Tasks can be composed of one or more subtasks, arranged in a graph-like structure to handle dependencies efficiently.
  • Scoring Mechanisms: Each task includes well-defined scoring parameters to ensure output quality and reliability.

Synthetic and Organic Tasks:

  • Synthetic Tasks: These are designed for miner evaluation, training, or improvement and are crucial to the system’s self-optimisation.
  • Organic Tasks: These tasks are real-world requests from users, showcasing the practical applications of Gaia’s capabilities.
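One way to picture a task with explicit inputs, an execution step, and a scoring rule is the sketch below. The `Task` dataclass, its field names, and the toy `mean_temperature` task are hypothetical and chosen only for illustration; the scoring rule (negative absolute error) is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical unit of work with declared inputs and a scoring rule."""

    name: str
    required_inputs: List[str]
    run: Callable[[dict], dict]           # execution step: inputs -> outputs
    score: Callable[[dict, dict], float]  # (output, ground truth) -> score
    synthetic: bool = True                # False would mark an organic request

    def execute(self, inputs: dict) -> dict:
        # Reject the call early if a declared input is missing.
        missing = [key for key in self.required_inputs if key not in inputs]
        if missing:
            raise ValueError(f"{self.name}: missing inputs {missing}")
        return self.run(inputs)


mean_temperature = Task(
    name="mean_temperature",
    required_inputs=["readings"],
    run=lambda inputs: {"mean": sum(inputs["readings"]) / len(inputs["readings"])},
    score=lambda output, truth: -abs(output["mean"] - truth["mean"]),
)

output = mean_temperature.execute({"readings": [10.0, 14.0]})
quality = mean_temperature.score(output, {"mean": 13.0})
```

Declaring `required_inputs` up front is a simple stand-in for the strict input definitions that a modular task system depends on.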

Illustrative Example: Agricultural Analysis Task

To demonstrate Gaia’s task structure, consider an Agricultural Analysis Task aimed at providing actionable insights for farmers planning their crop seasons. This task integrates data from multiple sources, including ground sensors, satellite imagery, and climate models, to deliver a detailed analysis and forecast for a specific region.

Goal: Provide an actionable, user-friendly report detailing agricultural conditions and forecasts for a defined area.

Subtasks:

  • Market Analysis: Scraping and summarising market data.
  • Crop Yield Forecasting:
    • Soil Carbon Content Modelling: Supported by a cloud cover removal tool.
    • Predictive Soil Moisture Modelling: Using satellite and ground sensor data.
  • Climate Modelling:
    • Short-term runs from GraphCast models.
    • Integrations with NWS climate data.

Dependency Structure:

The subtasks are organised in a tree-like hierarchy.

  • Leaf Nodes: Solve fundamental subtasks such as data preprocessing or individual model inferences (e.g., satellite soil moisture modelling).
  • Intermediate Nodes: Combine outputs from leaf nodes into higher-order insights.
  • Trunk Node: Aggregates all results into the final output delivered to the user.

This hierarchical design ensures that dependencies are addressed systematically, maintaining data integrity and compatibility across all layers.
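A tree of this kind can be resolved with a post-order traversal, in which every child is evaluated before its parent. The sketch below is a minimal illustration under stated assumptions: the `Node` class and `evaluate` function are hypothetical, and the numeric values returned by the leaves are placeholders rather than real model outputs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical subtask node; `combine` turns child outputs into its own output."""

    name: str
    combine: Callable[[Dict[str, object]], object]
    children: List["Node"] = field(default_factory=list)


def evaluate(node: Node) -> object:
    # Post-order traversal: children resolve before their parent,
    # so each node only ever sees finished upstream outputs.
    child_outputs = {child.name: evaluate(child) for child in node.children}
    return node.combine(child_outputs)


# Leaf nodes: individual model inferences (placeholder values).
soil_moisture = Node("soil_moisture", lambda _: 0.25)
soil_carbon = Node("soil_carbon", lambda _: 0.75)
market = Node("market_analysis", lambda _: "stable")

# Intermediate node: combines leaf outputs into a higher-order insight.
crop_yield = Node(
    "crop_yield",
    lambda out: (out["soil_moisture"] + out["soil_carbon"]) / 2,
    [soil_moisture, soil_carbon],
)

# Trunk node: aggregates everything into the final report.
report = Node(
    "report",
    lambda out: f"yield index {out['crop_yield']:.2f}, market {out['market_analysis']}",
    [market, crop_yield],
)

final_output = evaluate(report)
```

If subtasks were shared between several parents, the structure would become a general dependency graph, and a topological sort would be a more natural fit than plain recursion.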

Ensuring Data Integrity and Scalability

As Gaia tackles increasingly complex tasks, the importance of having strict input/output definitions becomes even more critical. Each expert model must meet rigorous standards to ensure smooth integration. While adaptive and dynamic models are part of their long-term roadmap, Gaia’s current focus is on building robust, carefully defined task hierarchies to lay a solid foundation.

Starting Tasks

  • Dst Index Prediction Task: Space weather affects various systems on Earth, such as satellite operations, GPS accuracy, and power grid stability. Geomagnetic storms, driven by solar activity, create currents in Earth’s magnetic field that disrupt technological systems. Accurate predictions of the Disturbance Storm Time (Dst) index, which measures geomagnetic activity, are essential for mitigating the impacts of space weather events.

In this task, miners predict the Dst index one hour ahead. They receive recent Dst data, build or apply base models to generate predictions with a minimum confidence interval of 0.70, and use historical data from the current month to ensure relevance. Validators then access real-time Dst values to assess the accuracy of miner predictions.
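To make the miner and validator roles concrete, the sketch below uses a persistence forecast, which simply repeats the most recent observation. This is a common naive baseline for short-horizon forecasting, not Gaia's base model, and the absolute-error scoring is likewise an assumption. The Dst values are illustrative placeholders, not real measurements.

```python
from typing import Sequence


def persistence_forecast(recent_dst: Sequence[float]) -> float:
    """Naive baseline: predict that the next hour equals the latest reading."""
    if not recent_dst:
        raise ValueError("at least one recent Dst value is required")
    return recent_dst[-1]


def absolute_error(predicted: float, observed: float) -> float:
    """Assumed scoring rule: lower is better."""
    return abs(predicted - observed)


# Miner side: hourly Dst readings in nT (placeholder values).
recent = [-12.0, -15.0, -21.0]
prediction = persistence_forecast(recent)

# Validator side: compare against the value observed one hour later (placeholder).
error = absolute_error(prediction, observed=-25.0)
```

A miner's model is only useful if it beats a baseline like this one, which is why persistence is a convenient reference point when evaluating one-hour-ahead predictions.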

  • Soil Moisture Prediction Task: Soil moisture is a key measurement for understanding human-driven climate change, disaster risk assessment, and agricultural modelling. Using data from NASA’s SMAP (Soil Moisture Active Passive) mission, Gaia aims to predict surface and root-zone soil moisture six hours in advance.

Miners receive data such as satellite bands and weather forecast information, and then build or modify models to predict soil moisture. Validators collect real-time data to assess miner predictions after three days.

Limitations

Geospatial data often experiences latency and temporal resolution limits, which complicate the task of providing timely feedback and scores to miners. For example, the SMAP data has a typical latency of three days, and soil moisture predictions may only be scored after that delay. Miners will receive weather forecasts and predict soil moisture six hours into the future, but the score for their predictions will not be available until a few days later.
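The delay between prediction and scoring can be handled by holding predictions in a queue until ground truth is expected to exist. The sketch below assumes a hypothetical `DelayedScorer`; the three-day latency comes from the text above, while the class, the field names, and the absolute-error score are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List

GROUND_TRUTH_LATENCY = timedelta(days=3)  # typical SMAP latency, per the text


@dataclass
class PendingPrediction:
    miner_id: str
    target_time: datetime  # the moment the prediction refers to
    value: float


class DelayedScorer:
    """Hypothetical validator-side queue for predictions awaiting ground truth."""

    def __init__(self) -> None:
        self._pending: List[PendingPrediction] = []

    def submit(self, prediction: PendingPrediction) -> None:
        self._pending.append(prediction)

    def score_ready(
        self, now: datetime, ground_truth: Callable[[datetime], float]
    ) -> Dict[str, float]:
        scores: Dict[str, float] = {}
        still_pending: List[PendingPrediction] = []
        for item in self._pending:
            if now >= item.target_time + GROUND_TRUTH_LATENCY:
                # Ground truth should now exist; score by absolute error.
                scores[item.miner_id] = abs(item.value - ground_truth(item.target_time))
            else:
                still_pending.append(item)
        self._pending = still_pending
        return scores


issued = datetime(2025, 1, 1, 0, 0)
target = issued + timedelta(hours=6)  # six-hour forecast horizon
scorer = DelayedScorer()
scorer.submit(PendingPrediction("miner-a", target, 0.5))

too_early = scorer.score_ready(issued + timedelta(days=1), lambda t: 0.25)
on_time = scorer.score_ready(target + GROUND_TRUTH_LATENCY, lambda t: 0.25)
```

The first call returns no scores because the ground truth is not yet due; the second scores the prediction and removes it from the queue.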

Bounty/Incentive System

Given the ambitious scope of Gaia, a centralised approach to task creation and definition is neither practical nor desirable. Instead, they have envisioned a decentralised, community-driven system where contributors are incentivised to expand Gaia’s functionality. This is achieved through a bounty system funded by subnet owner emissions, creative adjustments to miner/validator rewards, or a combination of both.

Bounty Model

  • Tasks are defined as bounties, which encourages community contributions to create and refine base models, define inputs/outputs, and develop processing steps.
  • Contributors are rewarded in proportion to the utility of their submissions.

Royalty Mechanism

  • Contributors can receive ongoing rewards based on how often their task is called or its utility as a subtask within broader workflows.
  • This incentivises high-quality, reusable contributions.
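A usage-based royalty could be as simple as splitting a fixed pool in proportion to call counts. The sketch below shows that arithmetic only; the function name, the task names, the pool size, and the proportional rule itself are assumptions, since the text does not specify how royalties would be calculated.

```python
from typing import Dict


def royalty_shares(call_counts: Dict[str, int], pool: float) -> Dict[str, float]:
    """Split `pool` across task definitions in proportion to how often each was called."""
    total_calls = sum(call_counts.values())
    if total_calls == 0:
        # Nothing was called in this period, so nothing is paid out.
        return {name: 0.0 for name in call_counts}
    return {name: pool * count / total_calls for name, count in call_counts.items()}


# Placeholder usage figures for one reward period.
calls = {"soil_moisture": 30, "cloud_removal": 10}
shares = royalty_shares(calls, pool=100.0)
```

In this example the task called three times as often receives three times the payout. Counting calls made as a subtask of larger workflows would reward reusable building blocks in the same way.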

Governance Integration

  • Using dTAO and smart contracts, contributors can submit task definitions as pull requests to the main repository.
  • A voting mechanism, similar to Bittensor’s, allows subnet token holders to approve or modify tasks.

Roles

Miners’ responsibilities align with traditional roles in decentralised systems, focusing on:

  • Improving model outputs and task efficiency.
  • Registering for tasks that align with their expertise or interests.
  • Contributing to specific research areas.

Validators ensure the integrity and reliability of Gaia by:

  • Providing miners with necessary datasets through API integrations.
  • Storing and scoring miner outputs using real-world ground truth data as it becomes available.

Demand-Based Incentive Pools

As the number of potential tasks increases, maintaining equitable support for all tasks becomes challenging. Popular or competitive tasks may attract a disproportionate share of resources, leaving less glamorous but critical tasks under-supported. To address this, they propose a market-based system of incentive pools:

Dynamic Incentives

  • Tasks that experience low miner participation will see their share of miner rewards increase until equilibrium is restored.
  • Conversely, tasks with an oversupply of miners will have their reward share adjusted downward.

Task Complexity Adjustment

  • Tasks will vary in difficulty and computational demand, requiring a nuanced approach to reward distribution.
  • Incentive adjustments will account for these variations to ensure fairness and sustainability.

This approach introduces complexity but ensures that all tasks are adequately supported. As a conceptual framework, it will evolve with community input and real-world feedback.
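As a sketch of how such rebalancing might work, the function below nudges each task's reward share up when it has fewer miners than average and down when it has more, then renormalises so the shares sum to one. The update rule, the `rate` parameter, and the example figures are assumptions; task complexity is deliberately left out to keep the example small.

```python
from typing import Dict


def rebalance(
    shares: Dict[str, float], miners: Dict[str, int], rate: float = 0.1
) -> Dict[str, float]:
    """One rebalancing step; repeated calls keep shifting rewards toward under-served tasks."""
    mean_miners = sum(miners.values()) / len(miners)
    adjusted: Dict[str, float] = {}
    for task, share in shares.items():
        if miners[task] < mean_miners:
            direction = 1    # under-served: raise the share
        elif miners[task] > mean_miners:
            direction = -1   # over-supplied: lower the share
        else:
            direction = 0
        adjusted[task] = share * (1 + rate * direction)
    total = sum(adjusted.values())
    # Renormalise so the shares remain a valid split of the reward pool.
    return {task: value / total for task, value in adjusted.items()}


# Placeholder state: equal shares, but one task attracts three times the miners.
current_shares = {"dst_index": 0.5, "soil_moisture": 0.5}
miner_counts = {"dst_index": 30, "soil_moisture": 10}
new_shares = rebalance(current_shares, miner_counts)
```

In a live system miners would presumably migrate toward the better-paid task between steps, which is what would bring the adjustment to a halt; this sketch does not model that migration.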
