The OMEGA Labs Bittensor subnet is an initiative focused on creating the world’s largest decentralized multimodal dataset to advance research and development in Artificial General Intelligence (AGI). Its mission is to democratize access to an extensive and diverse dataset that encompasses human knowledge and creativity, empowering researchers and developers to push the boundaries of AGI.

Harnessing the Bittensor network and a global community of miners and validators, OMEGA Labs is constructing a dataset that exceeds the scale and diversity of existing resources. Featuring over 1 million hours of footage and 30 million+ 2-minute video clips, the OMEGA Labs dataset is intended to facilitate the development of robust AGI models and drive transformation across multiple industries.

Key Features

  • Unmatched Scale and Variety: Over 1 million hours of footage and 30 million video clips spanning 50+ scenarios and 15,000+ action phrases.
  • Latent Representations: Utilizing cutting-edge models to translate video components into a unified latent space for efficient processing.
  • Incentivized Data Collection: Rewarding miners for contributing high-quality, diverse, and innovative videos through a decentralized network.
  • Empowering Digital Agents: Facilitating the development of intelligent agents capable of navigating complex workflows and supporting users across platforms.
  • Immersive Gaming Experiences: Enabling the creation of realistic gaming environments with rich physics and interactive elements.

Miner

Conducts searches on YouTube and retrieves up to 8 videos per query. Specifies a clip range (up to 2 minutes) and provides a caption consisting of the video title, tags, and description. Obtains ImageBind embeddings for the video, audio, and caption components. Returns the video ID, caption, ImageBind embeddings (video, audio, caption), and the clip's start and end times.
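For context, here is a minimal sketch of the embedding step, following the usage pattern documented in the facebookresearch/ImageBind repository. The clip paths and the caption string are placeholders, and the sketch assumes the clip and its audio track have already been downloaded and trimmed elsewhere.

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained ImageBind model once and reuse it across clips.
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

# Placeholder inputs: a clip of at most 2 minutes extracted from the
# YouTube video, its audio track, and the concatenated caption.
caption = "video title | tags | description"
inputs = {
    ModalityType.TEXT: data.load_and_transform_text([caption], device),
    ModalityType.VISION: data.load_and_transform_video_data(["clip.mp4"], device),
    ModalityType.AUDIO: data.load_and_transform_audio_data(["clip.wav"], device),
}

with torch.no_grad():
    embeddings = model(inputs)  # dict mapping each modality to its embedding tensor

video_emb = embeddings[ModalityType.VISION]
audio_emb = embeddings[ModalityType.AUDIO]
caption_emb = embeddings[ModalityType.TEXT]
```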

Validator

Randomly selects one video from each miner's submission for validation. Recomputes ImageBind embeddings for all modalities (video, audio, caption) of the selected video and compares them with the embeddings the miner submitted to ensure consistency. If they match, all eight videos from that miner are assumed valid. Each video is then scored on relevance, novelty, and detail richness:

Relevance: Cosine similarity between the topic embedding and the embedding of each of the eight videos.

Novelty: Computed as 1 − similarity to the closest video already in the Pinecone index.

Detail Richness: Cosine similarity between the caption (text) embedding and the video embedding.
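To make these criteria and the consistency check concrete, here is a minimal sketch using plain cosine similarity in NumPy. The function names and the 0.99 consistency threshold are illustrative assumptions, not values taken from the subnet's code, and the list of neighbor embeddings stands in for a Pinecone query result.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def embeddings_consistent(submitted: np.ndarray, recomputed: np.ndarray,
                          threshold: float = 0.99) -> bool:
    # Spot-check: the validator recomputes the embeddings for one randomly
    # chosen video and requires them to closely match the miner's values.
    # The 0.99 threshold is an assumed value for illustration.
    return cosine_similarity(submitted, recomputed) >= threshold

def relevance(topic_emb: np.ndarray, video_emb: np.ndarray) -> float:
    # Cosine similarity between the topic embedding and a video embedding.
    return cosine_similarity(topic_emb, video_emb)

def novelty(video_emb: np.ndarray, index_embs: list[np.ndarray]) -> float:
    # 1 - similarity to the closest video already in the Pinecone index
    # (index_embs stands in for the nearest neighbors Pinecone returns).
    closest = max(cosine_similarity(video_emb, e) for e in index_embs)
    return 1.0 - closest

def detail_richness(caption_emb: np.ndarray, video_emb: np.ndarray) -> float:
    # Similarity between the caption (text) and video embeddings.
    return cosine_similarity(caption_emb, video_emb)
```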

After scoring, the validator accumulates 1,024 validated video entries and submits them to Hugging Face as a single concatenated file. It tightens the accumulation limit if a miner submits too frequently, and submits any remaining validated entries if the API shuts down.

Ben-Zion Benkhin – Founder and CEO

Ben founded WOMBO in 2020, aiming to simplify cutting-edge technology for everyday use.

Salman Shahid – Machine Learning Engineer

Salman has been fascinated by autonomous AI since a young age, viewing it as a tool for exploring groundbreaking concepts.

Parshant Loungani – Founder and Head of AI

Parshant, with a physics background, transitioned into AI due to its potential for innovation, leading to the creation of the successful WOMBO app.