Subnet 38
Distributed Training Subnet
This subnet uses a distributed approach to train Large Language Models

SN38: Distributed training

Subnet | Description | Category | Company |
---|---|---|---|
SN38: Distributed training | Distributed training | Decentralized Training | |
The EdgeMaxxing subnet, created by WOMBO.ai, is dedicated to developing the world’s most optimized AI models for consumer devices, starting with Stable Diffusion XL on the NVIDIA GeForce RTX 4090. Over time, they plan to expand support to additional end devices, models, and modalities.
WOMBO envisions a future where artificial intelligence is decentralized, democratized, and accessible to everyone. They see this vision coming to life through a global supercomputer made up of individual user devices—laptops, gaming rigs, and smartphones. By tapping into the unused potential of these devices, WOMBO is working to create a vast decentralized network of computing power, making advanced AI technologies accessible to all. This approach aims to foster a thriving ecosystem of AI applications, driving innovation and ensuring that the benefits of AI are shared by all of humanity.
Optimizing AI models is a crucial step toward realizing the vision of decentralized AI.
- Accessibility: They are making advanced AI models capable of running on consumer devices like smartphones and laptops, bringing AI technology within everyone’s reach.
- Decentralization: By enabling millions of users to contribute their computing power, they move away from relying on a few powerful miners, creating a truly distributed AI network.
Through the optimization of popular models like LLAMA3 and Stable Diffusion, they transform idle computing resources into valuable assets for a global AI network. This approach not only democratizes AI usage and creation but also provides earning opportunities for millions.
The EdgeMaxxing subnet focuses on optimizing specific models, pipelines, and target hardware. Miners and validators work together in daily competitions to enhance AI model performance on consumer devices. Miners are rewarded based on their ranking, with greater rewards for more effective optimizations.
Scoring Process
1. Submission Collection:
- Validators gather all miner submissions daily at 12 PM New York time.
2. Benchmarking:
- Each submission is evaluated against a baseline checkpoint.
- The benchmarking process assesses the model’s speed, accuracy, and overall efficiency improvements.
3. Comparison:
- Speed Improvements: Measures how much faster the optimized model is compared to the baseline.
- Accuracy Maintenance: Ensures that the model’s accuracy remains consistent without significant degradation.
- Overall Efficiency Gains: Evaluates the combined impact of speed and accuracy enhancements.
4. Scoring:
- If the comparison fails (e.g., the model is slower or less accurate than the baseline), the miner is assigned a score of 0.0.
- If the comparison succeeds, the miner’s score is determined by the difference in average time and its similarity to the baseline.
- The score is adjusted by a sequence ratio to ensure fair reward distribution.
5. Reward Distribution:
- 1st Place: Receives 80% of the total reward pool.
- 2nd Place: Receives 16% of the total reward pool.
- 3rd Place: Receives 3.2% of the total reward pool.
- All Other Participants: Share the remaining 0.8% of the pool equally.
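The reward schedule above can be sketched in a few lines of Python. This is a toy illustration of the stated percentages only; `distribute_rewards` is a hypothetical helper, not actual subnet code:

```python
def distribute_rewards(ranked, pool=1.0):
    """Split a reward pool per the stated schedule: 80% / 16% / 3.2%
    to the top three miners, remainder shared equally by the rest.
    `ranked` is a list of miner IDs ordered best-first (hypothetical)."""
    top_shares = [0.80, 0.16, 0.032]
    payouts = {}
    for miner, share in zip(ranked[:3], top_shares):
        payouts[miner] = share * pool
    # All other participants split the remaining 0.8% equally.
    others = ranked[3:]
    if others:
        leftover = pool - sum(payouts.values())
        for miner in others:
            payouts[miner] = leftover / len(others)
    return payouts
```

With five miners, for example, the fourth and fifth each receive half of the leftover 0.8%, and the payouts sum to the full pool.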
Optimization Proposals
When optimizing machine learning models for edge devices, several effective techniques can significantly enhance performance. Here are some key approaches to consider:
Knowledge Distillation: This technique involves training a smaller, more efficient model to replicate the behavior of a larger, more complex one. It’s particularly useful for deploying models on devices with limited computational resources, where maintaining performance is crucial despite reduced model size.
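The core of knowledge distillation is a loss that pushes the student’s softened output distribution toward the teacher’s. A minimal, framework-free sketch of that soft-target term (the function names and temperature value are illustrative, not from the subnet):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs — the
    standard soft-target term used in distillation training."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q)
    )
```

In practice this term is added to the usual hard-label loss and minimized over the student’s parameters; the loss is zero only when the student matches the teacher’s softened distribution.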
Quantization: This method reduces the precision of a model’s weights and activations, typically from 32-bit floating-point to 8-bit integers. This reduction lowers memory usage and computational demands, making it feasible to run models on edge devices. Additionally, using low-precision representations for weights, such as 8-bit integers, can decrease memory bandwidth usage for memory-bound models, even when computations are performed in higher precision, like 32-bit.
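A minimal sketch of the 32-bit-float-to-int8 mapping described above, using symmetric per-tensor quantization with a single scale factor (the function names and example weights are illustrative):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats into
    [-127, 127] using one scale factor derived from the largest weight."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; error is at most half a step.
    return [qi * scale for qi in q]

weights = [0.42, -1.27, 0.05, 0.88]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight now occupies 1 byte instead of 4, which is where the memory and bandwidth savings come from; real deployments typically use per-channel scales and calibration data to control the accuracy loss.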
TensorRT and Hardware-Specific Optimizations: Leveraging NVIDIA’s TensorRT can optimize deep learning models for inference on NVIDIA GPUs. This process goes beyond simple layer fusion, incorporating assembly-level optimizations, identifying prefetch opportunities, optimizing L2 memory allocation, writing specialized kernels, and performing graph optimizations. These steps tailor the model to specific hardware configurations, enhancing performance and reducing latency.
Hyperparameter Tuning: Adjusting the model’s configuration settings can lead to significant performance improvements. This tuning can be done manually or through automated methods such as grid search or Bayesian optimization. While not a direct edge optimization, hyperparameter tuning is a critical step in the overall model optimization process.
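Grid search, the simplest of the automated methods mentioned above, can be sketched as an exhaustive loop over every combination of hyperparameter values. The `train_eval` callback is a hypothetical stand-in for whatever trains and scores a model under one configuration:

```python
from itertools import product

def grid_search(train_eval, grid):
    """Try every combination in `grid` (a dict of parameter name ->
    list of candidate values) and return the best-scoring configuration.
    `train_eval(cfg)` must return a score where higher is better."""
    best_score, best_cfg = float("-inf"), None
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = train_eval(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```

The cost grows multiplicatively with each added parameter, which is why Bayesian optimization is usually preferred once the search space gets large.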
Developers are encouraged to explore these optimization techniques or innovate new methods to enhance model performance and efficiency, particularly for edge devices.
WOMBO is recognized as one of the world’s leading consumer AI companies and an early advocate of generative AI. They’ve launched two #1 apps, WOMBO and Dream, which have collectively been downloaded over 200 million times and topped app store charts in over 100 countries.
These achievements have been made possible by harnessing the immense capabilities of cutting-edge generative AI techniques and the power of open-source AI. WOMBO’s unique perspective on this research, viewed through the lens of consumer entertainment, allows them to create products that people love to use and share.
They are just at the beginning of the Synthetic Media Revolution, a movement set to transform how people create, consume, and distribute content. WOMBO is building the apps and infrastructure to drive this change and bring the potential of AI-powered entertainment to the masses.
Ben-Zion Benkhin – Creator of WOMBO
Vivek Bhakta – Co-Founder
Parshant Loungani – Co-Founder & Head of AI
Angad Arneja – Co-Founder and Strategic Advisor
Salman Shahid – Machine Learning Engineer
Talha A. – Head of Product
Samantha Pierre – Data Lead and Analytics Engineer
Abhinav Ramana – Senior Software Engineer
Maxime Peabody – Senior Software Engineer
Derek Lance – Software Engineer
Kevin Brace – Social Media
Devesh Shetty – Founding Engineer
Nir Kabessa – WOMBO Advisor