Subnet 38
Distributed Training Subnet
This subnet uses a distributed approach to train Large Language Models

SN38: Distributed training

Subnet | Description | Category | Company |
---|---|---|---|
SN38: Distributed training | Distributed training | Decentralized Training | |
The EdgeMaxxing subnet, created by WOMBO.ai, is dedicated to developing the world’s most optimized AI models for consumer devices, starting with Stable Diffusion XL on the NVIDIA GeForce RTX 4090. Over time, they plan to expand support to additional end devices, models, and modalities.
WOMBO envisions a future where artificial intelligence is decentralized, democratized, and accessible to everyone. They see this vision coming to life through a global supercomputer made up of individual user devices—laptops, gaming rigs, and smartphones. By tapping into the unused potential of these devices, WOMBO is working to create a vast decentralized network of computing power, making advanced AI technologies accessible to all. This approach aims to foster a thriving ecosystem of AI applications, driving innovation and ensuring that the benefits of AI are shared by all of humanity.
Optimizing AI models is a crucial step toward realizing the vision of decentralized AI.
- Accessibility: They are making advanced AI models capable of running on consumer devices like smartphones and laptops, bringing AI technology within everyone’s reach.
- Decentralization: By enabling millions of users to contribute their computing power, they move away from relying on a few powerful miners, creating a truly distributed AI network.
Through the optimization of popular models like LLAMA3 and Stable Diffusion, they transform idle computing resources into valuable assets for a global AI network. This approach not only democratizes AI usage and creation but also provides earning opportunities for millions.
The EdgeMaxxing subnet focuses on optimizing specific models, pipelines, and target hardware. Miners and validators work together in daily competitions to enhance AI model performance on consumer devices. Miners are rewarded based on their ranking, with greater rewards for more effective optimizations.
Scoring Process
1. Submission Collection:
- Validators gather all miner submissions daily at 12 PM New York time.
2. Benchmarking:
- Each submission is evaluated against a baseline checkpoint.
- The benchmarking process assesses the model’s speed, accuracy, and overall efficiency improvements.
3. Comparison:
- Speed Improvements: Measures how much faster the optimized model is compared to the baseline.
- Accuracy Maintenance: Ensures that the model’s accuracy remains consistent without significant degradation.
- Overall Efficiency Gains: Evaluates the combined impact of speed and accuracy enhancements.
4. Scoring:
- If the comparison fails (e.g., the model is slower or less accurate than the baseline), the miner is assigned a score of 0.0.
- If the comparison succeeds, the miner’s score is determined by the difference in average time and its similarity to the baseline.
- The score is adjusted by a sequence ratio to ensure fair reward distribution.
5. Reward Distribution:
- 1st Place: Receives 80% of the total reward pool.
- 2nd Place: Receives 16% of the total reward pool.
- 3rd Place: Receives 3.2% of the total reward pool.
- All Other Participants: Share the remaining 0.8% of the pool equally.
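The reward schedule above can be sketched in a few lines of Python. This is a toy illustration of the stated percentages only; `distribute_rewards` is a hypothetical helper, not actual subnet code:

```python
def distribute_rewards(ranked, pool=1.0):
    """Split a reward pool per the stated schedule: 80% / 16% / 3.2%
    to the top three miners, remainder shared equally by the rest.
    `ranked` is a list of miner IDs ordered best-first (hypothetical)."""
    top_shares = [0.80, 0.16, 0.032]
    payouts = {}
    for miner, share in zip(ranked[:3], top_shares):
        payouts[miner] = share * pool
    # All other participants split the remaining 0.8% equally.
    others = ranked[3:]
    if others:
        leftover = pool - sum(payouts.values())
        for miner in others:
            payouts[miner] = leftover / len(others)
    return payouts
```

With five miners, for example, the fourth and fifth each receive half of the leftover 0.8%, and the payouts sum to the full pool.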
Optimization Proposals
When optimizing machine learning models for edge devices, several effective techniques can significantly enhance performance. Here are some key approaches to consider:
Knowledge Distillation: This technique involves training a smaller, more efficient model to replicate the behavior of a larger, more complex one. It’s particularly useful for deploying models on devices with limited computational resources, where maintaining performance is crucial despite reduced model size.
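The core of knowledge distillation is a loss that pushes the student’s softened output distribution toward the teacher’s. A minimal, framework-free sketch of that soft-target term (the function names and temperature value are illustrative, not from the subnet):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature softens the distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened outputs — the
    standard soft-target term used in distillation training."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q)
    )
```

In practice this term is added to the usual hard-label loss and minimized over the student’s parameters; the loss is zero only when the student matches the teacher’s softened distribution.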
Quantization: This method reduces the precision of a model’s weights and activations, typically from 32-bit floating-point to 8-bit integers. This reduction lowers memory usage and computational demands, making it feasible to run models on edge devices. Additionally, using low-precision representations for weights, such as 8-bit integers, can decrease memory bandwidth usage for memory-bound models, even when computations are performed in higher precision, like 32-bit.
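A minimal sketch of the 32-bit-float-to-int8 mapping described above, using symmetric per-tensor quantization with a single scale factor (the function names and example weights are illustrative):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats into
    [-127, 127] using one scale factor derived from the largest weight."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; error is at most half a step.
    return [qi * scale for qi in q]

weights = [0.42, -1.27, 0.05, 0.88]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight now occupies 1 byte instead of 4, which is where the memory and bandwidth savings come from; real deployments typically use per-channel scales and calibration data to control the accuracy loss.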
TensorRT and Hardware-Specific Optimizations: Leveraging NVIDIA’s TensorRT can optimize deep learning models for inference on NVIDIA GPUs. This process goes beyond simple layer fusion, incorporating assembly-level optimizations, identifying prefetch opportunities, optimizing L2 memory allocation, writing specialized kernels, and performing graph optimizations. These steps tailor the model to specific hardware configurations, enhancing performance and reducing latency.
Hyperparameter Tuning: Adjusting the model’s configuration settings can lead to significant performance improvements. This tuning can be done manually or through automated methods such as grid search or Bayesian optimization. While not a direct edge optimization, hyperparameter tuning is a critical step in the overall model optimization process.
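Grid search, the simplest of the automated methods mentioned above, can be sketched as an exhaustive loop over every combination of hyperparameter values. The `train_eval` callback is a hypothetical stand-in for whatever trains and scores a model under one configuration:

```python
from itertools import product

def grid_search(train_eval, grid):
    """Try every combination in `grid` (a dict of parameter name ->
    list of candidate values) and return the best-scoring configuration.
    `train_eval(cfg)` must return a score where higher is better."""
    best_score, best_cfg = float("-inf"), None
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = train_eval(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```

The cost grows multiplicatively with each added parameter, which is why Bayesian optimization is usually preferred once the search space gets large.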
Developers are encouraged to explore these optimization techniques or innovate new methods to enhance model performance and efficiency, particularly for edge devices.
WOMBO is recognized as one of the world’s leading consumer AI companies and an early advocate of generative AI. They’ve launched two #1 apps, WOMBO and Dream, which have collectively been downloaded over 200 million times and topped app store charts in over 100 countries.
These achievements have been made possible by harnessing the immense capabilities of cutting-edge generative AI techniques and the power of open-source AI. WOMBO’s unique perspective on this research, viewed through the lens of consumer entertainment, allows them to create products that people love to use and share.
They are just at the beginning of the Synthetic Media Revolution, a movement set to transform how people create, consume, and distribute content. WOMBO is building the apps and infrastructure to drive this change and bring the potential of AI-powered entertainment to the masses.
Ben-Zion Benkhin – Creator of WOMBO
Vivek Bhakta – Co-Founder
Parshant Loungani – Co-Founder & Head of AI
Angad Arneja – Co-Founder and Strategic Advisor
Salman Shahid – Machine Learning Engineer
Talha A. – Head of Product
Samantha Pierre – Data Lead and Analytics Engineer
Abhinav Ramana – Senior Software Engineer
Maxime Peabody – Senior Software Engineer
Derek Lance – Software Engineer
Kevin Brace – Social Media
Devesh Shetty – Founding Engineer
Nir Kabessa – WOMBO Advisor