Subnet 32
It’s AI
Subnet 32 detects AI-generated content, providing tools to verify text authenticity and support diverse applications.

SN32 : It’s AI
Subnet | Description | Category | Company |
---|---|---|---|
SN32 : It’s AI | Inference verification & optimization | Generative AI | Manifold |
Subnet 32 focuses on detecting AI-generated content amid the rapid growth of Large Language Models (LLMs): by one estimate, ChatGPT alone produces around 100 billion words per day, against roughly 100 trillion produced by humans. As AI-generated text becomes ubiquitous, accurately discerning its origin is increasingly crucial.
To address this challenge, the team has developed a subnet that incentivizes distributed solutions for identifying LLM-generated content. This includes defining incentive mechanisms and validation processes, and establishing a baseline model for miners.
Subnet 32 offers a front end that determines whether input text is AI-generated or human-authored, a valuable tool for verifying data authenticity as large language models spread across applications. Beyond straightforward AI detection, the subnet serves a range of users: it can help ML engineers filter data for model training and help educators detect AI-generated student work.
This Subnet is crucial in several scenarios. In schools, teachers need to distinguish between student-completed assignments and those done by AI. Bloggers and social media users aim to maintain authentic comment sections, free from AI-generated spam. Companies rely on identifying genuine job applications over AI-generated ones. Additionally, in more critical contexts, this technology aids in detecting fraudulent emails from scammers.
The project team tested the system and found that it correctly identifies AI-written text roughly 85% of the time, while only rarely mislabeling human-written text as AI. This marks significant progress toward preserving authenticity online even as AI writing capabilities advance.
Validation Mechanism and Miner Evolution
The subnet secures its validation process by perturbing reference data. Validators slightly modify texts drawn from a database of 18 million human-authored samples, so miners cannot derive correct responses simply by looking up existing data. Bittensor previously faced challenges in Subnet 1, where miners manipulated model responses to exploit computational resources. Subnet 32 avoids this by having validators alter the human-generated texts, making it difficult for miners to train on predetermined text.
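A minimal sketch of this kind of text perturbation, assuming simple character-level edits (the subnet's actual augmentations are not specified in detail; the function and rates below are illustrative):

```python
import random

def augment(text, rate=0.02, seed=None):
    """Lightly perturb a human-written text so it no longer matches any
    stored copy. Character swaps and drops are illustrative stand-ins
    for the subnet's unspecified augmentations."""
    rng = random.Random(seed)
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        r = rng.random()
        if r < rate and i + 1 < len(chars):
            # typo-style edit: swap two adjacent characters
            out.append(chars[i + 1])
            out.append(chars[i])
            i += 2
        elif r < 2 * rate:
            # typo-style edit: drop a character
            i += 1
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)
```

Even small perturbations like these break exact-match lookups against the source database while leaving the text recognizably human-written.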
Early miners raised concerns that language-model-based reward models could limit miner evolution, since fine-tuning responses solely against an existing scoring model could hinder progress. Subnet 32 avoids this by scoring against known ground truth (each text is verifiably human- or AI-generated) rather than a learned scoring mechanism.
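Scoring against known labels can be sketched as a simple agreement score; the function and formula below are a hypothetical illustration, not the subnet's actual reward code:

```python
def score_miner(predictions, labels):
    """Score a miner by agreement with ground-truth labels
    (True = AI-generated). Plain accuracy is used here for
    illustration; the subnet's exact reward formula is not given."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must align")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Because the labels are ground truth rather than another model's opinion, miners cannot game the score by overfitting to a scoring model's quirks; they can only improve by detecting AI text better.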
AI Detection
The subnet’s AI detection tool is significant in educational settings, where it helps teachers identify AI-generated submissions and curb academic dishonesty. It can also filter automated comments on social media, preventing attention-seeking bots from drowning out authentic interactions. The idea of using Bittensor for AI detection came from a team member who had followed Bittensor’s rapid growth and had expertise in machine learning; the team wanted an AI-adjacent tool that could be extended and draw on substantial resources, making it a natural application for the Bittensor network. Initial discussions settled on detecting whether text was written by a human, matching the growing demand for such capabilities as large language models spread.
Text Authenticity
The project assesses text authenticity using likelihood calculations: a language model evaluates the probability of each word in a sequence given the words before it, yielding a perplexity (PPL) value that indicates how predictable the text is to the model. A raw perplexity threshold alone is insufficient, since it is neither normalized nor accurate enough to give a reliable probability that text is AI-generated rather than human-written. Instead, the system compares actual and predicted words across the sequence and, through loss calculations, assesses how likely each word is to be AI-generated given its predecessors.
Text is split into chunks, a loss is computed within each chunk, and these chunk losses are fed into a linear model that outputs a normalized, robust probability that the text is AI-generated.
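A toy version of this chunked-loss pipeline, assuming per-token probabilities supplied by some language model; the chunk size and the linear model's weight and bias are illustrative placeholders, not the subnet's fitted parameters:

```python
import math

def chunk_losses(token_probs, chunk_size=4):
    """Average negative log-likelihood (cross-entropy loss) per chunk.
    token_probs: the model-assigned probability of each actual token
    given its predecessors, from a hypothetical language model."""
    losses = []
    for i in range(0, len(token_probs), chunk_size):
        chunk = token_probs[i:i + chunk_size]
        losses.append(-sum(math.log(p) for p in chunk) / len(chunk))
    return losses

def ai_probability(losses, weight=-1.5, bias=2.0):
    """Map mean chunk loss to a normalized probability with a logistic
    (linear-in-loss) model. Low loss means the text was predictable to
    the model, which points toward AI generation."""
    mean_loss = sum(losses) / len(losses)
    z = weight * mean_loss + bias   # linear model over the loss
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid -> probability in [0, 1]
```

The sigmoid step is what supplies the normalization the raw threshold lacks: whatever the loss scale, the output is a calibrated-looking probability between 0 and 1.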
Use of the Ollama Tool
The subnet utilizes Ollama as an aggregator for large language models, serving optimized builds for faster performance. Ollama gives easy access to over 30,000 optimized language models, making it a valuable tool for prompt generation.
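Assuming the tool in question is Ollama, it exposes a local REST endpoint (`/api/generate`, default port 11434) for running pulled models. The sketch below only builds the request; actually sending it requires a running Ollama server, and the model name is an example, not necessarily what the subnet uses:

```python
import json

def build_generate_request(model, prompt):
    """Build a request for Ollama's local /api/generate endpoint.
    The model must already be pulled (e.g. `ollama pull llama3`)
    before the request will succeed."""
    return {
        "url": "http://localhost:11434/api/generate",
        "body": json.dumps({"model": model, "prompt": prompt, "stream": False}),
    }

# Sending it (requires a local Ollama server):
# import requests
# req = build_generate_request("llama3", "Write a short paragraph about autumn.")
# text = requests.post(req["url"], data=req["body"]).json()["response"]
```

With `stream` set to `False`, the server returns the full generated text in a single JSON response, which is convenient for batch prompt generation.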
Validator and Miner Roles
Validators generate text prompts and human reference data for miners, drawing on distinct data sets to ensure authenticity and to prevent miners from training on stored text.
Miners assess each text’s authenticity, leveraging advanced AI models to discern AI-generated content from human-written text. To stop miners from memorizing the open data set, validators add text augmentations such as misspellings and other alterations, ensuring each sample is unique. The subnet motivates miners through competition: miners pursue improvements that identify AI-generated text more effectively. Baseline models are provided, and miners can use them as-is or improve on them with machine-learning techniques to raise identification accuracy.
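A baseline miner loop might look like the following sketch, where `score_fn` stands in for whatever detector model a miner runs; the toy scorer and threshold are illustrative only, not the subnet's baseline model:

```python
def classify_batch(texts, score_fn, threshold=0.5):
    """Baseline miner loop: score each text with a detector and
    return True where it looks AI-generated."""
    return [score_fn(t) >= threshold for t in texts]

def toy_score(text):
    """Toy stand-in detector: flags highly repetitive, low-variety
    text. A real miner would use a trained model instead."""
    words = text.lower().split()
    return 1.0 - len(set(words)) / max(len(words), 1)
```

Miners compete by swapping in better `score_fn` implementations, either improving the provided baseline model or training their own.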
While some team members still hold other jobs, most are dedicated full-time to building and enhancing the subnet. The team comprises individuals specializing in various areas such as data science, crypto, and business, each contributing their expertise to the project.