This subnet is dedicated to advancing Retrieval-Augmented Generation (RAG) by incentivizing the development and provision of advanced chunking solutions. Its goal is to create, host, and deploy an intelligent chunking system that maximizes similarity within chunks and dissimilarity between chunks.
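
To make that objective concrete, the sketch below scores a candidate chunking as mean intra-chunk similarity minus mean inter-chunk similarity over sentence embeddings. This is an illustrative formalization only, not the subnet's actual scoring code; `chunk_quality` and its inputs are hypothetical names.

```python
# Illustrative only -- not the subnet's scoring code. One way to formalize
# "high similarity within chunks, low similarity between chunks", given the
# embedding vector of every sentence in every chunk.
from itertools import combinations
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def chunk_quality(chunks: list[list[np.ndarray]]) -> float:
    """chunks[i] holds the sentence embeddings of chunk i. Returns mean
    intra-chunk similarity minus mean inter-chunk similarity; higher is
    better under the objective stated above."""
    intra = [cosine(a, b) for sents in chunks for a, b in combinations(sents, 2)]
    inter = [cosine(a, b) for s1, s2 in combinations(chunks, 2) for a in s1 for b in s2]
    if not intra or not inter:  # degenerate chunkings get a neutral score
        return 0.0
    return float(np.mean(intra) - np.mean(inter))
```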

VectorChat is building a vertically integrated solution: as both a consumer and a leading provider of intelligent Retrieval-Augmented Generation, it creates the full demand loop for the Chunking subnet.

Chunking involves breaking data into smaller, manageable “chunks” to facilitate processing and analysis. The technique is crucial in natural language processing (NLP) and especially beneficial for large language models (LLMs). Examples include dividing an article into sections, a screenplay into scenes, or a musical recording into movements.
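
As a minimal illustration, the sketch below groups an article's sentences into chunks capped at a character budget. `naive_chunk` is a hypothetical helper; production systems would count tokens and respect semantic boundaries instead.

```python
# A minimal sketch of the simplest kind of chunking: pack sentences into
# chunks of at most `max_chars` characters, never splitting a sentence.
import re

def naive_chunk(text: str, max_chars: int = 500) -> list[str]:
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        # Start a new chunk when adding this sentence would exceed the budget.
        # (A single sentence longer than the budget becomes its own chunk.)
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```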

Why Use Chunking?

For LLMs to deliver accurate responses, they must have access to the relevant information. When that information lies outside the model’s training data, it must be included in the query to avoid inaccuracies. Including the entire dataset with every query, however, is impractical due to inference costs.

Chunking addresses this by dividing data into smaller, meaningful chunks, which are then converted into vectors and stored in a vector database. When a query is made, it is also converted into a vector, and the system retrieves the most relevant chunks from the database. This approach allows the model to process only the pertinent sections of text, significantly reducing the number of tokens processed per query.
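
Here is a minimal sketch of that retrieval step, assuming a toy hashed bag-of-words embedding in place of a real embedding model and a plain array in place of a vector database; `embed` and `top_k_chunks` are illustrative names, not any particular library's API.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy embedding: hashed bag-of-words, normalized to unit length.
    A real system would use a trained embedding model instead."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def top_k_chunks(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    index = np.stack([embed(c) for c in chunks])  # stands in for the vector DB
    scores = index @ embed(query)                 # cosine similarity (unit vectors)
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]
```

Only the returned chunks are then passed to the LLM alongside the query, which is what keeps the token count per query small.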

Benefits of Chunking

Chunking makes querying more efficient and cost-effective by restricting each query to the relevant portions of text. It is a vital step in many machine learning tasks that involve large datasets, such as:

  • Retrieval-Augmented Generation (RAG): By maintaining a database of relevant documents, RAG provides LLMs with the necessary context for accurate query processing. Effective chunking ensures that only the most relevant texts are included, enhancing response quality.
  • Classification: Chunking helps in organizing texts into similar sections for better classification and labeling, improving task accuracy and efficiency.
  • Semantic Search: Improved chunking enhances semantic search algorithms, which focus on meaning rather than simple keyword matching, resulting in more accurate and reliable search results.

What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG) is a framework that enhances a generative AI application’s large language model (LLM) by supplying it with the most relevant and contextually important proprietary, private, or dynamic data at query time, improving the model’s accuracy and effectiveness across tasks.

VectorChat is dedicated to delivering the most immersive conversational AI experience. Their upcoming platform, Toffee, utilizes Retrieval-Augmented Generation (RAG) to provide users with expansive memory, extended conversation lengths, and domain-specific knowledge.

In developing Toffee, they found that while RAG itself has seen many improvements, chunking solutions lagged significantly behind. Existing methods were either too simplistic or too resource-intensive, making the RAG pipeline either less accurate or more costly. Traditional approaches (e.g., chunking every X tokens with Y tokens of overlap, as sketched below) were cheap to compute but inflated runtime costs by injecting unnecessary context into LLM queries, while advanced semantic chunking solutions like Unstructured.io were prohibitively expensive and slow, forcing limits on how many files users could upload.
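
For reference, the traditional baseline mentioned above takes only a few lines; this is a generic sketch, not VectorChat's code, and it tokenizes on whitespace where a real pipeline would use the LLM's tokenizer.

```python
def fixed_size_chunks(text: str, x: int = 256, y: int = 32) -> list[str]:
    """Split `text` into chunks of x tokens, each overlapping the previous
    chunk by y tokens. Whitespace tokenization for simplicity."""
    assert x > y >= 0
    tokens = text.split()
    return [" ".join(tokens[i:i + x]) for i in range(0, len(tokens), x - y)]
```

Because the boundaries ignore meaning, each retrieved chunk tends to carry extra, irrelevant context, which is exactly the runtime overhead described above.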

To address these issues and realize Toffee’s vision, VectorChat’s team created an algorithm that outperforms current industry solutions. Rather than keeping such models proprietary, they are capitalizing on the underdeveloped state of the field: the documentation provided includes all the information needed to develop solutions that match or exceed their current model.

Their aim is to reduce costs, enhance accuracy, and unlock new possibilities. As LLMs expand to diverse data modalities (e.g., audio, images, video), intelligent chunking becomes increasingly crucial.

The subnet is designed with a clear, transparent, and fair incentive system to push beyond current achievements. They are eager to see how miners will advance this technology.