A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
Updated Mar 27, 2025 - Python
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows
A sentence splitting (sentence boundary disambiguation) library for Go. It is rule-based and works out-of-the-box.
JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Spring Boot applications.
An exploration of text splitting and chunking in JavaScript
A web app that allows users to upload PDFs and interact with them through a Q&A interface. The application extracts text from PDFs, generates embeddings, stores them in a FAISS database, and retrieves relevant information to provide context-aware answers using a large language model.
Text splitting example using Tiktoken
LangChain is a framework that makes it easy to build applications on top of available large language models.
Specialized Markdown text splitter - part of the LEDAA project's data ingestion pipeline for RAG.
An experiment in learning LangChain, Pinecone, and related tools; don't mind it
Matching strings between lists based on length