RAG Text Chunker & Overlap Calculator (Free LangChain Alternative)


Simulate LangChain’s RecursiveCharacterTextSplitter directly in your browser.

If you are building a custom AI Chatbot to answer questions based on your company’s internal PDFs and databases, you are building a RAG (Retrieval-Augmented Generation) pipeline.

One of the biggest scams in the AI industry right now is “Managed RAG Platforms” charging B2B companies $100 to $500 a month just to process their documents. You do not need to pay for these wrappers. You can process, chunk, and embed your documents for free directly inside n8n or Make.com.

Use the visualizer tool above to simulate how the industry-standard RecursiveCharacterTextSplitter algorithm works, and copy the exact JavaScript required to build it into your own iPaaS platform.

Why You Can’t Just Send a 50-Page PDF to Pinecone or OpenAI

When an employee asks an AI a question, you want the AI to search your database, find the correct paragraph in your training manual, and generate an answer.

If you just extract the text from a 50-page PDF and try to push the entire 20,000-word string into a Vector Database (like Pinecone or Qdrant) as a single file, two things will happen:

  1. The Embedding API will reject the request: OpenAI’s embedding model (text-embedding-3-small) has a hard input limit of 8,191 tokens per API call. It simply cannot process 50 pages at once.
  2. The AI will hallucinate: Vector databases search by semantic meaning, and a single embedding has to represent the entire document. If that document mixes HR policies, IT routing, and Sales commissions, its embedding becomes a blurry average of all three topics. It matches no specific question well, so retrieval returns irrelevant context and the model guesses.

What is Chunking and Chunk Overlap?

To solve this, developers use a process called Chunking. Before sending the document to the database, you slice the massive wall of text into small, bite-sized blocks (usually around 1,000 characters each).

The Problem with Hard Cuts

If you blindly chop a document every 1,000 characters, you will inevitably slice a sentence in half.

  • Chunk 1 ends with: “The most important rule for the sales team is to alway…”
  • Chunk 2 begins with: “…s log their calls in the CRM before Friday.”

If an employee asks the AI, “What is the most important rule for the sales team?”, the AI will find Chunk 1, but it won’t know the answer because the context was cut off.
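A naive splitter that slices every N characters reproduces exactly this failure. A minimal sketch (the 54-character chunk size here is artificially small, chosen only to recreate the example above):

```javascript
// Naive hard-cut chunking: slice every chunkSize characters, no overlap.
function hardCutChunks(text, chunkSize) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}

const doc =
  "The most important rule for the sales team is to always " +
  "log their calls in the CRM before Friday.";
const chunks = hardCutChunks(doc, 54);
console.log(chunks[0]); // ends mid-word: "…is to alway"
console.log(chunks[1]); // orphaned tail: "s log their calls…"
```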

The Solution: Chunk Overlap

To preserve context, we introduce Overlap. If your chunk size is 1,000 characters, and your overlap is 100 characters, Chunk 2 will step back and include the last 100 characters of Chunk 1. (You can see this happening in real-time in the visualizer above—the yellow highlighted text represents the overlap).

By overlapping the text, you ensure that no single idea, sentence, or paragraph is ever lost in the transition between chunks.
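That stepping-back behavior is just arithmetic: each new chunk starts chunkSize − chunkOverlap characters after the previous one. A character-level sketch of the idea (a simplification of the real recursive splitter, which also prefers to break on paragraph and sentence boundaries):

```javascript
// Sliding-window chunker: each chunk re-includes the last
// `chunkOverlap` characters of the previous chunk.
// Assumes chunkOverlap < chunkSize (otherwise the window never advances).
function chunkWithOverlap(text, chunkSize, chunkOverlap) {
  const step = chunkSize - chunkOverlap; // how far the window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // window reached the end
  }
  return chunks;
}

// With chunkSize = 1000 and chunkOverlap = 100, chunk 2 starts at
// character 900, so the last 100 characters of chunk 1 repeat at its start.
```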

How to Implement This Chunker in n8n or Make.com

Python developers use the LangChain library to do this automatically. But if you are a RevOps engineer building in a visual no-code builder, you don’t have access to LangChain.

Chunking in n8n

If you are using n8n, this is incredibly easy. Select the n8n (Node.js) tab in the tool above and paste the script into an n8n Code Node.

Our script takes a single incoming item (your massive document), runs the chunking loop, and outputs multiple new items (your chunks). n8n handles this natively, treating each element of the returned array as its own item, so the very next node (your OpenAI Embedding node) processes every chunk seamlessly.
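A sketch of what that Code node body can look like in “Run Once for All Items” mode (the incoming field name `text` is an assumption; match it to whatever your upstream node actually outputs):

```javascript
// Builds one n8n item per chunk. Each element of the returned array
// becomes its own item, so the next node runs once per chunk.
// Assumes chunkOverlap < chunkSize.
function toN8nItems(text, chunkSize, chunkOverlap) {
  const items = [];
  for (let start = 0; start < text.length; start += chunkSize - chunkOverlap) {
    items.push({ json: { chunk: text.slice(start, start + chunkSize) } });
    if (start + chunkSize >= text.length) break;
  }
  return items;
}

// Inside the Code node itself you would end with:
// return toN8nItems($input.first().json.text, 1000, 100);
```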

Chunking in Make.com

Make.com handles arrays differently than n8n. If you use the Make.com script generated above, it will output an Array containing your chunks. You must then connect a green “Iterator” module immediately after your script to split that array into individual bundles before passing them to your Vector Database module.
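The difference is only the output shape. Instead of returning one item per chunk, the Make.com variant returns a single object whose `chunks` array the Iterator then explodes into bundles (a sketch; the property name `chunks` is an assumption, so map your Iterator to whatever your module actually outputs):

```javascript
// Make.com variant: one output bundle holding the whole array.
// Connect an Iterator module to the `chunks` array to get one
// bundle per chunk for your Vector Database module.
// Assumes chunkOverlap < chunkSize.
function toMakeOutput(text, chunkSize, chunkOverlap) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - chunkOverlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return { chunks };
}
```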


Ready to stop paying for expensive AI wrappers? Building a production-ready RAG pipeline doesn’t require a $500/month SaaS subscription. Download our complete n8n architecture template that extracts Google Drive PDFs, chunks them locally, and syncs them to Pinecone automatically.