ApeRAG
by apecloud
Full-stack RAG platform with built-in MCP integration
ai-ml Python Intermediate Self-hostable No API key
⭐ 1.1k stars
Updated: 2w ago
Description
ApeRAG by ApeCloud is a full-stack Retrieval-Augmented Generation platform that lets you ingest documents, build vector indexes, and query them through an AI agent via MCP. Unlike simple RAG wrappers, it handles the entire pipeline: document parsing (PDF, Word, HTML, Markdown), chunking, embedding, vector storage, and retrieval, all in one self-hosted package.
What makes ApeRAG stand out is its completeness. Most RAG solutions require you to glue together a document loader, an embedding model, a vector database, and a retrieval layer. ApeRAG bundles all of this and exposes the result as MCP tools. Your agent can ingest new documents, search existing knowledge bases, and get contextually relevant answers, without you managing any infrastructure beyond the ApeRAG server itself.
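To make the bundled pipeline concrete, here is a toy, in-memory sketch of the stages ApeRAG wires together (chunk, embed, store, retrieve). Everything below is illustrative: the function names are invented, and a bag-of-words counter stands in for a real embedding model and vector database.

```python
# Toy RAG pipeline: chunk -> embed -> store -> retrieve.
# Hypothetical stand-ins only; ApeRAG's real pipeline uses proper document
# parsers, embedding models, and a vector store.
import math
from collections import Counter

def chunk(text, size=8):
    """Split text into fixed-size word chunks (a stand-in for token chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Bag-of-words counts as a toy 'embedding' vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, index, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    scored = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

doc = ("ApeRAG ingests documents and builds vector indexes. "
       "Agents query the knowledge base through MCP tools.")
index = [(c, embed(c)) for c in chunk(doc)]  # the 'vector store'
print(retrieve("how do agents query the knowledge base", index))
```

A real deployment replaces each stand-in with a production component, but the data flow an agent triggers through ApeRAG's MCP tools follows this same shape.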
The project is backed by ApeCloud (known for KubeBlocks) and has over 1,000 GitHub stars. It is built in Python with good support for multiple embedding providers and LLMs. The trade-off is complexity: this is not a lightweight MCP server you can spin up in seconds. It is a full platform that needs storage, compute for embeddings, and proper configuration. But if you need a self-hosted RAG stack with MCP integration, it is one of the most complete options available.
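Once the ApeRAG server is running, registering it with an MCP client follows the standard stdio-transport config shape. The launch command below is a placeholder, not a real command; check ApeRAG's own documentation for the actual entry point of your deployment.

```json
{
  "mcpServers": {
    "aperag": {
      "command": "<path-to-aperag-mcp-launcher>",
      "args": []
    }
  }
}
```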
✅ Best for
Teams that need a self-hosted RAG solution with MCP integration and cannot send documents to external services
⚠️ Skip if
You just need simple file search; a lightweight file system MCP server is much simpler
💡 Use cases
- Build a private knowledge base from company docs and query it through your AI agent
- Ingest technical documentation (PDFs, wikis) and let your agent answer questions from them
- Create a self-hosted alternative to cloud RAG services with full data control
👍 Pros
- ✅ Full-stack RAG pipeline in one package, no need to assemble separate components
- ✅ Supports multiple document formats (PDF, Word, HTML, Markdown, and more)
- ✅ Self-hosted with full data control, no documents leave your infrastructure
👎 Cons
- ❌ Heavier setup than a typical MCP server, requiring storage and compute for embeddings
- ❌ Configuration complexity is high for production deployments
- ❌ Python-based, so startup time and memory usage are higher than lightweight alternatives
💡 Tips & tricks
Start with a small document set (10-20 files) to validate the pipeline before bulk
ingesting your full knowledge base. Use chunking strategies appropriate for your
content type: smaller chunks (256 tokens) work better for Q&A, larger ones (1024)
for summarization. Monitor embedding costs if using a paid provider like OpenAI.
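The chunk-size trade-off above can be sketched with a simple overlapping chunker. This is illustrative only: `chunk_words` is a hypothetical helper that splits on whitespace, whereas ApeRAG's configurable chunking would count tokens with the embedding model's tokenizer.

```python
# Illustrative chunker: fixed-size word windows with overlap, so context
# spanning a chunk boundary is not lost. A sketch, not ApeRAG's API.
def chunk_words(text, size=256, overlap=32):
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

# Small windows suit Q&A; larger ones suit summarization.
pieces = chunk_words("one two three four five six seven eight nine ten",
                     size=4, overlap=1)
# Each chunk repeats the last word of the previous chunk.
```

Overlap is what keeps a sentence that straddles a boundary retrievable from either side; tune `size` per content type as described above.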
Quick info
- Author
- apecloud
- License
- Apache-2.0
- Runtime
- Python
- Transport
- stdio
- Category
- ai-ml
- Difficulty
- Intermediate
- Self-hostable
- ✅
- API key
- No API key needed
- Docker
- ✅
- Version
- 0.0.0
- Updated
- Feb 5, 2026
Client compatibility
- ✅ Claude Code
- ✅ Cursor
- ✅ VS Code Copilot
- ✅ Gemini CLI
- ✅ Windsurf
- ✅ Cline
- ✅ JetBrains AI
- ✅ Warp
Platforms
🍎 macOS 🐧 Linux 🪟 Windows