// topic

RAG

15 trending in last 90 days · 15 all-time

// ecosystem

AI 14

Database 1

// recent newcomers

#1 AgentKit Code Workshop Sample Repository 🆕 6mo ago ↗ 8.71/d ★ 319
#2 DeepTutor: Agent-Native Personalized Tutoring Platform 🆕 4mo ago ↗ 3.24/d ★ 95

// this week's top 6

Tencent / WeKnora

WeKnora is an open-source, LLM-powered framework designed for enterprise-grade document understanding, semantic retrieval, and autonomous reasoning. It features a ReAct agent for complex multi-step tasks and a Wiki mode that distills raw documents into a structured, interlinked knowledge base. The platform supports multi-source data ingestion, various LLM integrations, and flexible deployment options to ensure complete data sovereignty.

garrytan / gbrain

GBrain provides a persistent, self-wiring knowledge graph that enables AI agents to store and retrieve complex information across meetings, emails, and documents. The system automatically extracts entity relationships and maintains a structured timeline, allowing agents to answer queries that standard vector search cannot reach. By utilizing a durable job queue and modular skill system, it ensures that agents become smarter and more reliable over time.

MemPalace / mempalace

MemPalace is a local-first AI memory system that stores conversation history as verbatim text for high-accuracy semantic retrieval. It utilizes a structured indexing approach with pluggable backends to organize content into wings, rooms, and drawers without requiring external API calls. The platform also features a temporal knowledge graph, MCP tools, and agent-specific diaries to provide comprehensive context management.

nashsu / llm_wiki

LLM Wiki is a cross-platform desktop application that transforms your documents into an organized, interlinked knowledge base using an incremental LLM-driven pipeline. It features a sophisticated two-step ingestion process, a persistent knowledge graph, and deep research capabilities to maintain and expand your personal library. The system ensures high-quality output through source traceability, human-in-the-loop review, and seamless integration with tools like Obsidian.

HKUDS / DeepTutor

DeepTutor is an agent-native platform designed to provide personalized, intelligent tutoring through a unified chat workspace and multi-agent architecture. It features advanced capabilities like a Book Engine for interactive learning, an AI Co-Writer, and persistent memory to tailor the experience to individual user profiles. Users can deploy the system easily via a guided CLI setup or Docker, supporting a wide range of LLM and embedding providers.

1jehuang / jcode

jcode is a high-performance coding agent harness designed for multi-session workflows and extreme resource efficiency. It features a sophisticated memory system that uses semantic vector embeddings to recall relevant information without excessive token usage. The platform supports native agent collaboration through a swarm architecture and integrates with a wide range of LLM providers via OAuth or custom configurations.

// all-time featured (15)

Tencent / WeKnora

WeKnora is an open-source, LLM-powered framework designed for enterprise-grade document understanding, semantic retrieval, and autonomous reasoning. It features a ReAct agent for complex multi-step tasks and a Wiki mode that distills raw documents into a structured, interlinked knowledge base. The platform supports multi-source data ingestion, various LLM integrations, and flexible deployment options to ensure complete data sovereignty.

HKUDS / RAG-Anything

RAG-Anything is a comprehensive framework designed to process and query diverse document types including text, images, tables, and mathematical equations. Built on LightRAG, it provides an end-to-end pipeline that integrates multimodal content into a unified knowledge graph for intelligent retrieval. This system eliminates the need for multiple specialized tools by offering a single, cohesive interface for complex document analysis.

bytedance / agentkit-samples

AgentKit Code Workshop is a sample repository launched by Volcengine for the AgentKit AI agent development platform, designed to help developers quickly learn to build and deploy intelligent agents. The project provides code examples ranging from basic introductions to complex scenarios, covering core functions such as multi-agent collaboration, retrieval-augmented generation (RAG), and tool invocation. Developers can use these tutorials to understand the AgentKit development toolchain in depth and integrate it into business applications.

opendataloader-project / opendataloader-pdf

OpenDataLoader PDF is a high-performance, open-source parser designed to convert PDF documents into structured formats like Markdown, JSON, and HTML for AI and RAG pipelines. It features a hybrid processing mode that combines deterministic local parsing with AI-driven analysis to achieve industry-leading extraction accuracy for complex tables, formulas, and scanned documents. Additionally, the project provides automated accessibility solutions, including end-to-end Tagged PDF generation compliant with international standards.

pingcap / autoflow

AutoFlow is an open-source knowledge base tool that utilizes graph RAG technology built on TiDB Vector, LlamaIndex, and DSPy. The platform provides a Perplexity-style conversational search experience powered by an advanced built-in website crawler. Users can also integrate a customizable search widget into their own websites using a simple JavaScript snippet.

garrytan / gbrain

GBrain provides a persistent, self-wiring knowledge graph that enables AI agents to store and retrieve complex information across meetings, emails, and documents. The system automatically extracts entity relationships and maintains a structured timeline, allowing agents to answer queries that standard vector search cannot reach. By utilizing a durable job queue and modular skill system, it ensures that agents become smarter and more reliable over time.

memvid / memvid

Memvid is a database-free, single-file memory layer designed to provide AI agents with instant retrieval and long-term memory capabilities. Through an innovative "smart frame" design, it encapsulates data, embeddings, and indexes into a single file, achieving efficient compression and parallel reading. The system is model-agnostic and requires zero infrastructure dependencies, supporting persistent memory in various offline or online scenarios.

MemPalace / mempalace

MemPalace is a local-first AI memory system that stores conversation history as verbatim text for high-accuracy semantic retrieval. It utilizes a structured indexing approach with pluggable backends to organize content into wings, rooms, and drawers without requiring external API calls. The platform also features a temporal knowledge graph, MCP tools, and agent-specific diaries to provide comprehensive context management.

onyx-dot-app / onyx

Onyx is a feature-rich, open-source AI platform that provides an easy-to-deploy application layer for large language models. The platform supports RAG, deep research, code execution, and various AI agent capabilities, and is compatible with mainstream self-hosted and proprietary LLMs. Users can deploy either the standard or the lightweight version, covering needs from personal use to enterprise-level collaboration.

nashsu / llm_wiki

LLM Wiki is a cross-platform desktop application that transforms your documents into an organized, interlinked knowledge base using an incremental LLM-driven pipeline. It features a sophisticated two-step ingestion process, a persistent knowledge graph, and deep research capabilities to maintain and expand your personal library. The system ensures high-quality output through source traceability, human-in-the-loop review, and seamless integration with tools like Obsidian.

QMD is an on-device search engine that indexes markdown notes, documentation, and transcripts for efficient local retrieval. It utilizes a hybrid approach combining BM25 full-text search, vector semantic search, and LLM-based re-ranking to deliver high-quality results. The tool is designed for agentic workflows, offering both a command-line interface and an MCP server for seamless integration with AI agents.
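The hybrid approach described above can be illustrated with a minimal sketch. This is not QMD's implementation: the function names are hypothetical, the "vector" side is a toy cosine similarity over term counts standing in for real embeddings, the two rankings are merged with reciprocal rank fusion (a common fusion technique chosen here as an assumption), and the LLM re-ranking step is left out.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    avg_len = (sum(len(t) for t in tokenized) / len(tokenized)) or 1.0
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    """Toy stand-in for embedding similarity: cosine over term counts."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(x * x for x in q.values()))
    out = []
    for d in docs:
        v = Counter(tokenize(d))
        dot = sum(q[t] * v[t] for t in q)
        norm = q_norm * math.sqrt(sum(x * x for x in v.values()))
        out.append(dot / norm if norm else 0.0)
    return out


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking adds 1 / (k + rank) per document."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return [d for d, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def hybrid_search(query, docs):
    """Return document indices ordered by the fused BM25 + similarity ranking."""
    def ranked(scores):
        return sorted(range(len(docs)), key=lambda i: (-scores[i], i))
    return rrf([ranked(bm25_scores(query, docs)), ranked(cosine_scores(query, docs))])
```

In a real pipeline the fused list would typically be truncated to a small candidate set and then passed to a re-ranking model; that stage is omitted here because it depends on the model being used.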

endee-io / endee

Endee is a high-performance, open-source vector database specifically engineered for AI search, RAG pipelines, and semantic retrieval workloads. It is implemented in C++ and optimized for modern CPU architectures to ensure production-grade performance and low-latency results. The platform supports flexible deployment options, including Docker and local builds, while providing advanced features like hybrid search and metadata-aware filtering.

HKUDS / DeepTutor

DeepTutor is an agent-native platform designed to provide personalized, intelligent tutoring through a unified chat workspace and multi-agent architecture. It features advanced capabilities like a Book Engine for interactive learning, an AI Co-Writer, and persistent memory to tailor the experience to individual user profiles. Users can deploy the system easily via a guided CLI setup or Docker, supporting a wide range of LLM and embedding providers.

anthropics / claude-cookbooks

The Claude Cookbooks provide a comprehensive collection of code snippets and guides to help developers integrate Claude into their own applications. The repository covers a wide range of topics including tool use, multimodal capabilities, and advanced techniques like prompt caching. These resources are designed to be easily adaptable for various programming languages and project requirements.

1jehuang / jcode

jcode is a high-performance coding agent harness designed for multi-session workflows and extreme resource efficiency. It features a sophisticated memory system that uses semantic vector embeddings to recall relevant information without excessive token usage. The platform supports native agent collaboration through a swarm architecture and integrates with a wide range of LLM providers via OAuth or custom configurations.

// use cases by project

WeKnora

01 RAG-based intelligent Q&A for enterprise documents
02 Autonomous ReAct agents for multi-step reasoning and tool orchestration
03 Automated Wiki generation and knowledge graph visualization from raw documents

RAG-Anything

01 End-to-end processing of multimodal documents including PDFs, Office files, and images
02 Construction of multimodal knowledge graphs for enhanced semantic understanding and relationship mapping
03 Hybrid intelligent retrieval combining vector similarity search with graph traversal for context-aware answers
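
The third use case above, vector similarity search combined with graph traversal, can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: `hybrid_retrieve`, the hand-written embeddings, and the adjacency-dict graph are all hypothetical. Seed nodes are picked by cosine similarity, then the result set is widened by following graph edges for a fixed number of hops.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, embeddings, graph, top_k=1, hops=1):
    """Pick top_k seed nodes by similarity, then expand along graph edges.

    embeddings: dict mapping node id -> vector (hypothetical toy data)
    graph: dict mapping node id -> list of neighbor ids
    """
    # Step 1: vector search selects the seed nodes.
    seeds = sorted(
        embeddings, key=lambda n: (-cosine(query_vec, embeddings[n]), n)
    )[:top_k]
    # Step 2: breadth-first expansion pulls in related context.
    selected = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in selected:
                    selected.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return selected
```

Expanding from the seeds lets a query that matches a table also surface the paragraph that references it, even when the paragraph's own embedding is not among the closest matches.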

agentkit-samples

01 Intelligent document Q&A and memory management based on RAG
02 Multi-agent collaboration and distributed task processing
03 Business process automation integrating the Volcengine toolchain

opendataloader-pdf

01 Extracting structured data from PDFs for RAG and LLM pipelines with bounding box support
02 Automating PDF accessibility compliance through layout analysis and auto-tagging
03 Processing complex documents including scanned PDFs, mathematical formulas, and borderless tables

autoflow

01 Perplexity-style conversational search with automated sitemap URL scraping
02 Embeddable JavaScript widget for instant product-related query responses on external sites
03 Knowledge base management using graph RAG and TiDB for storing chat history and vector data

// comparisons

gbrain vs instant

// related topics

LLM (11) · Knowledge Graph (5) · Vector Database (5) · Python (5) · Agents (3)