// topic

LLM

79 trending in last 90 days · 79 all-time


// this week's top 10

01
BerriAI / litellm
LiteLLM is an open-source AI gateway that provides a unified interface for calling over 100 different LLM providers using the standard OpenAI format. It can be utilized as a Python SDK for direct integration or deployed as a proxy server to manage enterprise-grade features like load balancing and spend tracking. The platform simplifies LLM management by eliminating the need to handle provider-specific SDKs, authentication patterns, and request formats.
92 · 43,515
02
deepseek-ai / FlashMLA
FlashMLA is a library of high-performance attention kernels developed by DeepSeek to power their V3 and V3.2-Exp models. It provides specialized implementations for both sparse and dense attention mechanisms across prefill and decoding stages. The library is designed for NVIDIA GPU architectures and includes support for FP8 KV caching to enhance computational efficiency.
92 · 12,557
03
bytedance / deer-flow
DeerFlow 2.0 is a ground-up rewrite of an open-source super agent framework designed to orchestrate sub-agents, memory, and sandboxes. It utilizes extensible skills and various LLM providers to perform complex tasks through a flexible, modular architecture. The platform supports multiple deployment modes, including Docker and local development, to facilitate efficient research and automation workflows.
91 · 61,138
04
PaddlePaddle / PaddleOCR
PaddleOCR is a comprehensive toolkit designed to convert images and PDF documents into structured, LLM-ready data formats like Markdown and JSON. It features state-of-the-art vision-language models and high-performance text recognition engines that support over 100 languages. The platform is widely integrated into major AI agent and RAG frameworks, offering efficient deployment options across various hardware backends.
91 · 75,510
05
alchaincyf / zhangxuefeng-skill
This project builds an executable thinking framework rather than a simple collection of quotes, based on Zhang Xuefeng's books, interviews, and decision records. It provides users with in-depth analysis and advice by distilling core mental models, decision heuristics, and communication DNA. Users can install it into Claude Code to analyze college major selection and career planning from Zhang Xuefeng's perspective.
89 · 5,997
06
toverainc / willow
The Willow Inference Server allows users to self-host high-speed language inference tasks for various applications. It supports a range of functionalities including speech-to-text, text-to-speech, and large language model processing. Users can access official documentation and community discussions to optimize their experience with the platform.
88 · 3,008
07
abi / secret-llama
Secret Llama is an entirely in-browser chatbot that allows users to interact with open-source models like Llama 3 and Mistral. It ensures complete privacy by keeping all conversation data locally on the user's computer without requiring a server or installation. The platform provides a user-friendly interface similar to ChatGPT while supporting offline functionality through WebGPU technology.
88 · 2,677
08
ant-design / x
Ant Design X provides a comprehensive suite of atomic components and utility APIs designed to help developers build high-quality AI applications. The library includes specialized tools for streaming Markdown rendering, dynamic card interfaces, and intelligent Agent skills. These enterprise-level components enable efficient data stream management and flexible UI development for modern AI-driven experiences.
88 · 4,468
09
XiaoMi / xiaomi-miloco
Xiaomi Miloco is an open-source smart home solution that leverages on-device large language models to integrate and control IoT devices. By utilizing camera data streams for visual understanding, the system enables users to manage their home environment through natural language commands. This framework prioritizes privacy and security by performing all processing locally while connecting seamlessly with the Xiaomi Home ecosystem.
88 · 2,522
10
THUDM / slime
slime is a high-performance post-training framework designed to scale reinforcement learning for large language models. It integrates Megatron-LM for efficient training with SGLang for flexible, high-throughput data generation. The framework supports a wide range of models and has been utilized in various research projects for agentic training and reasoning optimization.
88 · 5,333

// all-time featured (50)

BerriAI / litellm
LiteLLM is an open-source AI gateway that provides a unified interface for calling over 100 different LLM providers using the standard OpenAI format. It can be utilized as a Python SDK for direct integration or deployed as a proxy server to manage enterprise-grade features like load balancing and spend tracking. The platform simplifies LLM management by eliminating the need to handle provider-specific SDKs, authentication patterns, and request formats.
92
deepseek-ai / FlashMLA
FlashMLA is a library of high-performance attention kernels developed by DeepSeek to power their V3 and V3.2-Exp models. It provides specialized implementations for both sparse and dense attention mechanisms across prefill and decoding stages. The library is designed for NVIDIA GPU architectures and includes support for FP8 KV caching to enhance computational efficiency.
92
bytedance / deer-flow
DeerFlow 2.0 is a ground-up rewrite of an open-source super agent harness designed to orchestrate sub-agents, memory, and sandboxes. It utilizes extensible skills and integrates with various AI models to perform complex tasks through a flexible, containerized architecture. The framework supports multiple deployment modes and provides seamless connectivity with messaging platforms like Slack, Telegram, and Feishu.
92
bytedance / deer-flow
DeerFlow 2.0 is a ground-up rewrite of an open-source super agent framework designed to orchestrate sub-agents, memory, and sandboxes. It utilizes extensible skills and various LLM providers to perform complex tasks through a flexible, modular architecture. The platform supports multiple deployment modes, including Docker and local development, to facilitate efficient research and automation workflows.
91
PaddlePaddle / PaddleOCR
PaddleOCR is a comprehensive toolkit designed to convert images and PDF documents into structured, LLM-ready data formats like Markdown and JSON. It features state-of-the-art vision-language models and high-performance text recognition engines that support over 100 languages. The platform is widely integrated into major AI agent and RAG frameworks, offering efficient deployment options across various hardware backends.
91
alchaincyf / zhangxuefeng-skill
This project builds an executable thinking framework rather than a simple collection of quotes, based on Zhang Xuefeng's books, interviews, and decision records. It provides users with in-depth analysis and advice by distilling core mental models, decision heuristics, and communication DNA. Users can install it into Claude Code to analyze college major selection and career planning from Zhang Xuefeng's perspective.
89
toverainc / willow
The Willow Inference Server allows users to self-host high-speed language inference tasks for various applications. It supports a range of functionalities including speech-to-text, text-to-speech, and large language model processing. Users can access official documentation and community discussions to optimize their experience with the platform.
88
abi / secret-llama
Secret Llama is an entirely in-browser chatbot that allows users to interact with open-source models like Llama 3 and Mistral. It ensures complete privacy by keeping all conversation data locally on the user's computer without requiring a server or installation. The platform provides a user-friendly interface similar to ChatGPT while supporting offline functionality through WebGPU technology.
88
ant-design / x
Ant Design X provides a comprehensive suite of atomic components and utility APIs designed to help developers build high-quality AI applications. The library includes specialized tools for streaming Markdown rendering, dynamic card interfaces, and intelligent Agent skills. These enterprise-level components enable efficient data stream management and flexible UI development for modern AI-driven experiences.
88
XiaoMi / xiaomi-miloco
Xiaomi Miloco is an open-source smart home solution that leverages on-device large language models to integrate and control IoT devices. By utilizing camera data streams for visual understanding, the system enables users to manage their home environment through natural language commands. This framework prioritizes privacy and security by performing all processing locally while connecting seamlessly with the Xiaomi Home ecosystem.
88
THUDM / slime
slime is a high-performance post-training framework designed to scale reinforcement learning for large language models. It integrates Megatron-LM for efficient training with SGLang for flexible, high-throughput data generation. The framework supports a wide range of models and has been utilized in various research projects for agentic training and reasoning optimization.
88
alchaincyf / hermes-agent-orange-book
This comprehensive guide provides a practical overview of the Hermes Agent framework developed by Nous Research. It covers core mechanisms like the self-improving learning loop, three-layer memory system, and automatic skill evolution across seventeen detailed chapters. The book serves as a resource for developers and AI enthusiasts looking to implement and customize autonomous AI agents.
88
openocta / openocta
OpenOcta is an open-source enterprise AI Agent runtime that provides a complete control plane including gateways, proxies, and automated tasks via a single Go binary. The project supports deep integration with business systems, APIs, MCP tools, and custom skills, making it suitable for process automation and intelligent conversation scenarios. Its architecture is designed for simplicity, with frontend resources embedded directly into the binary to ensure rapid deployment and efficient operation in enterprise environments.
88
hotcoffeeshake / tong-jincheng-skill
This project distills approximately 200,000 words of Tong Jincheng's original video content into an AI analysis tool that simulates his straightforward, anti-cliché style. Users can invoke this Skill via Claude Code to obtain deep insights into dating, interpersonal relationships, and personal growth. Rather than simply repeating quotes, it applies Tong Jincheng's cognitive framework to help users analyze and solve practical problems.
88
KKKKhazix / khazix-skills
Khazix Skills is an open-source collection of AI tools designed to transform the author's accumulated methodologies into reusable Prompts and Skills. The project includes lightweight prompt templates and structured instruction sets that adhere to the Agent Skills open standard. Users can integrate these tools into supported AI Agents via direct installation or manual configuration to enhance work efficiency.
88
RKiding / Awesome-finance-skills
Awesome-finance-skills is a plug-and-play collection of financial skills designed to give Large Language Models real-time news analysis, stock data processing, and market forecasting capabilities. The project supports various mainstream Agent frameworks, so a simple installation adds professional financial analysis and research report generation to an AI agent. Its built-in logic visualization and sentiment analysis tools help AI agents provide better decision support in the financial sector.
88
Tencent / AI-Infra-Guard
AI-Infra-Guard is a professional AI red teaming security assessment platform developed by Tencent Zhuque Lab, designed to provide comprehensive AI security risk self-inspection solutions for enterprises and individuals. The platform integrates core functions such as AI infrastructure vulnerability scanning, Agent workflow security assessment, MCP server scanning, and jailbreak testing. Users can deploy it quickly via Docker and utilize its modern Web interface and robust API to achieve efficient security detection and management.
88
deepseek-ai / FlashMLA
FlashMLA is a library of high-performance attention kernels developed by DeepSeek to power their V3 and V3.2-Exp models. The repository provides specialized implementations for both sparse and dense attention mechanisms during prefill and decoding stages. These kernels are optimized for NVIDIA GPU architectures, including SM90 and SM100, to achieve significant computational throughput.
86
khoj-ai / khoj
Khoj is a versatile personal AI application designed to extend user capabilities through advanced semantic search and document integration. It supports a wide range of local and online LLMs while offering flexible deployment options from on-device to cloud-scale enterprise environments. Users can create custom agents and automate research tasks across various platforms including Obsidian, Emacs, and mobile devices.
82
forrestchang / andrej-karpathy-skills
This project provides a set of structured guidelines designed to improve the performance and reliability of AI coding agents. By implementing four core principles, it helps developers mitigate common LLM pitfalls like overcomplication, unnecessary code changes, and poor assumption management. Users can easily integrate these rules into their workflows via a Claude Code plugin or a project-specific CLAUDE.md file.
82
PaddlePaddle / FastDeploy
FastDeploy is an inference deployment toolkit for large language models and vision-language models based on PaddlePaddle, designed to provide out-of-the-box production-grade deployment solutions. The project supports various hardware platforms and integrates core technologies such as load-balanced prefill-decode (PD) disaggregation, unified KV cache transmission, and full quantization format support. Developers can deploy quickly through OpenAI API-compatible interfaces and use acceleration techniques like speculative decoding and chunked prefill to improve inference performance.
82
alibaba / page-agent
Page Agent is a client-side tool that enables users to control web interfaces using natural language commands. It operates directly within the webpage using text-based DOM manipulation, eliminating the need for browser extensions, headless browsers, or multi-modal LLMs. The library supports flexible LLM integration and provides optional extensions for multi-page task automation.
82
Tencent / WeKnora
WeKnora is an intelligent knowledge management and Q&A framework that utilizes LLMs to provide enterprise-grade document understanding and semantic retrieval. The platform offers both a RAG-based Quick Q&A mode for fast queries and a ReACT Agent engine for complex, multi-source reasoning tasks. It features a highly modular architecture that supports various document formats, multiple LLM providers, and seamless integration with popular IM channels for private or local deployment.
82
WeaveMindAI / weft
Weft is a programming language designed to integrate LLMs, human interactions, and infrastructure into a unified, visual workflow. It features durable execution to ensure programs survive crashes and supports complex logic through a typed, modular node system. Developers can build and manage sophisticated agentic systems by wiring together native nodes without the need for manual plumbing.
78
yizhiyanhua-ai / fireworks-tech-graph
fireworks-tech-graph enables users to generate professional SVG and PNG technical diagrams directly from natural language descriptions. The tool supports 14 UML diagram types and includes 7 distinct visual styles tailored for various documentation needs. It is specifically optimized for AI and agent-based domain patterns, allowing for rapid visualization without manual drawing.
78
thedotmack / claude-mem
Claude-Mem is a persistent memory compression system designed to maintain context across sessions for Claude Code and similar CLI tools. It automatically captures tool usage observations and generates semantic summaries to ensure continuity of project knowledge. The system utilizes a hybrid search architecture with SQLite and vector databases to provide efficient, token-conscious information retrieval.
78
safishamsi / graphify
graphify is an AI coding assistant skill that builds a comprehensive knowledge graph from your codebase, documentation, and multimedia files. It utilizes tree-sitter for structural code analysis and LLM-based extraction to identify concepts, relationships, and architectural design rationales. The resulting interactive graph and reports allow developers to navigate complex codebases and understand architectural decisions more efficiently.
78
shanraisshan / claude-code-best-practice
This repository provides a comprehensive collection of best practices, implementation guides, and orchestration workflows for Claude Code. It covers essential concepts such as subagents, custom commands, skills, and memory management to enhance agentic engineering. Developers can leverage these resources to optimize their development loops and integrate advanced AI-driven automation into their projects.
78
rtk-ai / rtk
RTK is a high-performance CLI proxy designed to significantly reduce LLM token consumption by filtering and compressing command outputs. It supports over 100 common commands and integrates seamlessly with various AI coding tools via transparent shell hooks. By removing noise and summarizing data, it helps developers optimize their AI interactions with minimal overhead.
78
pbakaus / impeccable
Impeccable is a comprehensive design skill that provides AI agents with domain-specific references and steering commands to improve frontend UI quality. It combats generic AI design patterns by offering 18 specialized commands for auditing, polishing, and refining visual and interaction design. Additionally, the project includes a standalone CLI tool to detect common design anti-patterns across various files and URLs.
78
luongnv89 / claude-howto
This guide provides a structured, visual learning path to help developers master Claude Code beyond basic usage. It features copy-paste templates, interactive quizzes, and detailed tutorials covering everything from slash commands to complex agent workflows. By following these modules, users can effectively combine features to build automated pipelines for code reviews, security scans, and documentation.
78
farion1231 / cc-switch
CC Switch is a desktop application that provides a centralized interface for managing multiple AI-powered coding CLI tools including Claude Code, Codex, and Gemini CLI. It eliminates the need for manual configuration file editing by offering over 50 provider presets and a visual management system for MCP servers and skills. The tool also features cross-platform support, cloud synchronization, and built-in usage tracking to streamline AI-assisted development workflows.
78
NousResearch / hermes-agent
Hermes Agent is a self-improving AI assistant that builds skills from experience and maintains a deepening model of user interactions across sessions. It supports a wide range of LLM providers and can be deployed on diverse infrastructure, including local machines, VPS, or serverless environments. The platform features a robust messaging gateway for cross-platform communication and includes built-in tools for scheduled automation and parallel task delegation.
78
JuliusBrussee / caveman
Caveman is a specialized plugin for AI coding agents that forces responses into a concise, telegraphic style to significantly reduce token usage. It maintains full technical accuracy while stripping away filler words, pleasantries, and unnecessary prose. The tool supports multiple intensity levels, including a classical Chinese mode, and provides utilities for compressing project documentation.
78
meituan / EvoCUA
EvoCUA is a high-performance open-source multimodal model designed for end-to-end computer automation across various desktop applications. It currently holds the top position on the OSWorld leaderboard and demonstrates superior cross-OS generalization capabilities. Additionally, the model is recognized for its robust safety profile, exhibiting the lowest rate of unintended behaviors among leading computer-use agents.
78
bytedance / agentkit-samples
AgentKit Code Workshop is a companion sample repository for the AI Agent development platform launched by Volcano Engine, designed to help developers quickly learn how to build and deploy agents. The project provides code examples ranging from basic introductions to complex business scenarios, covering core functions such as multi-agent collaboration, retrieval-augmented generation (RAG), and tool calling. Developers can use these tutorials to understand the AgentKit development toolchain in depth and build intelligent applications efficiently.
78
baidu / vLLM-Kunlun
vLLM Kunlun is a community-maintained hardware plugin that enables seamless execution of vLLM on Kunlun XPU devices. It utilizes a hardware-pluggable interface to decouple the Kunlun backend from the core vLLM framework. This integration allows users to run a wide range of Transformer, Mixture-of-Expert, and multimodal models efficiently on Kunlun3 P800 hardware.
78
alibaba / rtp-llm
RTP-LLM is a high-performance large language model inference acceleration engine developed by the Alibaba Foundation Model Inference Team. This engine has been widely applied in various Alibaba business scenarios such as Taobao and Tmall, and it supports multiple mainstream model formats and hardware backends. By integrating advanced operator optimization, quantization techniques, and distributed inference capabilities, it provides developers with efficient production-grade inference solutions.
78
PaddlePaddle / PaddleFormers
PaddleFormers is a Transformers library built on the PaddlePaddle framework, designed to provide training interfaces for Large Language Models and Vision-Language Models equivalent to Hugging Face's. By integrating tensor parallelism, pipeline parallelism, and automatic mixed precision, the project reports training performance that surpasses Megatron-LM on key models. It also fully supports the Safetensors format and is adapted to various Chinese domestic computing chips, helping developers complete the full model training process efficiently.
78
titanwings / colleague-skill
colleague.skill allows users to create AI personas based on the work habits and personalities of their colleagues by processing various communication data sources. The system utilizes a two-part architecture that combines professional work capabilities with nuanced behavioral traits to simulate realistic interactions. Users can easily manage, update, and roll back these AI skills to maintain accurate representations of their team members' expertise.
78
shareAI-lab / learn-claude-code
This repository provides a comprehensive educational framework for building the infrastructure, or harness, required to support intelligent AI agents. It emphasizes that while agency is derived from trained models, the harness is essential for providing the tools, context, and environment necessary for effective operation. Through twelve progressive sessions, developers learn to implement key mechanisms like tool dispatching, task management, and subagent coordination.
78
getpaseo / paseo
Paseo provides a unified interface to manage and run various coding agents like Claude Code, Codex, and OpenCode on your local machine. It supports cross-device workflows, allowing users to interact with agents through desktop, mobile, web, or CLI applications. The platform prioritizes privacy by operating without telemetry or forced logins while enabling powerful agent orchestration capabilities.
78
alibaba / tair-kvcache
Tair KVCache is an Alibaba Cloud system designed to accelerate Large Language Model inference through distributed memory pooling and dynamic multi-level caching. The project provides a centralized manager for unified metadata handling and a simulation tool for predicting performance metrics without requiring GPU resources. These components work together to improve inference efficiency while reducing overall infrastructure costs.
78
vigorX777 / ai-daily-digest
AI Daily Digest is an automated tool running on the Bun runtime, designed to scrape and filter high-quality content from 90 top technical blogs. Using multi-dimensional AI scoring, structured summaries, and trend analysis, it generates a daily technical digest that includes statistical charts. The system uses the Gemini API by default and can switch to other OpenAI-compatible model providers.
76
vigorX777 / ai-daily-digest
AI Daily Digest is a tool powered by the Bun runtime that scrapes RSS feeds from 90 top-tier technical blogs, using AI for multi-dimensional scoring, categorization, and summary generation. The project supports interactive command-line operation and automatically organizes articles into a structured daily digest featuring macro trends, in-depth summaries, and visual statistics. Users can configure Gemini or other OpenAI-compatible API models to gather and read technical information efficiently.
76
THUDM / slime
Slime is an LLM post-training framework designed for reinforcement learning scaling by integrating Megatron for high-performance training and SGLang for efficient rollout generation. The framework utilizes a data buffer to bridge training and generation, enabling flexible and asynchronous workflows for complex RL tasks. It supports a wide range of state-of-the-art models, including the GLM, Qwen, DeepSeek, and Llama series.
76
vbgate / learn-opencode
This is a free, open-source, hands-on AI course for beginners that aims to teach users how to improve their work efficiency with AI within 4 hours. The tutorial provides in-depth Chinese-language content and supports direct connection to mainstream Chinese models without complex network configuration. The course covers five stages from quick start to deep customization and provides practical projects and Prompt templates for learners.
72
alibaba / ROLL
ROLL is an efficient, user-friendly reinforcement learning library specifically designed for training and scaling Large Language Models on large-scale GPU clusters. It utilizes a multi-role distributed architecture powered by Ray to support complex tasks like human preference alignment, reasoning, and agentic interactions. The framework integrates advanced technologies such as Megatron-Core, vLLM, and SGLang to accelerate model training and inference across diverse hardware environments.
72
RKiding / Awesome-finance-skills
Awesome Finance Skills is a plug-and-play collection of financial skills designed to empower AI Agents with real-time news analysis, stock data processing, and market forecasting capabilities. The project supports various mainstream Agent frameworks, allowing users to quickly integrate relevant features through one-click installation or manual deployment. By integrating the Kronos model with multiple financial data sources, it helps Agents automatically generate professional market analysis reports and logical chain diagrams.
72
PaddlePaddle / FastDeploy
FastDeploy is an inference deployment toolkit for large language models and vision-language models based on PaddlePaddle, aiming to provide out-of-the-box production-grade deployment solutions. The toolkit supports various mainstream hardware platforms and integrates core technologies such as load-balanced prefill-decode (PD) disaggregation, unified KV cache transmission, and full quantization format support. Its compatibility with the OpenAI API and vLLM interfaces helps developers implement model inference and online service deployment efficiently.
72
