vLLM
A high-throughput and memory-efficient inference and serving engine for LLMs, featuring PagedAttention, continuous batching, and optimized KV cache management for production deployments.
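The core idea behind PagedAttention can be sketched in a few lines: the KV cache is split into fixed-size blocks, and a per-sequence block table maps logical positions to physical blocks, so cache memory need not be contiguous and finished sequences return their blocks to a shared pool. This is a conceptual toy, not vLLM's actual implementation; the block size and class names are invented for illustration.

```python
# Conceptual sketch (not vLLM's real code) of PagedAttention-style KV-cache
# paging: fixed-size blocks, a per-sequence block table, and a free pool.
BLOCK_SIZE = 4  # tokens per KV block; illustrative, not vLLM's default

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))  # physical block pool
        self.block_tables = {}                      # seq_id -> [physical blocks]
        self.seq_lens = {}                          # seq_id -> tokens written

    def append_token(self, seq_id: int) -> int:
        """Reserve cache space for one new token; return its physical block."""
        table = self.block_tables.setdefault(seq_id, [])
        n = self.seq_lens.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:                     # current block full: allocate
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1
        return table[-1]

    def free_sequence(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(6):                                  # 6 tokens span 2 blocks
    cache.append_token(seq_id=0)
print(len(cache.block_tables[0]))                   # -> 2
```

Because allocation is per block rather than per maximum sequence length, memory is wasted only in the last partially filled block of each sequence, which is what makes continuous batching of many sequences memory-efficient.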
Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud. Supports fine-tuning, quantization, and distributed inference for production-grade LLM deployment.
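Serving engines like the two above expose the OpenAI chat-completions wire format, so any OpenAI-style client can talk to them. A minimal stdlib-only sketch of building such a request is below; the base URL, port, and model name are assumptions for illustration and should match however you launched the server.

```python
import json
import urllib.request

# Assumed local endpoint of an OpenAI-compatible server; adjust to your setup.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str, model: str = "my-served-model"):
    """Build an OpenAI-style /v1/chat/completions request (not yet sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Summarize continuous batching in one sentence.")
# urllib.request.urlopen(req) would return an OpenAI-style JSON response
# when a server is actually listening at BASE_URL.
print(req.full_url)
```

The same request body works against any of these engines, which is the practical payoff of OpenAI compatibility: swapping backends does not require changing client code.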
A comprehensive single-package Retrieval-Augmented Generation platform built on Langflow, Docling, and OpenSearch, providing a complete pipeline from document parsing to vector retrieval and generation with multi-model and multi-vector-database support.
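The retrieval step in a pipeline like this can be illustrated without any of the named dependencies: a toy bag-of-words cosine similarity stands in for a real embedding model plus an OpenSearch vector index, ranking document chunks against a query. All data and function names here are invented for the sketch.

```python
import math
from collections import Counter

# Toy RAG retrieval: bag-of-words cosine similarity stands in for
# embeddings + a vector database. Real pipelines use learned embeddings.
def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(qv, vectorize(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Invoices are stored as parsed PDF documents.",
    "The vector index holds one embedding per chunk.",
    "Generation is handled by the configured LLM.",
]
print(retrieve("where is the vector index embedding stored", chunks))
```

The retrieved chunks are then passed to the generation model as context, which is the "retrieval-augmented" part of the pipeline.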
Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language and get accurate SQL, charts, and BI insights. Supports 12+ data sources and any LLM.
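The semantic-layer idea can be shown with a toy example: business terms map to vetted measures and tables, so the SQL is assembled from trusted pieces instead of generated free-form. This is a deliberately simplified sketch using stdlib sqlite3, not the agent's actual implementation; the schema and term mappings are invented.

```python
import sqlite3

# Toy semantic layer: business vocabulary -> (vetted measure, table).
# Grounding generation in such a layer is what keeps the SQL accurate.
SEMANTIC_LAYER = {
    "revenue": ("SUM(amount)", "orders"),
    "order count": ("COUNT(*)", "orders"),
}

def question_to_sql(question: str) -> str:
    """Map a natural-language question onto a vetted SQL template."""
    q = question.lower()
    for term, (measure, table) in SEMANTIC_LAYER.items():
        if term in q:
            return f"SELECT {measure} FROM {table}"
    raise ValueError("no semantic-layer term matched")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(10.0,), (15.5,)])
sql = question_to_sql("What is our total revenue?")
print(sql, "->", conn.execute(sql).fetchone()[0])  # -> 25.5
```

A real agent would use an LLM to match intent and compose joins and filters, but the principle is the same: the semantic layer constrains generation to names and measures that actually exist in the warehouse.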
A Python library by Google for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization, designed for data annotation and knowledge extraction workflows.
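"Source grounding" means each extracted value carries the character span where it was found, so results can be traced and highlighted back in the original text. A toy regex-based illustration of that output shape is below; it is not the library's API, and the pattern and field names are invented.

```python
import re

# Toy extraction with source grounding: every extracted field keeps the
# character offsets of the text it came from, enabling traceable results.
def extract_emails(text: str):
    pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
    return [
        {"value": m.group(), "start": m.start(), "end": m.end()}
        for m in pattern.finditer(text)
    ]

doc = "Contact alice@example.com or bob@example.org for access."
for hit in extract_emails(doc):
    # The span lets a UI highlight exactly where each value came from.
    assert doc[hit["start"]:hit["end"]] == hit["value"]
    print(hit)
```

LLM-based extractors return the same kind of span-annotated records for arbitrary schemas, which is what makes the output auditable in annotation workflows.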