LightRAG

Author	SHA1	Message	Date
clssck	082a5a8fad	test(lightrag,api): add comprehensive test coverage and S3 support Add extensive test suites for API routes and utilities: - Implement test_search_routes.py (406 lines) for search endpoint validation - Implement test_upload_routes.py (724 lines) for document upload workflows - Implement test_s3_client.py (618 lines) for S3 storage operations - Implement test_citation_utils.py (352 lines) for citation extraction - Implement test_chunking.py (216 lines) for text chunking validation Add S3 storage client implementation: - Create lightrag/storage/s3_client.py with S3 operations - Add storage module initialization with exports - Integrate S3 client with document upload handling Enhance API routes and core functionality: - Add search_routes.py with full-text and graph search endpoints - Add upload_routes.py with multipart document upload support - Update operate.py with bulk operations and health checks - Enhance postgres_impl.py with bulk upsert and parameterized queries - Update lightrag_server.py to register new API routes - Improve utils.py with citation and formatting utilities Update dependencies and configuration: - Add S3 and test dependencies to pyproject.toml - Update docker-compose.test.yml for testing environment - Sync uv.lock with new dependencies Apply code quality improvements across all modified files: - Add type hints to function signatures - Update imports and router initialization - Fix logging and error handling	2025-12-05 23:13:39 +01:00
clssck	dd1413f3eb	test(lightrag,examples): add prompt accuracy and quality tests Add comprehensive test suites for prompt evaluation: - test_prompt_accuracy.py: 365 lines testing prompt extraction accuracy - test_prompt_quality_deep.py: 672 lines for deep quality analysis - Refactor prompt.py to consolidate optimized variants (removed prompt_optimized.py) - Apply ruff formatting and type hints across 30 files - Update pyrightconfig.json for static type checking - Modernize reproduce scripts and examples with improved type annotations - Sync uv.lock dependencies	2025-12-05 16:39:52 +01:00
clssck	69358d830d	test(lightrag,examples,api): comprehensive ruff formatting and type hints Format entire codebase with ruff and add type hints across all modules: - Apply ruff formatting to all Python files (121 files, 17K insertions) - Add type hints to function signatures throughout lightrag core and API - Update test suite with improved type annotations and docstrings - Add pyrightconfig.json for static type checking configuration - Create prompt_optimized.py and test_extraction_prompt_ab.py test files - Update ruff.toml and .gitignore for improved linting configuration - Standardize code style across examples, reproduce scripts, and utilities	2025-12-05 15:17:06 +01:00
clssck	8d099fc3ac	chore: sync with upstream HKUDS/LightRAG - Add KaTeX extensions (mhchem for chemistry, copy-tex for copying) - Add CASCADE to AGE extension for PostgreSQL - Remove future dependency, replace passlib with bcrypt - Fix Jina embedding configuration and provider defaults - Update gunicorn help text and bump API version to 0258 - Documentation and README updates	2025-12-01 21:30:19 +01:00
yangdx	5f91063c7a	Add ruff as dependency to pytest and evaluation extras	2025-11-25 02:03:28 +08:00
Daniel.y	8777895efc	Merge pull request #2401 from danielaskdd/fix-openai-keyword-extraction Refactor: Centralize keyword_extraction parameter handling in OpenAI LLM implementations	2025-11-21 13:08:15 +08:00
yangdx	1e477e95ef	Add lightrag-clean-llmqc console script entry point - Add clean_llm_query_cache tool - New console script for cache cleanup - Extend CLI tool availability	2025-11-21 12:59:49 +08:00
yangdx	02fdceb959	Update OpenAI client to use stable API and bump minimum version to 2.0.0 - Remove beta prefix from completions.parse - Update OpenAI dependency to >=2.0.0 - Fix whitespace formatting - Update all requirement files - Clean up pyproject.toml dependencies	2025-11-21 12:55:44 +08:00
yangdx	472b498ade	Replace pytest group reference with explicit dependencies in evaluation • Remove pytest group dependency • Add explicit pytest>=8.4.2 • Add pytest-asyncio>=1.2.0 • Add pre-commit directly • Fix potential circular dependency	2025-11-18 12:17:21 +08:00
yangdx	5da82bb096	Add pre-commit to pytest dependencies and format test code • Add pre-commit to pytest extra deps • Update lock file dependencies	2025-11-18 00:42:04 +08:00
yangdx	b7b8d15632	Refactor pytest dependencies into separate optional group - Extract pytest deps to own group - Reference pytest group in evaluation - Add pytest config to pyproject.toml - Update uv.lock with new structure	2025-11-17 23:52:13 +08:00
yangdx	c246eff725	Improve docling integration with macOS compatibility and CLI flag - Add --docling CLI flag for easier setup - Add numpy version constraints - Exclude docling on macOS (fork-safety)	2025-11-17 12:54:32 +08:00
yangdx	7b7f93d77c	Implement lazy configuration initialization for API server • Add lazy config initialization • Maintain backward compatibility • Support programmatic usage • Add gunicorn dependency • Explicit config in entry points	2025-11-17 12:54:32 +08:00
yangdx	69a0b74ce7	refactor: move document deps to api group, remove dynamic imports - Merge offline-docs into api extras - Remove pipmaster dynamic installs - Add async document processing - Pre-check docling availability - Update offline deployment docs	2025-11-17 12:54:32 +08:00
yangdx	c434879c7a	Replace PyPDF2 with pypdf for PDF processing - Update import from PyPDF2 to pypdf - Change dependency to pypdf>=6.1.0 - Update all requirements files - Remove PyPDF2 from lock file - Use modern pypdf library	2025-11-17 12:54:32 +08:00
yangdx	e8f5f57ec7	Update qdrant-client minimum version from 1.7.0 to 1.11.0 • Bump qdrant-client to >=1.11.0 • Update pyproject.toml dependency • Update requirements files • Sync uv.lock with new version • Maintain <2.0.0 upper bound	2025-11-10 11:54:48 +08:00
yangdx	3d9de5ed03	feat: improve Gemini client error handling and retry logic • Add google-api-core dependency • Add specific exception handling • Create InvalidResponseError class • Update retry decorators • Fix empty response handling	2025-11-08 22:10:09 +08:00
yangdx	5f49cee20f	Merge branch 'main' into VOXWAVE-FOUNDRY/main	2025-11-06 15:37:35 +08:00
ben moussa anouar	5da709b42a	Merge branch 'main' into feat/ragas-evaluation	2025-11-03 06:01:46 +01:00
anouarbm	626b42bc40	feat: add optional Langfuse observability integration This contribution adds optional Langfuse support for LLM observability and tracing. Langfuse provides a drop-in replacement for the OpenAI client that automatically tracks all LLM interactions without requiring code changes. Features: - Optional Langfuse integration with graceful fallback - Automatic LLM request/response tracing - Token usage tracking - Latency metrics - Error tracking - Zero code changes required for existing functionality Implementation: - Modified lightrag/llm/openai.py to conditionally use Langfuse's AsyncOpenAI - Falls back to standard OpenAI client if Langfuse is not installed - Logs observability status on import Configuration: To enable Langfuse tracing, install the observability extras and set environment variables: ```bash pip install lightrag-hku[observability] export LANGFUSE_PUBLIC_KEY="your_public_key" export LANGFUSE_SECRET_KEY="your_secret_key" export LANGFUSE_HOST="https://cloud.langfuse.com" # or your self-hosted instance ``` If Langfuse is not installed or environment variables are not set, LightRAG will use the standard OpenAI client without any functionality changes. Changes: - Modified lightrag/llm/openai.py (added optional Langfuse import) - Updated pyproject.toml with optional 'observability' dependencies Dependencies (optional): - langfuse>=3.8.1	2025-11-01 21:40:22 +01:00
anouarbm	1ad0bf82f9	feat: add RAGAS evaluation framework for RAG quality assessment This contribution adds a comprehensive evaluation system using the RAGAS framework to assess LightRAG's retrieval and generation quality. Features: - RAGEvaluator class with four key metrics: * Faithfulness: Answer accuracy vs context * Answer Relevance: Query-response alignment * Context Recall: Retrieval completeness * Context Precision: Retrieved context quality - HTTP API integration for live system testing - JSON and CSV report generation - Configurable test datasets - Complete documentation with examples - Sample test dataset included Changes: - Added lightrag/evaluation/eval_rag_quality.py (RAGAS evaluator implementation) - Added lightrag/evaluation/README.md (comprehensive documentation) - Added lightrag/evaluation/__init__.py (package initialization) - Updated pyproject.toml with optional 'evaluation' dependencies - Updated .gitignore to exclude evaluation results directory Installation: pip install lightrag-hku[evaluation] Dependencies: - ragas>=0.3.7 - datasets>=4.3.0 - httpx>=0.28.1 - pytest>=8.4.2 - pytest-asyncio>=1.2.0	2025-11-01 21:36:39 +01:00
yangdx	c46c1b26a9	Add pycryptodome dependency for PDF encryption support	2025-10-31 01:49:42 +08:00
yangdx	16d3d82a0e	Include static files in package distribution - Add static dir to MANIFEST.in - Update package data config - Ensure static assets are bundled - Fix missing static file issue	2025-10-30 10:50:28 +08:00
dependabot[bot]	f81dd4e778	Update redis requirement from <7.0.0,>=5.0.0 to >=5.0.0,<8.0.0 Updates the requirements on [redis](https://github.com/redis/redis-py) to permit the latest version. - [Release notes](https://github.com/redis/redis-py/releases) - [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES) - [Commits](https://github.com/redis/redis-py/compare/v5.0.0...v7.0.1) --- updated-dependencies: - dependency-name: redis dependency-version: 7.0.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-27 18:39:04 +00:00
dependabot[bot]	ef4acf5365	Update pandas requirement from <2.3.0,>=2.0.0 to >=2.0.0,<2.4.0 Updates the requirements on [pandas](https://github.com/pandas-dev/pandas) to permit the latest version. - [Release notes](https://github.com/pandas-dev/pandas/releases) - [Commits](https://github.com/pandas-dev/pandas/compare/v2.0.0...v2.3.3) --- updated-dependencies: - dependency-name: pandas dependency-version: 2.3.3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-21 13:30:39 +00:00
dependabot[bot]	7469421452	Update openai requirement from <2.0.0,>=1.0.0 to >=1.0.0,<3.0.0 Updates the requirements on [openai](https://github.com/openai/openai-python) to permit the latest version. - [Release notes](https://github.com/openai/openai-python/releases) - [Changelog](https://github.com/openai/openai-python/blob/main/CHANGELOG.md) - [Commits](https://github.com/openai/openai-python/compare/v1.0.0...v2.6.0) --- updated-dependencies: - dependency-name: openai dependency-version: 2.6.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-20 17:38:08 +00:00
Humphry	0b3d31507e	extended to use gemini, switched to use gemini-flash-latest	2025-10-20 13:17:16 +03:00
yangdx	06ed2d06a9	Merge branch 'main' into remove-dotenv	2025-10-17 15:06:34 +08:00
Won-Kyu Park	532400412e	remove deprecated dotenv package.	2025-10-17 01:34:05 +09:00
yangdx	c61b7bd4f8	Remove torch and transformers from offline dependency groups	2025-10-16 15:14:25 +08:00
yangdx	f2b6a068e3	Remove docling dependency and related packages from project * Remove docling from pyproject.toml * Update requirements files * Clean up uv.lock dependencies * Reduce offline docker image size	2025-10-16 05:15:29 +08:00
yangdx	433ec813ba	Improve offline installation with constraints and version bounds • Add constraints-offline.txt for exact versions • Set upper bounds in pyproject.toml • Combine pip installs in Dockerfile • Update requirements with version bounds • Prevent dependency conflicts	2025-10-15 23:44:46 +08:00
yangdx	6d1ae40478	Add offline Docker build support with embedded models and cache - Add offline Dockerfile with tiktoken cache - Create GitHub workflow for offline builds - Update dockerignore for cleaner builds - Exclude dev dirs from package setup - Remove tiktoken volume from compose	2025-10-15 15:40:30 +08:00
yangdx	bc1a70bad0	Remove explicit protobuf dependency from offline storage requirements	2025-10-11 23:34:50 +08:00
yangdx	49197fbfc0	Update pymilvus to >=2.6.2 and add protobuf compatibility constraint	2025-10-11 13:27:10 +08:00
yangdx	a5c05f1b92	Add offline deployment support with cache management and layered deps • Add tiktoken cache downloader CLI • Add layered offline dependencies • Add offline requirements files • Add offline deployment guide	2025-10-11 10:28:14 +08:00
yangdx	194f46f239	Add json_repair dependency to project requirements	2025-08-27 11:14:09 +08:00
yangdx	b5682b15cb	Remove json-repair from core deps, add missing api deps	2025-08-25 07:23:41 +08:00
yangdx	14e083a1a6	fix: replace pyuca with pypinyin for Chinese pinyin sorting and add file_path sort	2025-08-17 15:21:24 +08:00
yangdx	d98fe6f340	Move json-repair to main dependencies	2025-08-01 19:59:39 +08:00
yangdx	32af45ff46	refactor: improve JSON parsing reliability with json-repair library Replace regex-based JSON extraction with json-repair for better handling of malformed LLM responses. Remove deprecated JSON parsing utilities and clean up keyword_extraction parameter across LLM providers. - Remove locate_json_string_body_from_string() and convert_response_to_json() - Use json-repair.loads() in extract_keywords_only() for robust parsing - Clean up LLM interfaces and remove unused parameters - Add json-repair dependency	2025-08-01 19:36:20 +08:00
yangdx	790abf148b	Add psutil to API dependencies	2025-08-01 10:55:43 +08:00
yangdx	44b7ce222e	feat: add default storage dependencies and optimize imports - Add nano-vectordb and networkx to pyproject.toml dependencies - Replace dynamic imports with direct imports for 4 default storage implementations - Improve startup performance while maintaining backward compatibility	2025-07-24 16:14:26 +08:00
Marvin Schmidt	42a1da0041	fix(build): pyproject.toml setup	2025-07-11 12:01:34 +02:00
yangdx	bd50827ffc	Update `pyproject.toml` to specify Python 3.10 as the minimum required version.	2025-07-05 23:00:22 +08:00
yangdx	2e2b9f3b48	Refactor `setup.py` to utilize `pyproject.toml` for project installation.	2025-07-05 11:19:00 +08:00

46 commits