Commit graph

5226 commits

Author SHA1 Message Date
yangdx
e19a4be0af Preserve ordering in get_by_ids methods across all storage implementations
- Fix result ordering in vector stores
- Update KV storage get_by_ids methods
- Maintain order in doc status storage
- Return None for missing IDs

(cherry picked from commit 9be22dd666)
2025-12-04 19:08:58 +08:00
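A minimal sketch of the ordering behaviour this commit message describes, not the project's actual code. The helper name `order_by_ids` and the row shape (a dict with an `"id"` key) are assumptions made for illustration. Rows come back in the order the IDs were requested, with `None` standing in for any ID the backend did not return.

```python
from typing import Any, Optional


def order_by_ids(
    ids: list[str], rows: list[dict[str, Any]]
) -> list[Optional[dict[str, Any]]]:
    """Reorder backend rows to match the requested ID order; None marks a missing ID."""
    # Index the rows once so each requested ID is a constant-time lookup.
    by_id = {row["id"]: row for row in rows}
    return [by_id.get(doc_id) for doc_id in ids]


# Hypothetical backend result: arbitrary order, and no row for "c".
rows = [{"id": "b", "text": "second"}, {"id": "a", "text": "first"}]
ordered = order_by_ids(["a", "c", "b"], rows)
```

Keeping the output the same length as the input lets callers pair results with their IDs by position.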
yangdx
17106225dd Add PostgreSQL connection retry mechanism with comprehensive error handling
• Implement connection retry with backoff
• Add transient error detection
• Pool management with timeout guards

(cherry picked from commit e758204ab2)
2025-12-04 19:08:58 +08:00
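The retry idea named in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: the function name `retry_with_backoff`, its parameters, and the choice of which exceptions count as transient are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

# Assumption: which exception types are treated as transient is illustrative only.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.01,
    max_delay: float = 1.0,
) -> T:
    """Run `operation`, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except TRANSIENT_ERRORS:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            # Double the delay on each attempt, capped at max_delay.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await asyncio.sleep(delay)
    raise RuntimeError("attempts must be >= 1")


# Simulated connection that fails twice before succeeding.
calls = {"n": 0}


async def flaky_connect() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated transient failure")
    return "connected"


result = asyncio.run(retry_with_backoff(flaky_connect))
```

Errors outside `TRANSIENT_ERRORS` propagate immediately, so only failures that are plausibly recoverable are retried.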
yangdx
8f924d6f21 Add PostgreSQL connection retry configuration options
- Add retry environment variables
- Fix asyncpg import in retry tests

(cherry picked from commit bd535e3e7a)
2025-12-04 19:08:57 +08:00
yangdx
60a695539a Refactor PostgreSQL retry config to use centralized configuration
• Move retry config to ClientManager
• Remove env var parsing from PostgreSQLDB
• Add config params to test setup

(cherry picked from commit b3ed264707)
2025-12-04 19:08:57 +08:00
yangdx
d5154bca73 Condensed AGENTS.md to focus on essential development guidelines
(cherry picked from commit 8d3b53ce22)
2025-12-04 19:08:57 +08:00
yangdx
390842a6dd Rename Agments.md to AGENTS.md and standardize formatting
(cherry picked from commit 6e39c0c0ff)
2025-12-04 19:08:57 +08:00
yangdx
f56ba3b599 Add project intelligence files for AI agent collaboration
- Add .clinerules with technical patterns
- Create Agments.md for Codex agent guidance
- Ensures consistent behavior across all team members

(cherry picked from commit 577b9e6882)
2025-12-04 19:08:57 +08:00
yangdx
c6433edb23 Make PostgreSQL statement_cache_size configuration optional
• Remove forced int conversion
• Allow None values for cache size
• Add conditional parameter setting

(cherry picked from commit f2c0b41e78)
2025-12-04 19:08:57 +08:00
kevinnkansah
c8c73ab114 fix: renamed PostgreSQL options env variable and allowed LRU cache to be an optional env variable
(cherry picked from commit 22a7b482c5)
2025-12-04 19:08:56 +08:00
kevinnkansah
7ce46bacb6 feat: add options for Postgres connection
(cherry picked from commit 108cdbe133)
2025-12-04 19:08:56 +08:00
yangdx
fe05563ecb Remove future dependency and replace passlib with direct bcrypt
(cherry picked from commit fc44f11368)
2025-12-04 19:08:56 +08:00
yangdx
ad6b36143e Update .env loading and add API authentication to RAG evaluator
• Load .env from current directory
• Support LIGHTRAG_API_KEY auth header
• Override=False for env precedence
• Add Bearer token to API requests
• Enable per-instance .env configs

(cherry picked from commit 72db042667)
2025-12-04 19:08:56 +08:00
yangdx
6de4bb9113 Fix logging message formatting
(cherry picked from commit e0fd31a60d)
2025-12-04 19:08:46 +08:00
Lucky Verma
80dcbc696a Refactor SQL queries and improve input handling in PGKVStorage and PGDocStatusStorage
(cherry picked from commit 917e41aa78)
2025-12-04 19:08:41 +08:00
Won-Kyu Park
dd5b220e58 remove deprecated dotenv package.
(cherry picked from commit 532400412e)
2025-12-04 19:08:40 +08:00
yangdx
f142a8c375 Remove docling dependency and related packages from project
* Remove docling from pyproject.toml
* Update requirements files
* Clean up uv.lock dependencies
* Reduce offline docker image size

(cherry picked from commit f2b6a068e3)
2025-12-04 19:08:28 +08:00
yangdx
9a23234c6c Add build script for multi-platform images
- Add build script for multi-platform images
- Update docker deployment document

(cherry picked from commit ef79821f29)
2025-12-04 19:08:21 +08:00
yangdx
aa61e82820 Migrate from pip to uv package manager for faster builds
• Replace pip with uv in Dockerfile
• Remove constraints-offline.txt
• Add uv.lock for dependency pinning
• Use uv sync --frozen for builds

(cherry picked from commit 466de2070d)
2025-12-04 19:08:10 +08:00
yangdx
8c3a325193 docs: clarify docling exclusion in offline Docker image
(cherry picked from commit 388dce2e31)
2025-12-04 19:07:59 +08:00
yangdx
3f57a13e26 Add static 'offline' tag to Docker image metadata
(cherry picked from commit 19c05f9ea4)
2025-12-04 19:07:59 +08:00
yangdx
13963775d7 Optimize Docker build with multi-stage frontend compilation
• Add frontend build stage to Dockerfile
• Remove --production flag from bun install
• Fix frontend asset integration

(cherry picked from commit e5cbc593f4)
2025-12-04 19:07:59 +08:00
yangdx
077a2fdbb7 Remove explicit protobuf dependency from offline storage requirements
(cherry picked from commit bc1a70bad0)
2025-12-04 19:07:22 +08:00
yangdx
b0bdbb5839 Add offline deployment support with cache management and layered deps
• Add tiktoken cache downloader CLI
• Add layered offline dependencies
• Add offline requirements files
• Add offline deployment guide

(cherry picked from commit a5c05f1b92)
2025-12-04 19:07:09 +08:00
yangdx
770fd64c70 Preserve ordering in get_by_ids methods across all storage implementations
- Fix result ordering in vector stores
- Update KV storage get_by_ids methods
- Maintain order in doc status storage
- Return None for missing IDs

(cherry picked from commit 9be22dd666)
2025-12-04 19:06:54 +08:00
yangdx
de2713ca93 Add PostgreSQL connection retry mechanism with comprehensive error handling
• Implement connection retry with backoff
• Add transient error detection
• Pool management with timeout guards

(cherry picked from commit e758204ab2)
2025-12-04 19:06:30 +08:00
yangdx
89f32c4c49 Add PostgreSQL connection retry configuration options
- Add retry environment variables
- Fix asyncpg import in retry tests

(cherry picked from commit bd535e3e7a)
2025-12-04 19:06:25 +08:00
yangdx
39ad057384 Refactor PostgreSQL retry config to use centralized configuration
• Move retry config to ClientManager
• Remove env var parsing from PostgreSQLDB
• Add config params to test setup

(cherry picked from commit b3ed264707)
2025-12-04 19:06:06 +08:00
yangdx
492a4c9aa8 Condensed AGENTS.md to focus on essential development guidelines
(cherry picked from commit 8d3b53ce22)
2025-12-04 19:05:56 +08:00
yangdx
7de2cd98f6 Rename Agments.md to AGENTS.md and standardize formatting
(cherry picked from commit 6e39c0c0ff)
2025-12-04 19:05:56 +08:00
yangdx
c2c6ac3a45 Add AGENTS.md documentation section for AI coding agent guidance
(cherry picked from commit 1bf802eebf)
2025-12-04 19:05:56 +08:00
yangdx
06ceae92ab Add project intelligence files for AI agent collaboration
- Add .clinerules with technical patterns
- Create Agments.md for Codex agent guidance
- Ensures consistent behavior across all team members

(cherry picked from commit 577b9e6882)
2025-12-04 19:05:56 +08:00
yangdx
e0e228673c Make PostgreSQL statement_cache_size configuration optional
• Remove forced int conversion
• Allow None values for cache size
• Add conditional parameter setting

(cherry picked from commit f2c0b41e78)
2025-12-04 19:05:45 +08:00
Aleks Vujić
742e6958fe Fixed typo in log message when creating new graph file
(cherry picked from commit dd8f44e621)
2025-12-04 19:05:35 +08:00
kevinnkansah
8f5af8199b fix: renamed PostgreSQL options env variable and allowed LRU cache to be an optional env variable
(cherry picked from commit 22a7b482c5)
2025-12-04 19:05:24 +08:00
Tomek Cyran
4e93c9c21d Adding support for imagePullSecrets, envFrom, and deployment strategy in Helm chart
(cherry picked from commit 119d2fa171)
2025-12-04 19:05:14 +08:00
yangdx
c0ca40e366 Add doc_name field to full docs storage
- Store file_path in full_docs storage
- Update PostgreSQL implementation by mapping file_path to doc_name
- Other storage implementations automatically handle the new field

(cherry picked from commit 457d51952e)
2025-12-04 19:05:14 +08:00
kevinnkansah
85a4f0e68e feat: add options for Postgres connection
(cherry picked from commit 108cdbe133)
2025-12-04 19:05:14 +08:00
yangdx
5c8f3f9418 Remove future dependency and replace passlib with direct bcrypt
(cherry picked from commit fc44f11368)
2025-12-04 19:05:02 +08:00
yangdx
dec282694c Update .env loading and add API authentication to RAG evaluator
• Load .env from current directory
• Support LIGHTRAG_API_KEY auth header
• Override=False for env precedence
• Add Bearer token to API requests
• Enable per-instance .env configs

(cherry picked from commit 72db042667)
2025-12-04 19:04:25 +08:00
Raphaël MANSUY
6cca895ba9 Add logs for recent actions and decisions regarding upstream changes
- Documented major changes after pulling from upstream (HKUDS/LightRAG), focusing on multi-tenant support, security hardening, and RLS/RBAC.
- Created concise documentation under docs/diff_hku, including migration guides and security audits.
- Enumerated unmerged upstream commits and summarized substantive features and fixes.
- Outlined next steps for DB migrations, CI tests, and potential cherry-picking of upstream fixes.
2025-12-04 18:28:44 +08:00
Raphael MANSUY
2b292d4924 docs: Enterprise Edition & Multi-tenancy attribution (#5)
* Remove outdated documentation files: Quick Start Guide, Apache AGE Analysis, and Scratchpad.

* Add multi-tenant testing strategy and ADR index documentation

- Introduced ADR 008 detailing the multi-tenant testing strategy for the ./starter environment, covering compatibility and multi-tenant modes, testing scenarios, and implementation details.
- Created a comprehensive ADR index (README.md) summarizing all architecture decision records related to the multi-tenant implementation, including purpose, key sections, and reading paths for different roles.

* feat(docs): Add comprehensive multi-tenancy guide and README for LightRAG Enterprise

- Introduced `0008-multi-tenancy.md` detailing multi-tenancy architecture, key concepts, roles, permissions, configuration, and API endpoints.
- Created `README.md` as the main documentation index, outlining features, quick start, system overview, and deployment options.
- Documented the LightRAG architecture, storage backends, LLM integrations, and query modes.
- Established a task log (`2025-01-21-lightrag-documentation-log.md`) summarizing documentation creation actions, decisions, and insights.
2025-12-04 18:09:15 +08:00
Raphael MANSUY
fe9b8ec02a tests: stabilize integration tests + skip external services; fix multi-tenant API behavior and idempotency (#4)
* feat: Implement multi-tenant architecture with tenant and knowledge base models

- Added data models for tenants, knowledge bases, and related configurations.
- Introduced role and permission management for users in the multi-tenant system.
- Created a service layer for managing tenants and knowledge bases, including CRUD operations.
- Developed a tenant-aware instance manager for LightRAG with caching and isolation features.
- Added a migration script to transition existing workspace-based deployments to the new multi-tenant architecture.

* chore: ignore lightrag/api/webui/assets/ directory

* chore: stop tracking lightrag/api/webui/assets (ignore in .gitignore)

* feat: Initialize LightRAG Multi-Tenant Stack with PostgreSQL

- Added README.md for project overview, setup instructions, and architecture details.
- Created docker-compose.yml to define services: PostgreSQL, Redis, LightRAG API, and Web UI.
- Introduced env.example for environment variable configuration.
- Implemented init-postgres.sql for PostgreSQL schema initialization with multi-tenant support.
- Added reproduce_issue.py for testing default tenant access via API.

* feat: Enhance TenantSelector and update related components for improved multi-tenant support

* feat: Enhance testing capabilities and update documentation

- Updated Makefile to include new test commands for various modes (compatibility, isolation, multi-tenant, security, coverage, and dry-run).
- Modified API health check endpoint in Makefile to reflect new port configuration.
- Updated QUICK_START.md and README.md to reflect changes in service URLs and ports.
- Added environment variables for testing modes in env.example.
- Introduced run_all_tests.sh script to automate testing across different modes.
- Created conftest.py for pytest configuration, including database fixtures and mock services.
- Implemented database helper functions for streamlined database operations in tests.
- Added test collection hooks to skip tests based on the current MULTITENANT_MODE.

* feat: Implement multi-tenant support with demo mode enabled by default

- Added multi-tenant configuration to the environment and Docker setup.
- Created pre-configured demo tenants (acme-corp and techstart) for testing.
- Updated API endpoints to support tenant-specific data access.
- Enhanced Makefile commands for better service management and database operations.
- Introduced user-tenant membership system with role-based access control.
- Added comprehensive documentation for multi-tenant setup and usage.
- Fixed issues with document visibility in multi-tenant environments.
- Implemented necessary database migrations for user memberships and legacy support.

* feat(audit): Add final audit report for multi-tenant implementation

- Documented overall assessment, architecture overview, test results, security findings, and recommendations.
- Included detailed findings on critical security issues and architectural concerns.

fix(security): Implement security fixes based on audit findings

- Removed global RAG fallback and enforced strict tenant context.
- Configured super-admin access and required user authentication for tenant access.
- Cleared localStorage on logout and improved error handling in WebUI.

chore(logs): Create task logs for audit and security fixes implementation

- Documented actions, decisions, and next steps for both audit and security fixes.
- Summarized test results and remaining recommendations.

chore(scripts): Enhance development stack management scripts

- Added scripts for cleaning, starting, and stopping the development stack.
- Improved output messages and ensured graceful shutdown of services.

feat(starter): Initialize PostgreSQL with AGE extension support

- Created initialization scripts for PostgreSQL extensions including uuid-ossp, vector, and AGE.
- Ensured successful installation and verification of extensions.

* feat: Implement auto-select for first tenant and KB on initial load in WebUI

- Removed WEBUI_INITIAL_STATE_FIX.md as the issue is resolved.
- Added useTenantInitialization hook to automatically select the first available tenant and KB on app load.
- Integrated the new hook into the Root component of the WebUI.
- Updated RetrievalTesting component to ensure a KB is selected before allowing user interaction.
- Created end-to-end tests for multi-tenant isolation and real service interactions.
- Added scripts for starting, stopping, and cleaning the development stack.
- Enhanced API and tenant routes to support tenant-specific pipeline status initialization.
- Updated constants for backend URL to reflect the correct port.
- Improved error handling and logging in various components.

* feat: Add multi-tenant support with enhanced E2E testing scripts and client functionality

* update client

* Add integration and unit tests for multi-tenant API, models, security, and storage

- Implement integration tests for tenant and knowledge base management endpoints in `test_tenant_api_routes.py`.
- Create unit tests for tenant isolation, model validation, and role permissions in `test_tenant_models.py`.
- Add security tests to enforce role-based permissions and context validation in `test_tenant_security.py`.
- Develop tests for tenant-aware storage operations and context isolation in `test_tenant_storage_phase3.py`.

* feat(e2e): Implement OpenAI model support and database reset functionality

* Add comprehensive test suite for gpt-5-nano compatibility

- Introduced tests for parameter normalization, embeddings, and entity extraction.
- Implemented direct API testing for gpt-5-nano.
- Validated .env configuration loading and OpenAI API connectivity.
- Analyzed reasoning token overhead with various token limits.
- Documented test procedures and expected outcomes in README files.
- Ensured all tests pass for production readiness.

* kg(postgres_impl): ensure AGE extension is loaded in session and configure graph initialization

* dev: add hybrid dev helper scripts, Makefile, docker-compose.dev-db and local development docs

* feat(dev): add dev helper scripts and local development documentation for hybrid setup

* feat(multi-tenant): add detailed specifications and logs for multi-tenant improvements, including UX, backend handling, and ingestion pipeline

* feat(migration): add generated tenant/kb columns, indexes, triggers; drop unused tables; update schema and docs

* test(backward-compat): adapt tests to new StorageNameSpace/TenantService APIs (use concrete dummy storages)

* chore: multi-tenant and UX updates — docs, webui, storage, tenant service adjustments

* tests: stabilize integration tests + skip external services; fix multi-tenant API behavior and idempotency

- gpt5_nano_compatibility: add pytest-asyncio markers, skip when OPENAI key missing, prevent module-level asyncio.run collection, add conftest
- Ollama tests: add server availability check and skip markers; avoid pytest collection warnings by renaming helper classes
- Graph storage tests: rename interactive test functions to avoid pytest collection
- Document & Tenant routes: support external_ids for idempotency; ensure HTTPExceptions are re-raised
- LightRAG core: support external_ids in apipeline_enqueue_documents and idempotent logic
- Tests updated to match API changes (tenant routes & document routes)
- Add logs and scripts for inspection and audit
2025-12-04 16:04:21 +08:00
Raphaël MANSUY
a5eb441124 feat: Add multi-tenant architecture ADRs and deployment guide
- Introduced ADR 007: Deployment Guide and Quick Reference, detailing multi-tenant architecture components, setup instructions, and testing procedures.
- Created DELIVERY_MANIFEST.txt summarizing the multi-tenant ADR delivery, including document purposes, lengths, and key insights.
- Added README.md as a comprehensive index for all ADRs, providing navigation paths and role-specific reading recommendations.
2025-11-20 15:27:31 +08:00
Copilot
27f016901d Enhance LightRAG documentation with Bun, Drizzle ORM, and Hono for modern TypeScript migration (#2)
* Initial plan

* Add comprehensive Bun, Drizzle ORM, and Hono documentation

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

* Complete documentation update with Bun, Drizzle, and Hono integration

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

* Add Quick Start Guide and finalize documentation suite

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>
2025-10-01 13:36:29 +08:00
Copilot
c1b935a0b9 Add comprehensive TypeScript migration documentation for LightRAG (#1)
* Initial plan

* Initialize reverse documentation directory and create analysis plan

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

* Add executive summary and comprehensive architecture documentation

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

* Add comprehensive data models and dependency migration documentation

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

* Complete comprehensive TypeScript migration documentation suite

Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: raphaelmansuy <1003084+raphaelmansuy@users.noreply.github.com>
2025-10-01 12:00:26 +08:00
yangdx
19073319c1 Add @tanstack/react-table dependency for table functionality
• Add react-table v8.21.3
• Include table-core dependency
• Update package.json
• Update bun.lock file
2025-10-01 00:32:19 +08:00
yangdx
86195c613e Fix linting
2025-09-29 13:10:25 +08:00
yangdx
df43afc89b Relax conversation history role validation requirements
• Remove strict role value checking
• Allow any non-empty string roles
2025-09-29 13:10:15 +08:00
yangdx
ba216787c1 Update webui assets
2025-09-28 22:51:06 +08:00
yangdx
924d459420 feat(webui): Enhance KaTeX rendering and add robust error handling
- Differentiates between inline ($...$) and display ($$...$$) math for proper styling and layout.
- Adds custom CSS to ensure formulas correctly inherit text color, fixing issues in dark/light themes.
- Implements responsive handling for long formulas by allowing horizontal scrolling, preventing page overflow.
- Introduces a silent `errorCallback` for KaTeX to suppress console errors from invalid LaTeX syntax in production, while retaining warnings in development.
- Refactors KaTeX plugin loading to be more robust and simplifies CSS import by moving it to `main.tsx`.
2025-09-28 22:50:33 +08:00