Merge remote-tracking branch 'upstream/main'
Commit bb4d8181d5
7 changed files with 378 additions and 4 deletions

.clinerules/01-basic.md (new file, 207 lines)
@@ -0,0 +1,207 @@

# LightRAG Project Intelligence (.clinerules)

## Project Overview

LightRAG is a mature, production-ready Retrieval-Augmented Generation (RAG) system with comprehensive knowledge graph capabilities. It has grown from an experimental prototype into a stable system with extensive functionality across all major components.

## Current System State (August 15, 2025)

- **Status**: Production Ready - Stable and Mature
- **Configuration**: Gemini 2.5 Flash + BAAI/bge-m3 embeddings via custom endpoints
- **Storage**: Default in-memory with file persistence (JsonKVStorage, NetworkXStorage, NanoVectorDBStorage)
- **Language**: Chinese for summaries
- **Workspace**: `space1` for data isolation
- **Authentication**: JWT-based with admin/user accounts

## Critical Implementation Patterns

### 1. Embedding Format Compatibility (CRITICAL)

**Pattern**: Always handle both base64 and raw array embedding formats
**Location**: `lightrag/llm/openai.py` - `openai_embed` function
**Issue**: Custom OpenAI-compatible endpoints return embeddings as raw arrays, not base64 strings
**Solution**:

```python
np.array(dp.embedding, dtype=np.float32) if isinstance(dp.embedding, list)
else np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32)
```

**Impact**: Document processing fails completely without this dual-format support

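For context, a minimal sketch of how this conditional is applied to each returned data point when building the result array (the helper name is illustrative; in LightRAG the logic sits inline in `openai_embed`):

```python
import base64

import numpy as np


def decode_embeddings(data) -> np.ndarray:
    """Decode one batch of embedding data points into a 2-D float32 array."""
    return np.array(
        [
            # Custom OpenAI-compatible endpoints: raw list of floats
            np.array(dp.embedding, dtype=np.float32)
            if isinstance(dp.embedding, list)
            # Official endpoint with encoding_format="base64": packed float32 bytes
            else np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32)
            for dp in data
        ]
    )
```
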
### 2. Async Pattern Consistency (CRITICAL)

**Pattern**: Always await coroutines before calling methods on the result
**Common Error**: `coroutine.method()` instead of `(await coroutine).method()`
**Locations**: MongoDB implementations, Neo4j operations
**Example**: `await self._data.list_indexes()` then `await cursor.to_list()`

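A short illustration of the pattern, following the MongoDB example above (class name and exact driver method signatures are assumptions):

```python
class MongoIndexReader:                           # illustrative class name
    def __init__(self, collection):
        self._data = collection                   # async MongoDB collection

    async def index_names(self) -> list[str]:
        # Wrong: self._data.list_indexes().to_list() would call .to_list()
        # on the coroutine object itself and raise AttributeError.
        cursor = await self._data.list_indexes()  # await the coroutine first
        indexes = await cursor.to_list()          # then use the resulting cursor
        return [idx["name"] for idx in indexes]
```
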
### 3. Storage Layer Data Compatibility (CRITICAL)

**Pattern**: Always filter deprecated/incompatible fields during deserialization
**Common Fields to Remove**: `content`, `_id` (MongoDB), database-specific fields
**Implementation**: `data.pop('field_name', None)` before creating dataclass objects
**Locations**: All storage implementations (JSON, Redis, MongoDB, PostgreSQL)

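A minimal sketch of the filtering step (the dataclass here is illustrative, not the actual LightRAG model):

```python
from dataclasses import dataclass


@dataclass
class TextChunk:          # illustrative model
    chunk_id: str
    tokens: int


def chunk_from_record(record: dict) -> TextChunk:
    data = dict(record)
    # Drop fields that older records or specific backends still carry
    # but the current dataclass no longer accepts.
    data.pop("content", None)
    data.pop("_id", None)  # MongoDB adds this automatically
    return TextChunk(**data)
```
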
### 4. Lock Key Generation (CRITICAL)

**Pattern**: Always sort relationship pairs for consistent lock keys
**Implementation**: `sorted_key_parts = sorted([src, tgt])` then `f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"`
**Impact**: Prevents deadlocks in concurrent relationship processing

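A minimal sketch of the pattern (the function name is illustrative):

```python
def relationship_lock_key(src: str, tgt: str) -> str:
    """Build a direction-independent lock key for an entity pair."""
    sorted_key_parts = sorted([src, tgt])
    # (A, B) and (B, A) now map to the same key, so two workers can
    # never acquire the pair's locks in opposite orders.
    return f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"
```
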
### 5. Event Loop Management (CRITICAL)

**Pattern**: Handle event loop mismatches during shutdown gracefully
**Implementation**: Timeout plus specific RuntimeError handling for "attached to a different loop"
**Location**: Neo4j storage finalization
**Impact**: Prevents application shutdown failures

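A minimal sketch of the guard, assuming an async driver with a `close()` coroutine (the function name and timeout are illustrative):

```python
import asyncio


async def finalize_storage(driver, timeout: float = 5.0) -> None:
    """Illustrative shutdown guard for an async driver (e.g. Neo4j)."""
    try:
        await asyncio.wait_for(driver.close(), timeout=timeout)
    except asyncio.TimeoutError:
        pass  # prefer an unclean close over hanging the whole shutdown
    except RuntimeError as exc:
        # Swallow only the loop-mismatch error raised when finalization
        # runs on a different event loop than the one the driver used.
        if "attached to a different loop" not in str(exc):
            raise
```
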
## Architecture Patterns

### 1. Dependency Injection

**Pattern**: Pass configuration through object constructors, not direct imports
**Example**: OllamaAPI receives configuration through the LightRAG object
**Benefit**: Better testability and modularity

### 2. Memory Bank Documentation

**Pattern**: Maintain a comprehensive memory bank for development continuity
**Structure**: Core files (projectbrief.md, activeContext.md, progress.md, etc.)
**Purpose**: Essential for context preservation across development sessions

### 3. Configuration Management

**Pattern**: Centralize defaults in constants.py, use environment variables for runtime config
**Implementation**: Default values in constants, override via .env file (see the sketch below)
**Benefit**: Consistent configuration across components

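A minimal sketch of the pattern; the constant and variable names below are illustrative, not the actual LightRAG settings:

```python
import os

# constants.py - centralised defaults (illustrative names)
DEFAULT_SUMMARY_LANGUAGE = "English"
DEFAULT_TOP_K = 60

# elsewhere - a value from .env / the environment overrides the default
summary_language = os.environ.get("SUMMARY_LANGUAGE", DEFAULT_SUMMARY_LANGUAGE)
top_k = int(os.environ.get("TOP_K", DEFAULT_TOP_K))
```
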
## Development Workflow Patterns

### 1. Frontend Development (CRITICAL)

**Package Manager**: **ALWAYS USE BUN** - Never use npm or yarn unless Bun is unavailable
**Commands**:
- `bun install` - Install dependencies
- `bun run dev` - Start development server
- `bun run build` - Build for production
- `bun run lint` - Run linting
- `bun test` - Run tests
- `bun run preview` - Preview production build

**Pattern**: All frontend operations must use Bun commands
**Fallback**: Only use npm/yarn if Bun installation fails
**Testing**: Use `bun test` for all frontend testing

### 2. Bug Fix Approach

1. **Identify root cause** - Don't just fix symptoms
2. **Implement robust solution** - Handle edge cases and format variations
3. **Maintain backward compatibility** - Preserve existing functionality
4. **Add comprehensive error handling** - Graceful degradation
5. **Document the fix** - Update memory bank with technical details

### 3. Feature Implementation

1. **Follow existing patterns** - Maintain architectural consistency
2. **Use dependency injection** - Avoid direct imports between modules
3. **Implement comprehensive error handling** - Handle all failure modes
4. **Add proper logging** - Debug and warning messages
5. **Update documentation** - Memory bank and code comments
6. **Comment Language** - Use English for comments and documentation

### 4. Performance Optimization

1. **Profile before optimizing** - Identify actual bottlenecks
2. **Maintain algorithmic correctness** - Don't sacrifice functionality for speed
3. **Use appropriate data structures** - Match structure to access patterns
4. **Implement caching strategically** - Cache expensive operations
5. **Monitor memory usage** - Prevent memory leaks

## Technology Stack Intelligence

### 1. LLM Integration

- **Primary**: Gemini 2.5 Flash via custom endpoint
- **Embedding**: BAAI/bge-m3 via custom endpoint
- **Reranking**: BAAI/bge-reranker-v2-m3
- **Pattern**: Always handle multiple provider formats

### 2. Storage Backends

- **Default**: In-memory with file persistence
- **Production Options**: PostgreSQL, MongoDB, Redis, Neo4j
- **Pattern**: Abstract storage interface with multiple implementations

### 3. API Architecture

- **Framework**: FastAPI with Gunicorn for production
- **Authentication**: JWT-based with role support
- **Compatibility**: Ollama-compatible endpoints for easy integration

### 4. Frontend

- **Framework**: React with TypeScript
- **Package Manager**: **BUN (REQUIRED)** - Always use Bun for all frontend operations
- **Build Tool**: Vite with Bun runtime
- **Visualization**: Sigma.js for graph rendering
- **State Management**: React hooks with context
- **Internationalization**: i18next for multi-language support

## Common Pitfalls and Solutions

### 1. Embedding Format Issues

**Pitfall**: Assuming all endpoints return base64-encoded embeddings
**Solution**: Always check format and handle both base64 and raw arrays

### 2. Async/Await Patterns

**Pitfall**: Calling methods on coroutines instead of awaited results
**Solution**: Always await coroutines before accessing their methods

### 3. Data Model Evolution

**Pitfall**: Breaking changes when removing fields from dataclasses
**Solution**: Filter deprecated fields during deserialization instead of breaking existing stored data

### 4. Concurrency Issues

**Pitfall**: Inconsistent lock key generation causing deadlocks
**Solution**: Always sort keys for deterministic lock ordering

### 5. Event Loop Management

**Pitfall**: Event loop mismatches during shutdown
**Solution**: Implement timeout and specific error handling for loop issues

## Performance Considerations

### 1. Query Context Building

- **Algorithm**: Linear gradient weighted polling for fair resource allocation
- **Optimization**: Round-robin merging to eliminate mode bias
- **Pattern**: Smart chunk selection based on cross-entity occurrence

### 2. Graph Operations

- **Optimization**: Batch operations where possible (NetworkX sketch below)
- **Pattern**: Use appropriate indexing for large datasets
- **Consideration**: Memory usage with large graphs

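For the default NetworkX backend, batching looks roughly like this (entity names and attributes are illustrative):

```python
import networkx as nx

graph = nx.Graph()
# One call over an iterable is cheaper than many single add_node/add_edge calls.
graph.add_nodes_from(
    [("entity_a", {"type": "person"}), ("entity_b", {"type": "organization"})]
)
graph.add_edges_from([("entity_a", "entity_b", {"relation": "works_for"})])
```
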
### 3. LLM Request Management

- **Pattern**: Priority-based queue for request ordering
- **Optimization**: Connection pooling and retry mechanisms
- **Consideration**: Rate limiting and cost management

## Security Patterns

### 1. Authentication

- **Implementation**: JWT tokens with role-based access (see the sketch below)
- **Pattern**: Stateless authentication with configurable expiration
- **Security**: Proper token validation and refresh mechanisms

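A minimal issue/verify sketch of this pattern, using PyJWT purely for illustration; LightRAG's actual auth helpers, secret handling, and refresh flow live in the API layer and may differ:

```python
import datetime

import jwt  # PyJWT, illustrative only

SECRET_KEY = "change-me"  # load from configuration in real deployments


def issue_token(username: str, role: str, expire_minutes: int = 60) -> str:
    payload = {
        "sub": username,
        "role": role,
        "exp": datetime.datetime.now(datetime.timezone.utc)
        + datetime.timedelta(minutes=expire_minutes),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")


def verify_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on failure.
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
```
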
### 2. API Security

- **Pattern**: Input validation and sanitization
- **Implementation**: FastAPI dependency injection for auth (see the sketch below)
- **Consideration**: Rate limiting and abuse prevention

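A minimal sketch of auth via FastAPI dependency injection; the route, token URL, and reuse of the `verify_token` helper sketched above are assumptions, not LightRAG's actual endpoints:

```python
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="login")  # illustrative token URL


async def get_current_user(token: str = Depends(oauth2_scheme)) -> dict:
    try:
        return verify_token(token)  # e.g. the JWT verify helper sketched above
    except Exception:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )


@app.get("/health")
async def health(user: dict = Depends(get_current_user)) -> dict:
    return {"status": "ok", "user": user.get("sub")}
```
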
## Maintenance Guidelines

### 1. Memory Bank Updates

- **Trigger**: After significant changes or bug fixes
- **Pattern**: Update activeContext.md and progress.md
- **Purpose**: Maintain development continuity

### 2. Configuration Management

- **Pattern**: Environment-based configuration with sensible defaults
- **Implementation**: .env files with example templates
- **Consideration**: Security for production deployments

### 3. Error Handling

- **Pattern**: Comprehensive logging with appropriate levels
- **Implementation**: Graceful degradation where possible
- **Consideration**: User-friendly error messages

## Project Evolution Notes

The project has evolved from experimental to production-ready status. Key milestones:

- **Early 2025**: Basic RAG implementation
- **Mid 2025**: Multiple storage backends and LLM providers
- **July 2025**: Major query optimization and algorithm improvements
- **August 2025**: Production-ready stable state

The system now supports enterprise-level deployments with comprehensive functionality across all components.

.gitignore (vendored, 1 line changed)

@@ -73,4 +73,3 @@ test_*
 # Cline files
 memory-bank
 memory-bank/
-.clinerules

Agments.md (new file, 108 lines)
@@ -0,0 +1,108 @@

# Project Guide for AI Agents

This Agments.md file provides operational guidance for AI assistants collaborating on the LightRAG codebase. Use it to understand the repository layout, preferred tooling, and expectations for adding or modifying functionality.

## Core Purpose

LightRAG is an advanced Retrieval-Augmented Generation (RAG) framework designed to enhance information retrieval and generation through graph-based knowledge representation. The project aims to provide a more intelligent and efficient way to process and retrieve information from documents by leveraging both graph structures and vector embeddings.

## Project Structure for Navigation

- `/lightrag`: Core Python package (ingestion, querying, storage abstractions, utilities). Key modules include `lightrag/lightrag.py` orchestration, `operate.py` pipeline helpers, `kg/` storage backends, `llm/` bindings, and `utils*.py`.
- `/lightrag/api`: FastAPI service for LightRAG, run with Gunicorn in production. The application, auth, and WebUI assets live in `lightrag_server.py`; routers live in `routers/`, shared helpers in `utils_api.py`, and Gunicorn startup logic in `run_with_gunicorn.py`.
- `/lightrag_webui`: React 19 + TypeScript + Tailwind front-end built with Vite/Bun. Uses component folders under `src/` and configuration via `env.*.sample`.
- `/inputs`, `/rag_storage`, `/dickens`, `/temp`: Data directories. Treat contents as mutable working data; avoid committing generated artefacts.
- `/tests` and root-level `test_*.py`: Integration and smoke-test scripts (graph storage, API endpoints, behaviour regressions). Many expect specific environment variables or services.
- `/docs`, `/k8s-deploy`, `docker-compose.yml`: Deployment notes, Kubernetes manifests, and container orchestration helpers.
- Configuration templates: `.env.example`, `config.ini.example`, `lightrag.service.example`. Copy and adapt for local runs without committing secrets.

## Environment Setup and Tooling

- Python 3.10 is required. Recommended bootstrap:

```bash
# Development installation
python -m venv .venv
source .venv/bin/activate
pip install -e .
pip install -e .[api]

# Start API server
lightrag-server

# Production deployment
lightrag-gunicorn --workers 3
```

- Duplicate `.env.example` to `.env` and adjust storage, LLM, and reranker bindings. Mirror `config.ini.example` when customising pipeline defaults.
- Storage backends (PostgreSQL, Redis, Neo4j, Milvus, etc.) are selected via `LIGHTRAG_*` environment variables. Ensure connection URLs and credentials are in place before running ingestion or tests.
- CLI entry points: `python -m lightrag` for package usage, `lightrag-server` (or `uvicorn lightrag.api.lightrag_server:app --reload`) for the API, `lightrag-gunicorn` for production Gunicorn runs.
- Front-end work: install dependencies with `bun install` (preferred) or `npm install`, then use the `bunx --bun vite` commands defined in `package.json`.

## Frontend Development

- **Package Manager**: **ALWAYS USE BUN** - Never use npm or yarn unless Bun is unavailable

**Commands**:

- `bun install` - Install dependencies
- `bun run dev` - Start development server
- `bun run build` - Build for production
- `bun run lint` - Run linting
- `bun test` - Run tests
- `bun run preview` - Preview production build

- **Pattern**: All frontend operations must use Bun commands
- **Testing**: Use `bun test` for all frontend testing

## Coding Conventions

- Embrace type hints, dataclasses, and asynchronous patterns already present in `lightrag/lightrag.py` and the storage implementations. Keep long-running jobs within `asyncio` flows and reuse helpers from `lightrag.operate`.
- Honour abstraction boundaries: new storage providers should inherit from the relevant base classes in `lightrag.base`; reusable logic belongs in `utils.py`/`utils_graph.py`.
- Use `lightrag.utils.logger` (not bare `print`) and let environment toggles (`VERBOSE`, `LOG_LEVEL`) control verbosity.
- Respect configuration defaults in `lightrag/constants.py`, extending with care and synchronising related documentation when behaviour changes.
- API additions should live under `lightrag/api/routers`, leverage dependency injection from `utils_api.py`, and return structured responses consistent with existing handlers.
- Front-end code should remain in TypeScript, rely on functional React components with hooks, and follow Tailwind utility style. Co-locate component-specific styles; reserve custom CSS for cases Tailwind cannot cover.
- Storage Backends
  - **Default**: In-memory with file persistence
  - **Production Options**: PostgreSQL, MongoDB, Redis, Neo4j
  - **Pattern**: Abstract storage interface with multiple implementations
- Lock Key Generation Consistency
  - **Critical Pattern**: Always sort parameters for lock key generation to prevent deadlocks
  - **Example**: `sorted_key_parts = sorted([src, tgt])` before creating the lock key
  - **Why**: Prevents different lock keys for the same relationship pair processed in different orders
  - **Apply to**: Any function that uses locks with multiple parameters
- Priority Queue Implementation (a minimal sketch follows this list)
  - **Pattern**: Use priority-based task queuing for LLM requests
  - **Benefits**: Critical operations get higher priority
  - **Implementation**: Lower priority values = higher priority

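A minimal sketch of the priority-queue idea with `asyncio.PriorityQueue`; LightRAG's real implementation wraps its LLM request functions and differs in detail:

```python
import asyncio
import itertools

_order = itertools.count()  # tie-breaker so equal priorities never compare callables


async def llm_worker(queue: asyncio.PriorityQueue) -> None:
    while True:
        priority, _, request = await queue.get()  # lowest priority value served first
        try:
            await request()                        # request is an async callable
        finally:
            queue.task_done()


# Usage sketch: critical extraction calls outrank background re-ranking.
# queue = asyncio.PriorityQueue()
# await queue.put((0, next(_order), run_entity_extraction))
# await queue.put((5, next(_order), run_rerank))
```
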
## Testing and Quality Gates

- Run Python tests with `python -m pytest tests` for the FastAPI suite, and execute targeted scripts (for example `python tests/test_graph_storage.py`, `python test_lock_fix.py`) when touching related functionality. Many scripts require running backing services; check `.env` for prerequisites.
- Perform linting via `ruff check .` (configured in `pyproject.toml`) and address warnings. For formatting, match the existing style rather than introducing new tools.
- Front-end validation: `bun test`, `bunx --bun vite build`, and `bunx --bun vite lint`. The `*-no-bun` scripts exist if Bun is unavailable.
- When touching deployment assets, ensure `docker-compose config` or the relevant `kubectl` dry-runs succeed before submitting changes.

## Runtime and Operational Notes

- Knowledge ingestion expects documents inside `inputs/` and writes intermediate state to `rag_storage/`. Keep these directories gitignored; never check in private data or large artefacts.
- Use `operate.py` helpers (e.g., `chunking_by_token_size`, `extract_entities`) to keep ingestion behaviour consistent. If extending the pipeline, document new steps in `docs/` and update any affected CLI usage.
- The API and core package rely on `.env`/`config.ini` being co-located with the current working directory. Scripts such as `tests/test_graph_storage.py` dynamically read these files; ensure they are in sync.

## Contribution Checklist

1. Run `pre-commit run --all-files` before submitting a PR.
2. Describe the change, affected modules, and operational impact in your PR. Mention any new environment knobs or storage dependencies.
3. Link related issues or discussions when available.
4. Confirm all applicable checks pass (`ruff`, pytest suite, targeted integration scripts, front-end build/tests when touched).
5. Capture screenshots or GIFs for front-end or API changes that affect user-visible behaviour.
6. Keep each PR focused on a single concern and update documentation (`README.md`, `docs/`, `.env.example`) when behaviour or configuration changes.

Follow this playbook to keep LightRAG contributions predictable, testable, and production-ready.

@@ -310,6 +310,14 @@ POSTGRES_IVFFLAT_LISTS=100
 # POSTGRES_SSL_ROOT_CERT=/path/to/ca-cert.pem
 # POSTGRES_SSL_CRL=/path/to/crl.pem
+
+### PostgreSQL Server Settings (for Supabase Supavisor)
+# Use this to pass extra options to the PostgreSQL connection string.
+# For Supabase, you might need to set it like this:
+# POSTGRES_SERVER_SETTINGS="options=reference%3D[project-ref]"
+
+# Default is 100 set to 0 to disable
+# POSTGRES_STATEMENT_CACHE_SIZE=100
+
 ### Neo4j Configuration
 NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io
 NEO4J_USERNAME=neo4j

@@ -65,7 +65,7 @@ class NetworkXStorage(BaseGraphStorage):
             )
         else:
             logger.info(
-                f"[{self.workspace}] Created new empty graph fiel: {self._graphml_xml_file}"
+                f"[{self.workspace}] Created new empty graph file: {self._graphml_xml_file}"
             )
         self._graph = preloaded_graph or nx.Graph()

@@ -79,6 +79,12 @@ class PostgreSQLDB:
         self.hnsw_ef = config.get("hnsw_ef")
         self.ivfflat_lists = config.get("ivfflat_lists")
+
+        # Server settings
+        self.server_settings = config.get("server_settings")
+
+        # Statement LRU cache size (keep as-is, allow None for optional configuration)
+        self.statement_cache_size = config.get("statement_cache_size")

         if self.user is None or self.password is None or self.database is None:
             raise ValueError("Missing database user, password, or database")

@@ -165,6 +171,15 @@ class PostgreSQLDB:
             "max_size": self.max,
         }
+
+        # Only add statement_cache_size if it's configured
+        if self.statement_cache_size is not None:
+            connection_params["statement_cache_size"] = int(
+                self.statement_cache_size
+            )
+            logger.info(
+                f"PostgreSQL, statement LRU cache size set as: {self.statement_cache_size}"
+            )

         # Add SSL configuration if provided
         ssl_context = self._create_ssl_context()
         if ssl_context is not None:

@@ -178,6 +193,24 @@ class PostgreSQLDB:
             connection_params["ssl"] = False
             logger.info(f"PostgreSQL, SSL mode set to: {self.ssl_mode}")

+        # Add server settings if provided
+        if self.server_settings:
+            try:
+                settings = {}
+                # The format is expected to be a query string, e.g., "key1=value1&key2=value2"
+                pairs = self.server_settings.split("&")
+                for pair in pairs:
+                    if "=" in pair:
+                        key, value = pair.split("=", 1)
+                        settings[key] = value
+                if settings:
+                    connection_params["server_settings"] = settings
+                    logger.info(f"PostgreSQL, Server settings applied: {settings}")
+            except Exception as e:
+                logger.warning(
+                    f"PostgreSQL, Failed to parse server_settings: {self.server_settings}, error: {e}"
+                )
+
         self.pool = await asyncpg.create_pool(**connection_params)  # type: ignore

         # Ensure VECTOR extension is available

@@ -833,8 +866,8 @@ class PostgreSQLDB:
                 # Execute the migration
                 alter_sql = f"""
-                ALTER TABLE {migration['table']}
-                ALTER COLUMN {migration['column']} TYPE {migration['new_type']}
+                ALTER TABLE {migration["table"]}
+                ALTER COLUMN {migration["column"]} TYPE {migration["new_type"]}
                 """

                 await self.execute(alter_sql)

@@ -1464,6 +1497,15 @@ class ClientManager:
                     config.get("postgres", "ivfflat_lists", fallback="100"),
                 )
             ),
+            # Server settings for Supabase
+            "server_settings": os.environ.get(
+                "POSTGRES_SERVER_SETTINGS",
+                config.get("postgres", "server_options", fallback=None),
+            ),
+            "statement_cache_size": os.environ.get(
+                "POSTGRES_STATEMENT_CACHE_SIZE",
+                config.get("postgres", "statement_cache_size", fallback=None),
+            ),
         }

     @classmethod

@@ -579,6 +579,7 @@ async def openai_embed(
     base_url: str | None = None,
     api_key: str | None = None,
     client_configs: dict[str, Any] | None = None,
+    token_tracker: Any | None = None,
 ) -> np.ndarray:
     """Generate embeddings for a list of texts using OpenAI's API.

@@ -590,6 +591,7 @@ async def openai_embed(
         client_configs: Additional configuration options for the AsyncOpenAI client.
             These will override any default configurations but will be overridden by
             explicit parameters (api_key, base_url).
+        token_tracker: Optional token usage tracker for monitoring API usage.

     Returns:
         A numpy array of embeddings, one per input text.

@@ -608,6 +610,14 @@ async def openai_embed(
     response = await openai_async_client.embeddings.create(
         model=model, input=texts, encoding_format="base64"
     )
+
+    if token_tracker and hasattr(response, "usage"):
+        token_counts = {
+            "prompt_tokens": getattr(response.usage, "prompt_tokens", 0),
+            "total_tokens": getattr(response.usage, "total_tokens", 0),
+        }
+        token_tracker.add_usage(token_counts)
+
     return np.array(
         [
             np.array(dp.embedding, dtype=np.float32)