cherry-pick 9d69e8d7
parent c9abde6979
commit cf0899c063
3 changed files with 143 additions and 105 deletions

@@ -29,15 +29,15 @@ cd lightrag
# Create a Python virtual environment
uv venv --seed --python 3.12
source .venv/bin/acivate
source .venv/bin/activate

# Install in editable mode with API support
pip install -e ".[api]"

# Build front-end artifacts
cd lightrag_webui
bun install --frozen-lockfile --production
bun run build --emptyOutDir
bun install --frozen-lockfile
bun run build
cd ..
```
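After the editable install and the front-end build above, the server can be started from the same virtual environment. A minimal sketch, assuming the `lightrag-server` entry point provided by the `[api]` extra and a configured `.env` in the working directory:

```bash
# Assumption: run inside the activated .venv created above; settings are read from .env
lightrag-server
```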

@@ -119,37 +119,13 @@ During startup, configurations in the `.env` file can be overridden by command-l
### Launching LightRAG Server with Docker

* Prepare the .env file:
Create a personalized .env file by copying the sample file [`env.example`](env.example). Configure the LLM and embedding parameters according to your requirements.
Using Docker Compose is the most convenient way to deploy and run the LightRAG Server.

* Create a file named `docker-compose.yml`:
* Create a project directory.

```yaml
services:
  lightrag:
    container_name: lightrag
    image: ghcr.io/hkuds/lightrag:latest
    build:
      context: .
      dockerfile: Dockerfile
      tags:
        - ghcr.io/hkuds/lightrag:latest
    ports:
      - "${PORT:-9621}:9621"
    volumes:
      - ./data/rag_storage:/app/data/rag_storage
      - ./data/inputs:/app/data/inputs
      - ./data/tiktoken:/app/data/tiktoken
      - ./config.ini:/app/config.ini
      - ./.env:/app/.env
    env_file:
      - .env
    environment:
      - TIKTOKEN_CACHE_DIR=/app/data/tiktoken
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
```
* Copy the `docker-compose.yml` file from the LightRAG repository into your project directory.
* Prepare the `.env` file: Duplicate the sample file [`env.example`](https://ai.znipower.com:5013/c/env.example) to create a customized `.env` file, and configure the LLM and embedding parameters according to your specific requirements.

* Start the LightRAG Server with the following command:

@@ -158,11 +134,11 @@ docker compose up
# If you want the program to run in the background after startup, add the -d parameter at the end of the command.
```

> You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)
You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag). For more details about docker deployment, please refer to [DockerDeployment.md](./../../docs/DockerDeployment.md).

### Offline Deployment

For offline or air-gapped environments, see the [Offline Deployment Guide](./../../docs/OfflineDeployment.md) for instructions on pre-installing all dependencies and cache files.
Official LightRAG Docker images are fully compatible with offline or air-gapped environments. If you want to build your own offline environment, please refer to the [Offline Deployment Guide](./../../docs/OfflineDeployment.md).

### Starting Multiple LightRAG Instances

@@ -189,7 +165,8 @@ Configuring an independent working directory and a dedicated `.env` configuratio
The command-line `workspace` argument and the `WORKSPACE` environment variable in the `.env` file can both be used to specify the workspace name for the current instance, with the command-line argument having higher priority. Here is how workspaces are implemented for different types of storage:

- **For local file-based databases, data isolation is achieved through workspace subdirectories:** `JsonKVStorage`, `JsonDocStatusStorage`, `NetworkXStorage`, `NanoVectorDBStorage`, `FaissVectorDBStorage`.
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `QdrantVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
- **For databases that store data in collections, it's done by adding a workspace prefix to the collection name:** `RedisKVStorage`, `RedisDocStatusStorage`, `MilvusVectorDBStorage`, `MongoKVStorage`, `MongoDocStatusStorage`, `MongoVectorDBStorage`, `MongoGraphStorage`, `PGGraphStorage`.
- **For Qdrant vector database, data isolation is achieved through payload-based partitioning (Qdrant's recommended multitenancy approach):** `QdrantVectorDBStorage` uses shared collections with payload filtering for unlimited workspace scalability.
- **For relational databases, data isolation is achieved by adding a `workspace` field to the tables for logical data separation:** `PGKVStorage`, `PGVectorStorage`, `PGDocStatusStorage`.
- **For graph databases, logical data isolation is achieved through labels:** `Neo4JStorage`, `MemgraphStorage`
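A minimal sketch of running two isolated instances against the same storage backends, based on the `WORKSPACE` variable and the command-line `workspace` argument described above (the `lightrag-server` entry point, the `PORT` variable, and the workspace names are assumptions used for illustration):

```bash
# Instance A: workspace taken from the environment (or .env)
WORKSPACE=team_a PORT=9621 lightrag-server

# Instance B: the command-line argument overrides any WORKSPACE set in .env
PORT=9622 lightrag-server --workspace team_b
```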

@@ -486,6 +463,59 @@ The `/query` and `/query/stream` API endpoints include an `enable_rerank` parame
RERANK_BY_DEFAULT=False
```
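With that default in place, reranking can still be switched on per request through the `enable_rerank` parameter mentioned above. A minimal sketch, assuming the server is reachable on the default port 9621 with no API key configured (URL and query text are illustrative):

```bash
# enable_rerank overrides RERANK_BY_DEFAULT for this single query
curl -s -X POST http://localhost:9621/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is LightRAG?", "mode": "mix", "enable_rerank": true}'
```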

### Include Chunk Content in References

By default, the `/query` and `/query/stream` endpoints return references with only `reference_id` and `file_path`. For evaluation, debugging, or citation purposes, you can request the actual retrieved chunk content to be included in references.

The `include_chunk_content` parameter (default: `false`) controls whether the actual text content of retrieved chunks is included in the response references. This is particularly useful for:

- **RAG Evaluation**: Testing systems like RAGAS that need access to retrieved contexts
- **Debugging**: Verifying what content was actually used to generate the answer
- **Citation Display**: Showing users the exact text passages that support the response
- **Transparency**: Providing full visibility into the RAG retrieval process

**Important**: The `content` field is an **array of strings**, where each string represents a chunk from the same file. A single file may correspond to multiple chunks, so the content is returned as a list to preserve chunk boundaries.

**Example API Request:**

```json
{
  "query": "What is LightRAG?",
  "mode": "mix",
  "include_references": true,
  "include_chunk_content": true
}
```

**Example Response (with chunk content):**

```json
{
  "response": "LightRAG is a graph-based RAG system...",
  "references": [
    {
      "reference_id": "1",
      "file_path": "/documents/intro.md",
      "content": [
        "LightRAG is a retrieval-augmented generation system that combines knowledge graphs with vector similarity search...",
        "The system uses a dual-indexing approach with both vector embeddings and graph structures for enhanced retrieval..."
      ]
    },
    {
      "reference_id": "2",
      "file_path": "/documents/features.md",
      "content": [
        "The system provides multiple query modes including local, global, hybrid, and mix modes..."
      ]
    }
  ]
}
```

**Notes**:
- This parameter only works when `include_references=true`. Setting `include_chunk_content=true` without including references has no effect.
- **Breaking Change**: Prior versions returned `content` as a single concatenated string. Now it returns an array of strings to preserve individual chunk boundaries. If you need a single string, join the array elements with your preferred separator (e.g., `"\n\n".join(content)`).
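If a client still needs the old single-string behavior, the join can be done on the client side. A minimal sketch, assuming the server is reachable on the default port 9621 with no API key configured and that `jq` is installed (URL and query text are illustrative):

```bash
# Flatten each reference's content array back into one string per file
curl -s -X POST http://localhost:9621/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What is LightRAG?", "mode": "mix", "include_references": true, "include_chunk_content": true}' \
  | jq -r '.references[] | .file_path + ":\n" + ((.content // []) | join("\n\n"))'
```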

### .env Examples

```bash

@@ -4,17 +4,15 @@ This module contains all query-related routes for the LightRAG API.
import json
import logging
from typing import Any, Dict, List, Literal, Optional
from typing import Any, Dict, List, Literal, Optional, Union

from fastapi import APIRouter, Depends, HTTPException
from lightrag import LightRAG
from lightrag.base import QueryParam
from lightrag.api.utils_api import get_combined_auth_dependency
from lightrag.api.dependencies import get_tenant_context_optional
from lightrag.models.tenant import TenantContext
from lightrag.tenant_rag_manager import TenantRAGManager
from pydantic import BaseModel, Field, field_validator

from ascii_colors import trace_exception

router = APIRouter(tags=["query"])

@@ -152,9 +150,9 @@ class QueryResponse(BaseModel):
    response: str = Field(
        description="The generated response",
    )
    references: Optional[List[Dict[str, str]]] = Field(
    references: Optional[List[Dict[str, Union[str, List[str]]]]] = Field(
        default=None,
        description="Reference list (Disabled when include_references=False, /query/data always includes references.)",
        description="Reference list (Disabled when include_references=False, /query/data always includes references.). The 'content' field in each reference is a list of strings when include_chunk_content=True.",
    )

@@ -184,38 +182,8 @@ class StreamChunkResponse(BaseModel):
    )

def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag_manager: Optional[TenantRAGManager] = None):
def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60):
    combined_auth = get_combined_auth_dependency(api_key)

    # SEC-001 FIX: Check strict mode configuration
    try:
        from lightrag.api.config import MULTI_TENANT_STRICT_MODE
        strict_mode = MULTI_TENANT_STRICT_MODE
    except ImportError:
        strict_mode = False

    async def get_tenant_rag(tenant_context: Optional[TenantContext] = Depends(get_tenant_context_optional)) -> LightRAG:
        """Dependency to get tenant-specific RAG instance for query operations.

        In strict multi-tenant mode, raises error if tenant context is missing.
        In non-strict mode, falls back to global RAG for backward compatibility.
        """
        if rag_manager and tenant_context and tenant_context.tenant_id and tenant_context.kb_id:
            return await rag_manager.get_rag_instance(
                tenant_context.tenant_id,
                tenant_context.kb_id,
                tenant_context.user_id  # Pass user_id for security validation
            )

        # SEC-001 FIX: In strict mode, don't allow fallback to global RAG
        if strict_mode and rag_manager:
            from fastapi import HTTPException
            raise HTTPException(
                status_code=400,
                detail="Tenant context required. Provide X-Tenant-ID and X-KB-ID headers."
            )

        return rag

    @router.post(
        "/query",

@@ -240,6 +208,11 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
"properties": {
|
||||
"reference_id": {"type": "string"},
|
||||
"file_path": {"type": "string"},
|
||||
"content": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": "List of chunk contents from this file (only included when include_chunk_content=True)",
|
||||
},
|
||||
},
|
||||
},
|
||||
"description": "Reference list (only included when include_references=True)",
|
||||
|
|
@ -267,19 +240,24 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
|
|||
                },
                "with_chunk_content": {
                    "summary": "Response with chunk content",
                    "description": "Example response when include_references=True and include_chunk_content=True",
                    "description": "Example response when include_references=True and include_chunk_content=True. Note: content is an array of chunks from the same file.",
                    "value": {
                        "response": "Artificial Intelligence (AI) is a branch of computer science that aims to create intelligent machines capable of performing tasks that typically require human intelligence, such as learning, reasoning, and problem-solving.",
                        "references": [
                            {
                                "reference_id": "1",
                                "file_path": "/documents/ai_overview.pdf",
                                "content": "Artificial Intelligence (AI) represents a transformative field in computer science focused on creating systems that can perform tasks requiring human-like intelligence. These tasks include learning from experience, understanding natural language, recognizing patterns, and making decisions.",
                                "content": [
                                    "Artificial Intelligence (AI) represents a transformative field in computer science focused on creating systems that can perform tasks requiring human-like intelligence. These tasks include learning from experience, understanding natural language, recognizing patterns, and making decisions.",
                                    "AI systems can be categorized into narrow AI, which is designed for specific tasks, and general AI, which aims to match human cognitive abilities across a wide range of domains.",
                                ],
                            },
                            {
                                "reference_id": "2",
                                "file_path": "/documents/machine_learning.txt",
                                "content": "Machine learning is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. It focuses on the development of algorithms that can access data and use it to learn for themselves.",
                                "content": [
                                    "Machine learning is a subset of AI that enables computers to learn and improve from experience without being explicitly programmed. It focuses on the development of algorithms that can access data and use it to learn for themselves."
                                ],
                            },
                        ],
                    },

@@ -336,11 +314,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
            },
        },
    )
    async def query_text(
        request: QueryRequest,
        tenant_context: Optional[TenantContext] = Depends(get_tenant_context_optional),
        tenant_rag: LightRAG = Depends(get_tenant_rag)
    ):
    async def query_text(request: QueryRequest):
        """
        Comprehensive RAG query endpoint with non-streaming response. Parameter "stream" is ignored.

@@ -427,7 +401,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
            param.stream = False

            # Unified approach: always use aquery_llm for both cases
            result = await tenant_rag.aquery_llm(request.query, param=param)
            result = await rag.aquery_llm(request.query, param=param)

            # Extract LLM response and references from unified result
            llm_response = result.get("llm_response", {})

@@ -448,11 +422,8 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
                    ref_id = chunk.get("reference_id", "")
                    content = chunk.get("content", "")
                    if ref_id and content:
                        # If multiple chunks have same reference_id, concatenate
                        if ref_id in ref_id_to_content:
                            ref_id_to_content[ref_id] += "\n\n" + content
                        else:
                            ref_id_to_content[ref_id] = content
                        # Collect chunk content; join later to avoid quadratic string concatenation
                        ref_id_to_content.setdefault(ref_id, []).append(content)

                # Add content to references
                enriched_references = []

@@ -460,6 +431,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
                    ref_copy = ref.copy()
                    ref_id = ref.get("reference_id", "")
                    if ref_id in ref_id_to_content:
                        # Keep content as a list of chunks (one file may have multiple chunks)
                        ref_copy["content"] = ref_id_to_content[ref_id]
                    enriched_references.append(ref_copy)
                references = enriched_references

@@ -470,6 +442,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
            else:
                return QueryResponse(response=response_content, references=None)
        except Exception as e:
            trace_exception(e)
            raise HTTPException(status_code=500, detail=str(e))

    @router.post(

@@ -492,6 +465,11 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
"description": "Multiple NDJSON lines when stream=True and include_references=True. First line contains references, subsequent lines contain response chunks.",
|
||||
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai_overview.pdf"}, {"reference_id": "2", "file_path": "/documents/ml_basics.txt"}]}\n{"response": "Artificial Intelligence (AI) is a branch of computer science"}\n{"response": " that aims to create intelligent machines capable of performing"}\n{"response": " tasks that typically require human intelligence, such as learning,"}\n{"response": " reasoning, and problem-solving."}',
|
||||
},
|
||||
"streaming_with_chunk_content": {
|
||||
"summary": "Streaming mode with chunk content (stream=true, include_chunk_content=true)",
|
||||
"description": "Multiple NDJSON lines when stream=True, include_references=True, and include_chunk_content=True. First line contains references with content arrays (one file may have multiple chunks), subsequent lines contain response chunks.",
|
||||
"value": '{"references": [{"reference_id": "1", "file_path": "/documents/ai_overview.pdf", "content": ["Artificial Intelligence (AI) represents a transformative field...", "AI systems can be categorized into narrow AI and general AI..."]}, {"reference_id": "2", "file_path": "/documents/ml_basics.txt", "content": ["Machine learning is a subset of AI that enables computers to learn..."]}]}\n{"response": "Artificial Intelligence (AI) is a branch of computer science"}\n{"response": " that aims to create intelligent machines capable of performing"}\n{"response": " tasks that typically require human intelligence."}',
|
||||
},
|
||||
"streaming_without_references": {
|
||||
"summary": "Streaming mode without references (stream=true)",
|
||||
"description": "Multiple NDJSON lines when stream=True and include_references=False. Only response chunks are sent.",
|
||||
|
|
@ -546,11 +524,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
|
|||
            },
        },
    )
    async def query_text_stream(
        request: QueryRequest,
        tenant_context: Optional[TenantContext] = Depends(get_tenant_context_optional),
        tenant_rag: LightRAG = Depends(get_tenant_rag)
    ):
    async def query_text_stream(request: QueryRequest):
        """
        Advanced RAG query endpoint with flexible streaming response.

@@ -685,13 +659,37 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
            from fastapi.responses import StreamingResponse

            # Unified approach: always use aquery_llm for all cases
            result = await tenant_rag.aquery_llm(request.query, param=param)
            result = await rag.aquery_llm(request.query, param=param)

            async def stream_generator():
                # Extract references and LLM response from unified result
                references = result.get("data", {}).get("references", [])
                llm_response = result.get("llm_response", {})

                # Enrich references with chunk content if requested
                if request.include_references and request.include_chunk_content:
                    data = result.get("data", {})
                    chunks = data.get("chunks", [])
                    # Create a mapping from reference_id to chunk content
                    ref_id_to_content = {}
                    for chunk in chunks:
                        ref_id = chunk.get("reference_id", "")
                        content = chunk.get("content", "")
                        if ref_id and content:
                            # Collect chunk content
                            ref_id_to_content.setdefault(ref_id, []).append(content)

                    # Add content to references
                    enriched_references = []
                    for ref in references:
                        ref_copy = ref.copy()
                        ref_id = ref.get("reference_id", "")
                        if ref_id in ref_id_to_content:
                            # Keep content as a list of chunks (one file may have multiple chunks)
                            ref_copy["content"] = ref_id_to_content[ref_id]
                        enriched_references.append(ref_copy)
                    references = enriched_references

                if llm_response.get("is_streaming"):
                    # Streaming mode: send references first, then stream response chunks
                    if request.include_references:

@@ -730,6 +728,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
                },
            )
        except Exception as e:
            trace_exception(e)
            raise HTTPException(status_code=500, detail=str(e))

    @router.post(

@@ -1028,11 +1027,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
            },
        },
    )
    async def query_data(
        request: QueryRequest,
        tenant_context: Optional[TenantContext] = Depends(get_tenant_context_optional),
        tenant_rag: LightRAG = Depends(get_tenant_rag)
    ):
    async def query_data(request: QueryRequest):
        """
        Advanced data retrieval endpoint for structured RAG analysis.

@@ -1137,7 +1132,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
"""
|
||||
try:
|
||||
param = request.to_query_params(False) # No streaming for data endpoint
|
||||
response = await tenant_rag.aquery_data(request.query, param=param)
|
||||
response = await rag.aquery_data(request.query, param=param)
|
||||
|
||||
# aquery_data returns the new format with status, message, data, and metadata
|
||||
if isinstance(response, dict):
|
||||
|
|
@ -1150,6 +1145,7 @@ def create_query_routes(rag, api_key: Optional[str] = None, top_k: int = 60, rag
|
|||
                    data={},
                )
        except Exception as e:
            trace_exception(e)
            raise HTTPException(status_code=500, detail=str(e))

    return router

@@ -170,14 +170,26 @@ class RAGEvaluator:
            first_ref = references[0]
            logger.debug("🔍 First Reference Keys: %s", list(first_ref.keys()))
            if "content" in first_ref:
                logger.debug(
                    "🔍 Content Preview: %s...", first_ref["content"][:100]
                )
                content_preview = first_ref["content"]
                if isinstance(content_preview, list) and content_preview:
                    logger.debug(
                        "🔍 Content Preview (first chunk): %s...",
                        content_preview[0][:100],
                    )
                elif isinstance(content_preview, str):
                    logger.debug("🔍 Content Preview: %s...", content_preview[:100])

        # Extract chunk content from enriched references
        contexts = [
            ref.get("content", "") for ref in references if ref.get("content")
        ]
        # Note: content is now a list of chunks per reference (one file may have multiple chunks)
        contexts = []
        for ref in references:
            content = ref.get("content", [])
            if isinstance(content, list):
                # Flatten the list: each chunk becomes a separate context
                contexts.extend(content)
            elif isinstance(content, str):
                # Backward compatibility: if content is still a string (shouldn't happen)
                contexts.append(content)

        return {
            "answer": answer,