Remove auto-scan-at-startup feature and related documentation

• Remove --auto-scan-at-startup arg
• Delete auto scan docs sections
• Remove startup scanning logic
2025-09-23 16:24:53 +08:00
commit 6b953fa53d
parent fc15e9f142
4 changed files with 0 additions and 54 deletions
--- a/lightrag/api/README-zh.md
+++ b/lightrag/api/README-zh.md
@@ -140,18 +140,6 @@ docker compose up
 ```
 > 可以通过以下链接获取官方的docker compose文件：[docker-compose.yml]( https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml) 。如需获取LightRAG的历史版本镜像，可以访问以下链接: [LightRAG Docker Images]( https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)

-### 启动时自动扫描
-
-当使用 `--auto-scan-at-startup` 参数启动LightRAG Server时，系统将自动：
-
-1. 扫描输入目录中的新文件
-2. 为尚未在数据库中的新文档建立索引
-3. 使所有内容立即可用于 RAG 查询
-
-这种工作模式给启动一个临时的RAG任务提供给了方便。
-
-> `--input-dir` 参数指定要扫描的输入目录。您可以从 webui 触发输入目录扫描。
-
 ### 启动多个LightRAG实例

 有两种方式可以启动多个LightRAG实例。第一种方式是为每个实例配置一个完全独立的工作环境。此时需要为每个实例创建一个独立的工作目录，然后在这个工作目录上放置一个当前实例专用的`.env`配置文件。不同实例的配置文件中的服务器监听端口不能重复，然后在工作目录上执行 lightrag-server 启动服务即可。
@@ -432,7 +420,6 @@ LIGHTRAG_DOC_STATUS_STORAGE=PGDocStatusStorage
 | --ssl-keyfile | None | SSL 私钥文件路径（如果启用 --ssl 则必需） |
 | --llm-binding | ollama | LLM 绑定类型（lollms、ollama、openai、openai-ollama、azure_openai、aws_bedrock） |
 | --embedding-binding | ollama | 嵌入绑定类型（lollms、ollama、openai、azure_openai、aws_bedrock） |
-| auto-scan-at-startup | - | 扫描输入目录中的新文件并开始索引 |

 ### Reranking 配置

--- a/lightrag/api/README.md
+++ b/lightrag/api/README.md
@@ -143,18 +143,6 @@ docker compose up

 > You can get the official docker compose file from here: [docker-compose.yml](https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/main/docker-compose.yml). For historical versions of LightRAG docker images, visit this link: [LightRAG Docker Images](https://github.com/HKUDS/LightRAG/pkgs/container/lightrag)

-### Auto scan on startup
-
-When starting the LightRAG Server with the `--auto-scan-at-startup` parameter, the system will automatically:
-
-1. Scan for new files in the input directory
-2. Index new documents that aren't already in the database
-3. Make all content immediately available for RAG queries
-
-This offers an efficient method for deploying ad-hoc RAG processes.
-
-> The `--input-dir` parameter specifies the input directory to scan. You can trigger the input directory scan from the Web UI.
-
 ### Starting Multiple LightRAG Instances

 There are two ways to start multiple LightRAG instances. The first way is to configure a completely independent working environment for each instance. This requires creating a separate working directory for each instance and placing a dedicated `.env` configuration file in that directory. The server listening ports in the configuration files of different instances cannot be the same. Then, you can start the service by running `lightrag-server` in the working directory.
@@ -434,7 +422,6 @@ You cannot change storage implementation selection after adding documents to Lig
 | --ssl-keyfile         | None          | Path to SSL private key file (required if --ssl is enabled)                                                                     |
 | --llm-binding         | ollama        | LLM binding type (lollms, ollama, openai, openai-ollama, azure_openai, aws_bedrock)                                                          |
 | --embedding-binding   | ollama        | Embedding binding type (lollms, ollama, openai, azure_openai, aws_bedrock)                                                                   |
-| --auto-scan-at-startup| -             | Scan input directory for new files and start indexing                                                                           |

 ### Reranking Configuration

--- a/lightrag/api/config.py
+++ b/lightrag/api/config.py
@@ -206,13 +206,6 @@ def parse_args() -> argparse.Namespace:
        help="Default workspace for all storage",
    )

-    parser.add_argument(
-        "--auto-scan-at-startup",
-        action="store_true",
-        default=False,
-        help="Enable automatic scanning when the program starts",
-    )
-
    # Server workers configuration
    parser.add_argument(
        "--workers",
--- a/lightrag/api/lightrag_server.py
+++ b/lightrag/api/lightrag_server.py
@@ -3,7 +3,6 @@ LightRAG FastAPI Server
 """

 from fastapi import FastAPI, Depends, HTTPException
-import asyncio
 import os
 import logging
 import logging.config
@@ -45,7 +44,6 @@ from lightrag.constants import (
 from lightrag.api.routers.document_routes import (
    DocumentManager,
    create_document_routes,
-    run_scanning_process,
 )
 from lightrag.api.routers.query_routes import create_query_routes
 from lightrag.api.routers.graph_routes import create_graph_routes
@@ -54,7 +52,6 @@ from lightrag.api.routers.ollama_api import OllamaAPI
 from lightrag.utils import logger, set_verbose_debug
 from lightrag.kg.shared_storage import (
    get_namespace_data,
-    get_pipeline_status_lock,
    initialize_pipeline_status,
    cleanup_keyed_lock,
    finalize_share_data,
@@ -212,24 +209,6 @@ def create_app(args):
            # Data migration regardless of storage implementation
            await rag.check_and_migrate_data()

-            pipeline_status = await get_namespace_data("pipeline_status")
-
-            should_start_autoscan = False
-            async with get_pipeline_status_lock():
-                # Auto scan documents if enabled
-                if args.auto_scan_at_startup:
-                    if not pipeline_status.get("autoscanned", False):
-                        pipeline_status["autoscanned"] = True
-                        should_start_autoscan = True
-
-            # Only run auto scan when no other process started it first
-            if should_start_autoscan:
-                # Create background task
-                task = asyncio.create_task(run_scanning_process(rag, doc_manager))
-                app.state.background_tasks.add(task)
-                task.add_done_callback(app.state.background_tasks.discard)
-                logger.info(f"Process {os.getpid()} auto scan task started at startup.")
-
            ASCIIColors.green("\nServer is ready to accept connections! 🚀\n")

            yield