Project Guide for AI Agents
This Agments.md file provides operational guidance for AI assistants collaborating on the LightRAG codebase. Use it to understand the repository layout, preferred tooling, and expectations for adding or modifying functionality.
Core Purpose
LightRAG is an advanced Retrieval-Augmented Generation (RAG) framework designed to enhance information retrieval and generation through graph-based knowledge representation. The project aims to provide a more intelligent and efficient way to process and retrieve information from documents by leveraging both graph structures and vector embeddings.
Project Structure for Navigation
- `/lightrag`: Core Python package (ingestion, querying, storage abstractions, utilities). Key modules include `lightrag/lightrag.py` (orchestration), `operate.py` (pipeline helpers), `kg/` (storage backends), `llm/` (bindings), and `utils*.py`.
- `/lightrag/api`: FastAPI service for LightRAG, run under Gunicorn in production. The service, auth, and WebUI assets live in `lightrag_server.py`. Routers live in `routers/`, shared helpers in `utils_api.py`. Gunicorn startup logic lives in `run_with_gunicorn.py`.
- `/lightrag_webui`: React 19 + TypeScript + Tailwind front-end built with Vite/Bun. Uses component folders under `src/` and configuration via `env.*.sample`.
- `/inputs`, `/rag_storage`, `/dickens`, `/temp`: Data directories. Treat contents as mutable working data; avoid committing generated artefacts.
- `/tests` and root-level `test_*.py`: Integration and smoke-test scripts (graph storage, API endpoints, behaviour regressions). Many expect specific environment variables or services.
- `/docs`, `/k8s-deploy`, `docker-compose.yml`: Deployment notes, Kubernetes manifests, and container orchestration helpers.
- Configuration templates: `.env.example`, `config.ini.example`, `lightrag.service.example`. Copy and adapt for local runs without committing secrets.
Environment Setup and Tooling
- Python 3.10 is required. Recommended bootstrap:

  ```bash
  # Development installation
  python -m venv .venv
  source .venv/bin/activate
  pip install -e .
  pip install -e .[api]

  # Start API server
  lightrag-server

  # Production deployment
  lightrag-gunicorn --workers 3
  ```

- Duplicate `.env.example` to `.env` and adjust storage, LLM, and reranker bindings. Mirror `config.ini.example` when customising pipeline defaults.
- Storage backends (PostgreSQL, Redis, Neo4j, Milvus, etc.) are selected via `LIGHTRAG_*` environment variables. Ensure connection URLs and credentials are in place before running ingestion or tests.
- CLI entry points: `python -m lightrag` for package usage, `lightrag-server` (or `uvicorn lightrag.api.lightrag_server:app --reload`) for the API, `lightrag-gunicorn` for production gunicorn runs.
- Front-end work: install dependencies with `bun install` (preferred) or `npm install`, then use the `bunx --bun vite` commands defined in `package.json`.
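A preflight check can catch missing connection settings before ingestion or tests start. The following is a minimal sketch, not part of LightRAG: `missing_settings` is a hypothetical helper, and the variable names in the demo are placeholders, so take the real names from `.env.example`.

```python
from typing import Mapping, Sequence


def missing_settings(required: Sequence[str], env: Mapping[str, str]) -> list[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# Placeholder names and values for illustration only (assumptions, not real settings).
sample_env = {"LIGHTRAG_EXAMPLE_STORAGE": "ExampleStorage", "EXAMPLE_DB_URL": ""}
required = ["LIGHTRAG_EXAMPLE_STORAGE", "EXAMPLE_DB_URL"]

print(missing_settings(required, sample_env))  # ['EXAMPLE_DB_URL']
```

In a real run you would pass `os.environ` instead of `sample_env` and stop early when the returned list is not empty.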
Frontend Development
- Package Manager: ALWAYS USE BUN - Never use npm or yarn unless Bun is unavailable
- Commands:
  - `bun install` - Install dependencies
  - `bun run dev` - Start development server
  - `bun run build` - Build for production
  - `bun run lint` - Run linting
  - `bun test` - Run tests
  - `bun run preview` - Preview production build
- Pattern: All frontend operations must use Bun commands
- Testing: Use `bun test` for all frontend testing
Coding Conventions
- Embrace type hints, dataclasses, and asynchronous patterns already present in `lightrag/lightrag.py` and storage implementations. Keep long-running jobs within `asyncio` flows and reuse helpers from `lightrag.operate`.
- Honour abstraction boundaries: new storage providers should inherit from the relevant base classes in `lightrag.base`; reusable logic belongs in `utils.py`/`utils_graph.py`.
- Use `lightrag.utils.logger` (not bare `print`) and let environment toggles (`VERBOSE`, `LOG_LEVEL`) control verbosity.
- Respect configuration defaults in `lightrag/constants.py`, extending with care and synchronising related documentation when behaviour changes.
- API additions should live under `lightrag/api/routers`, leverage dependency injections from `utils_api.py`, and return structured responses consistent with existing handlers.
- Front-end code should remain in TypeScript, rely on functional React components with hooks, and follow Tailwind utility style. Co-locate component-specific styles; reserve custom CSS for cases Tailwind cannot cover.
- Storage Backends
  - Default: In-memory with file persistence
  - Production Options: PostgreSQL, MongoDB, Redis, Neo4j
  - Pattern: Abstract storage interface with multiple implementations
- Lock Key Generation Consistency
  - Critical Pattern: Always sort parameters for lock key generation to prevent deadlocks
  - Example: `sorted_key_parts = sorted([src, tgt])` before creating lock key
  - Why: Prevents different lock keys for same relationship pair processed in different orders
  - Apply to: Any function that uses locks with multiple parameters
- Priority Queue Implementation
  - Pattern: Use priority-based task queuing for LLM requests
  - Benefits: Critical operations get higher priority
  - Implementation: Lower priority values = higher priority
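The lock-key and priority-queue patterns can be sketched together. This is an illustrative example built on the standard library's `asyncio`, not LightRAG's actual implementation: `make_lock_key`, `run_by_priority`, the `|` separator, and the job names are all assumptions made for the demo.

```python
import asyncio
import itertools


def make_lock_key(src: str, tgt: str) -> str:
    """Build an order-independent lock key for a relationship pair."""
    sorted_key_parts = sorted([src, tgt])
    return "|".join(sorted_key_parts)  # the "|" separator is an arbitrary choice


async def run_by_priority(jobs: list[tuple[int, str]]) -> list[str]:
    """Drain jobs lowest priority value first; ties keep submission order."""
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    counter = itertools.count()  # tie-breaker so equal priorities stay FIFO
    for priority, name in jobs:
        await queue.put((priority, next(counter), name))
    order = []
    while not queue.empty():
        _, _, name = await queue.get()
        order.append(name)
    return order


async def main() -> None:
    locks: dict[str, asyncio.Lock] = {}
    # Both argument orders resolve to the same key, hence the same lock object.
    first = locks.setdefault(make_lock_key("Alice", "Bob"), asyncio.Lock())
    second = locks.setdefault(make_lock_key("Bob", "Alice"), asyncio.Lock())
    print(first is second)  # True

    # Hypothetical job names; a lower number means the job is served earlier.
    jobs = [(5, "bulk-summary"), (1, "user-query"), (5, "bulk-embed")]
    print(await run_by_priority(jobs))
    # ['user-query', 'bulk-summary', 'bulk-embed']


asyncio.run(main())
```

Without the sort, `("Alice", "Bob")` and `("Bob", "Alice")` would map to two different locks, so two tasks could modify the same relationship at once. The counter in the queue entries is a common way to keep equal-priority jobs in arrival order and to avoid comparing payloads directly.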
Testing and Quality Gates
- Run Python tests with `python -m pytest tests` for the FastAPI suite, and execute targeted scripts (for example `python tests/test_graph_storage.py`, `python test_lock_fix.py`) when touching related functionality. Many scripts require running backing services; check `.env` for prerequisites.
- Perform linting via `ruff check .` (configured in `pyproject.toml`) and address warnings. For formatting, match the existing style rather than introducing new tools.
- Front-end validation: `bun test`, `bunx --bun vite build`, and `bun run lint`. The `*-no-bun` scripts exist if Bun is unavailable.
- When touching deployment assets, ensure `docker-compose config` or relevant `kubectl` dry-runs succeed before submitting changes.
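Because many test scripts depend on running services, one option is to skip a test cleanly when its prerequisite is not configured. The sketch below uses the standard library's `unittest`, whose test cases `python -m pytest` also collects. The variable name `EXAMPLE_GRAPH_DB_URL` and the test itself are hypothetical and not taken from the repository.

```python
import os
import unittest

# Hypothetical variable name, used only for illustration.
REQUIRED_VAR = "EXAMPLE_GRAPH_DB_URL"


@unittest.skipUnless(os.getenv(REQUIRED_VAR), f"{REQUIRED_VAR} is not set")
class GraphStorageSmokeTest(unittest.TestCase):
    """Runs only when the backing service has been configured."""

    def test_connection_url_has_scheme(self) -> None:
        url = os.environ[REQUIRED_VAR]
        # A connection URL is expected to look like scheme://host...
        self.assertIn("://", url)
```

A skipped test appears as such in the report, which makes the missing prerequisite visible without turning it into a failure.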
Runtime and Operational Notes
- Knowledge ingestion expects documents inside `inputs/` and writes intermediate state to `rag_storage/`. Keep these directories gitignored; never check in private data or large artefacts.
- Use `operate.py` helpers (e.g., `chunking_by_token_size`, `extract_entities`) to keep ingestion behaviour consistent. If extending the pipeline, document new steps in `docs/` and update any affected CLI usage.
- The API and core package rely on `.env`/`config.ini` being co-located with the current working directory. Scripts such as `tests/test_graph_storage.py` dynamically read these files; ensure they are in sync.
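Since configuration is resolved relative to the current working directory, a quick check before launching a script can save a confusing failure later. This is a minimal sketch: `missing_config_files` is a hypothetical helper, not something LightRAG provides.

```python
from pathlib import Path


def missing_config_files(
    workdir: Path, names: tuple[str, ...] = (".env", "config.ini")
) -> list[str]:
    """Return the config file names that are absent from `workdir`."""
    return [name for name in names if not (workdir / name).is_file()]


missing = missing_config_files(Path.cwd())
if missing:
    print(f"Not found in {Path.cwd()}: {', '.join(missing)}")
```

If the check reports a missing file, either change into the directory that holds it or copy the matching `*.example` template there first.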
Contribution Checklist
- Run `pre-commit run --all-files` before submitting a PR.
- Describe the change, affected modules, and operational impact in your PR. Mention any new environment knobs or storage dependencies.
- Link related issues or discussions when available.
- Confirm all applicable checks pass (`ruff`, pytest suite, targeted integration scripts, front-end build/tests when touched).
- Capture screenshots or GIFs for front-end or API changes that affect user-visible behaviour.
- Keep each PR focused on a single concern and update documentation (`README.md`, `docs/`, `.env.example`) when behaviour or configuration changes.
Follow this playbook to keep LightRAG contributions predictable, testable, and production-ready.