LightRAG/reverse_documentation/08-apache-age-analysis.md
Raphael MANSUY fe9b8ec02a
2025-12-04 16:04:21 +08:00


# Apache AGE: Technical Analysis & LightRAG Implementation Decision
## Executive Summary
Apache AGE (A Graph Extension) is a PostgreSQL extension that adds graph database capabilities to PostgreSQL. In the LightRAG multi-tenant Docker deployment, AGE support was disabled because installing it in the containerized environment proved complex; graceful error handling was added so that its absence does not cause startup failures.
## What is Apache AGE?
### Overview
Apache AGE is an extension for PostgreSQL that enables property graph database functionality using the **Cypher query language** (the openCypher specification of the language that originated with Neo4j). It allows PostgreSQL to function as a hybrid relational-graph database.
**Official References:**
- [Apache AGE GitHub Repository](https://github.com/apache/incubator-age)
- [Apache AGE Documentation](https://age.apache.org/)
- [Cypher Query Language Spec](https://s3.amazonaws.com/artifacts.opencypher.org/openCypher9.pdf)
### Key Characteristics
| Aspect | Details |
|--------|---------|
| **Language** | Cypher (borrowed from Neo4j) |
| **Model** | Property Graph (nodes, edges, labels, properties) |
| **Query Syntax** | `SELECT * FROM cypher('graph_name', $$ ...cypher_query... $$) AS (result agtype)` |
| **Storage** | Native PostgreSQL tables with AGE schema |
| **License** | Apache 2.0 |
| **Maturity** | Active development (Apache top-level project since 2022; formerly incubating) |
### Core Functions
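```sql
-- One-time setup per database
CREATE EXTENSION IF NOT EXISTS age;
-- Needed in each session unless AGE is preloaded by the server configuration
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
-- Create graph
SELECT create_graph('graph_name');
-- Execute Cypher queries
SELECT * FROM cypher('graph_name', $$
    MATCH (n:Label) WHERE n.property = 'value' RETURN n
$$) AS (node agtype);
-- Drop graph (true = cascade to the graph's labels and data)
SELECT drop_graph('graph_name', true);
```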
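As a minimal sketch of how an application might assemble the wrapper query shown above, the helper below builds the `cypher()` call text from a graph name, a Cypher string, and the result column names. The function name `build_cypher_sql` and its validation rules are assumptions for illustration; they are not part of AGE or LightRAG.

```python
import re

# Hypothetical helper (not part of AGE or LightRAG): wraps a Cypher query in
# the SELECT ... FROM cypher(...) AS (...) form used in the SQL example.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_cypher_sql(graph_name: str, cypher: str, columns: list[str]) -> str:
    """Return SQL text that runs `cypher` against `graph_name`."""
    if not _IDENT.match(graph_name):
        raise ValueError(f"invalid graph name: {graph_name!r}")
    if not columns or not all(_IDENT.match(c) for c in columns):
        raise ValueError("each result column needs a plain identifier name")
    if "$$" in cypher:
        # The query body is dollar-quoted, so it must not contain the delimiter.
        raise ValueError("Cypher text must not contain '$$'")
    # Every column returned through cypher() is declared with the agtype type.
    column_defs = ", ".join(f"{c} agtype" for c in columns)
    return f"SELECT * FROM cypher('{graph_name}', $$ {cypher} $$) AS ({column_defs})"


sql = build_cypher_sql(
    "chunk_entity_relation",
    "MATCH (n:Person) RETURN n",
    ["node"],
)
print(sql)
```

The column list matters because AGE requires the caller to declare one `agtype` column per value in the Cypher `RETURN` clause; the sketch validates names more strictly than AGE itself does, to keep the interpolation safe.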
## AGE in LightRAG Context
### Usage Pattern
LightRAG uses AGE as its **graph storage backend** (`PGGraphStorage` class in `lightrag/kg/postgres_impl.py`):
1. **Entity-Relation Graph Storage**: Stores knowledge graph entities (nodes) and relationships (edges)
2. **Graph Name**: `chunk_entity_relation` - primary graph for semantic relationships
3. **Node Structure**: Entities with labels (Person, Organization, Location, etc.)
4. **Edge Types**: Semantic relationships between entities
5. **Query Operations**:
   - Entity discovery (finding all entities of a type)
   - Relationship traversal (finding connected entities)
   - Pattern matching (complex graph queries)
### Integration Points
```python
# From postgres_impl.py line 227
await connection.execute(f"select create_graph('{graph_name}')")
# Entity insertion example
# Nodes stored as property graph vertices
# Relations stored as property graph edges
# Cypher queries enable efficient graph traversals
```
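The `create_graph` call above runs during setup, and the graph name is interpolated straight into the SQL text. A minimal sketch of a more explicit variant follows, assuming AGE's `ag_catalog.ag_graph` catalog table and a connection object exposing asyncpg-style `fetchval` and `execute` coroutines. The helper `ensure_graph` and the `FakeConnection` stand-in are hypothetical and exist only to make the control flow runnable without a database.

```python
import asyncio
import re

_GRAPH_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


async def ensure_graph(conn, graph_name: str) -> bool:
    """Create the AGE graph if it is missing; return True when it was created.

    `conn` is assumed to expose asyncpg-style `fetchval` and `execute`
    coroutines. This is a hypothetical helper, not LightRAG code.
    """
    if not _GRAPH_NAME.match(graph_name):
        # The name is interpolated into SQL below, so restrict it first.
        raise ValueError(f"invalid graph name: {graph_name!r}")
    exists = await conn.fetchval(
        "SELECT 1 FROM ag_catalog.ag_graph WHERE name = $1", graph_name
    )
    if exists:
        return False
    await conn.execute(f"SELECT ag_catalog.create_graph('{graph_name}')")
    return True


class FakeConnection:
    """In-memory stand-in used only to exercise the control flow."""

    def __init__(self):
        self.graphs = set()

    async def fetchval(self, query, name):
        return 1 if name in self.graphs else None

    async def execute(self, query):
        # Pull the graph name out of create_graph('<name>').
        self.graphs.add(query.split("'")[1])


async def demo():
    conn = FakeConnection()
    first = await ensure_graph(conn, "chunk_entity_relation")
    second = await ensure_graph(conn, "chunk_entity_relation")
    return first, second


print(asyncio.run(demo()))
```

Checking the catalog first means a second startup does not depend on catching an error from `create_graph`, although a race between two concurrent starters would still need exception handling.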
### Data Flow
```
Document Input
    ↓
Entity Extraction (LLM)
    ↓
AGE Graph Storage
    ├─ Nodes: Extracted entities
    ├─ Edges: Entity relationships
    └─ Labels: Entity types
    ↓
Graph Queries (Cypher)
    ↓
RAG Results (enhanced with graph context)
```
## AGE vs pgVector: Complementary Technologies
### Comparison Table
| Aspect | pgVector | Apache AGE |
|--------|----------|-----------|
| **Purpose** | Vector similarity search | Graph relationships |
| **Data Structure** | Embeddings (float arrays) | Property graphs (nodes/edges) |
| **Query Type** | Similarity/semantic search | Pattern matching/traversal |
| **Algorithm** | HNSW, IVFFlat indices | Graph algorithms |
| **Use Case** | "Find semantically similar content" | "Find connected entities" |
| **LightRAG Role** | Vector retrieval over chunk embeddings | Knowledge graph structure |
### Synergistic Usage in LightRAG
```
LightRAG Hybrid Approach:
├─ pgVector: "What documents are semantically similar?"
│   └─ Chunk-level similarity search
├─ AGE Graph: "How are extracted entities related?"
│   └─ Entity relationship mapping
└─ Combined: "Get semantically similar content + its entity context"
```
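The combined step can be illustrated with a toy, in-memory sketch: rank chunks by cosine similarity (the pgVector role), then attach the relationships that touch the entities found in the top chunks (the AGE role). The data, the `hybrid_retrieve` function, and the tuple layout are invented for illustration and do not reflect LightRAG's actual retrieval code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy data, invented for illustration.
CHUNKS = {
    "c1": {"embedding": [1.0, 0.0], "entities": ["Alice"]},
    "c2": {"embedding": [0.0, 1.0], "entities": ["Bob"]},
}
RELATIONS = [("Alice", "WORKS_AT", "Acme"), ("Bob", "LIVES_IN", "Paris")]


def hybrid_retrieve(query_embedding, top_k=1):
    """Rank chunks by similarity, then attach relations touching their entities."""
    ranked = sorted(
        CHUNKS.items(),
        key=lambda item: cosine(query_embedding, item[1]["embedding"]),
        reverse=True,
    )[:top_k]
    results = []
    for chunk_id, chunk in ranked:
        context = [
            rel
            for rel in RELATIONS
            if rel[0] in chunk["entities"] or rel[2] in chunk["entities"]
        ]
        results.append({"chunk": chunk_id, "graph_context": context})
    return results


print(hybrid_retrieve([0.9, 0.1]))
```

In a real deployment the ranking would be a pgvector index query and the context lookup a Cypher traversal; the sketch only shows how the two results are joined on entity names.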
## Decision: Disabling AGE in Docker Deployment
### Problem Analysis
**Installation Complexity:**
- The `pgvector/pgvector:pg15` base image does not ship AGE, so AGE has to be compiled from source against the image's PostgreSQL
- Compilation needs the PostgreSQL server development headers (`postgres.h`)
- The pre-built `pgvector/pgvector:pg15` image lacks the AGE compilation toolchain
- Building a custom image with both pgvector and AGE adds 200MB+ and significant build time
**Docker Build Attempts:**
1. **Attempt 1**: Used `pgvector/pgvector:pg15-bookworm`
   - Error: pgvector extension not found
2. **Attempt 2**: Built custom image with AGE compilation
   ```dockerfile
   RUN git clone https://github.com/apache/incubator-age.git /tmp/age
   RUN make -C /tmp/age PG_CONFIG=/usr/lib/postgresql/15/bin/pg_config
   ```
   - Error: `postgres.h` header files not available in slim base image
   - Resolution: Requires the full PostgreSQL server development package (substantial image bloat)
### Solution Implemented
**Graceful Degradation Strategy:**
```python
# File: lightrag/kg/postgres_impl.py, line 233 (abridged excerpt)
try:
    await connection.execute(f"select create_graph('{graph_name}')")
except (
    asyncpg.exceptions.UndefinedFunctionError,  # AGE not available
    asyncpg.exceptions.InvalidSchemaNameError,
    asyncpg.exceptions.UniqueViolationError,
):
    pass  # Silently continue without AGE
```
**Changes Made:**
1. Added `UndefinedFunctionError` exception handling in `configure_age()` method
2. Added exception catching in `execute()` method for AGE-specific SQL
3. System continues startup without graph functionality rather than failing
**Why This Approach:**
- ✅ Minimal image size (no custom PostgreSQL build)
- ✅ Fast deployment (no AGE compilation)
- ✅ Graceful degradation (app doesn't crash)
- ✅ Easy to enable later (install the AGE extension; the exception handling stays in place)
- ✅ Development/demo-friendly
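Because the `pass` branch hides the fact that graph storage is unavailable, a deployment may want to surface that state once at startup. The following is a minimal sketch, assuming a connection object with an asyncpg-style `fetchval` coroutine; `age_installed`, the logger name, and `FakeConnection` are hypothetical and not part of LightRAG.

```python
import asyncio
import logging

logger = logging.getLogger("age_probe")  # hypothetical logger name


async def age_installed(conn) -> bool:
    """Return True if the `age` extension is registered in this database.

    `conn` is assumed to offer an asyncpg-style `fetchval` coroutine.
    """
    found = await conn.fetchval("SELECT 1 FROM pg_extension WHERE extname = 'age'")
    if not found:
        # Make the degraded mode visible instead of failing silently.
        logger.warning("Apache AGE not installed; graph features are disabled")
    return bool(found)


class FakeConnection:
    """Stand-in that answers the probe from a set of extension names."""

    def __init__(self, extensions):
        self.extensions = extensions

    async def fetchval(self, query):
        return 1 if "age" in self.extensions else None


without_age = asyncio.run(age_installed(FakeConnection({"vector"})))
with_age = asyncio.run(age_installed(FakeConnection({"vector", "age"})))
print(without_age, with_age)
```

The boolean can be stored as a capability flag so that graph-dependent endpoints return a clear "graph storage unavailable" response rather than empty results.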
## Consequences of AGE Disablement
### Functional Impact
| Feature | Status | Mitigation |
|---------|--------|-----------|
| **Entity relationship queries** | ❌ Unavailable | Use vector similarity + metadata |
| **Graph traversal** | ❌ Disabled | LLM-based relationship inference |
| **Pattern matching** | ❌ Not supported | SQL queries on relationship tables |
| **Knowledge graph visualization** | ⚠️ Degraded | Show only extracted entities, no topology |
| **Complex relationship analysis** | ❌ Limited | Single-hop queries only |
### Performance Implications
**Without AGE:**
- Entity extraction still works (stored in SQL tables)
- Relationship metadata persisted (as JSONB in document status)
- Graph visualization shows entities but not relationships
- Pattern-based queries require application-level logic
**With AGE (if re-enabled):**
- Efficient multi-hop traversals
- Native Cypher query optimization
- Complex pattern matching
- Better knowledge graph visualization
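To make "application-level logic" concrete, the sketch below performs a bounded multi-hop traversal over relationship rows with a breadth-first search, which is roughly what a variable-length Cypher pattern would do natively. The row layout `(source, relation, target)` and the function `neighbors_within` are assumptions for illustration, not LightRAG's schema or API.

```python
from collections import deque

# Relationship rows as they might come back from a plain SQL table;
# the (source, relation, target) layout is an assumption.
EDGES = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "LOCATED_IN", "Paris"),
    ("Paris", "CAPITAL_OF", "France"),
]


def neighbors_within(start, max_hops, edges=EDGES):
    """Breadth-first search returning {entity: hop_count} up to max_hops away."""
    adjacency = {}
    for source, _relation, target in edges:
        # Treat edges as undirected for traversal purposes.
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # report only the neighbors, not the start entity
    return seen


print(neighbors_within("Alice", 2))
```

This works for small graphs held in memory, but every hop beyond what is already loaded costs another round trip to the relationship tables, which is the overhead that native graph traversal avoids.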
### Recovery Path
To re-enable AGE in existing deployment:
```bash
# 1. Build and install AGE inside the running PostgreSQL container
#    (select the AGE branch or release tag that matches PostgreSQL 15 before building)
docker exec -u root lightrag-postgres bash -c '
  apt-get update &&
  apt-get install -y postgresql-server-dev-15 build-essential git bison flex &&
  cd /tmp && git clone https://github.com/apache/incubator-age.git &&
  cd incubator-age && make && make install
'
# 2. Create extension in database
docker exec lightrag-postgres psql -U lightrag -d lightrag_multitenant \
  -c "CREATE EXTENSION age;"
# 3. Update init-postgres.sql to include:
#      CREATE EXTENSION IF NOT EXISTS "age";
# 4. Restart API container (exception handling already in place)
docker restart lightrag-api
```
## Architectural Implications
### Current Architecture (AGE Disabled)
```
PostgreSQL
├─ PGKVStorage: Key-value metadata
├─ PGVectorStorage: pgVector embeddings ✅ ACTIVE
├─ PGGraphStorage: Entity relationships (SQL fallback)
└─ PGDocStatusStorage: Document processing status
```
### Alternative Architectures
**Option 1: Neo4j Integration** (graph-focused)
```
PostgreSQL          Neo4j
├─ pgvector         ├─ Full graph DB
└─ Metadata         └─ Cypher queries
```
**Option 2: Memgraph Integration** (lightweight graph)
```
PostgreSQL          Memgraph
├─ pgvector         ├─ Memory-optimized
└─ Metadata         └─ Graph queries
```
**Option 3: AGE Re-enabled** (current stack, with AGE restored in the future)
```
PostgreSQL (All-in-one)
├─ pgvector: embeddings ✅
├─ AGE: graph DB ⏳
└─ Metadata: standard tables ✅
```
## Technical References
### PostgreSQL Graph Extensions Landscape
| Extension | Focus | Maturity | License |
|-----------|-------|----------|---------|
| **AGE** | Cypher graphs | Apache top-level project | Apache 2.0 |
| **PostGIS** | Spatial data | Stable | GPLv2 |
| **pggraph** | General graphs | Archived | MIT |
| **GraphQL** | API layer | Stable | Apache 2.0 |
### Related Documentation
- [PostgreSQL Extension Development](https://www.postgresql.org/docs/15/extend.html)
- [pgVector Documentation](https://github.com/pgvector/pgvector)
- [GQL Graph Query Language Standard (ISO/IEC 39075)](https://www.iso.org/standard/76120.html)
- [OpenCypher Language Reference](https://s3.amazonaws.com/artifacts.opencypher.org/openCypher9.pdf)
## Recommendations
### For Development/Testing
1. **Keep AGE disabled** - faster iteration, smaller images
2. **Use vector-based retrieval** - sufficient for most use cases
3. **Add Neo4j as optional sidecar** - if graph analysis needed
### For Production Deployment
1. **Evaluate AGE vs Neo4j** based on:
- Query complexity requirements
- Scale (nodes/edges count)
- Response time constraints
- Infrastructure overhead tolerance
2. **If AGE needed:**
- Build custom PostgreSQL image with AGE pre-installed
- Use multi-stage builds to minimize final image size
- Cache built layers in registry
3. **If AGE not needed:**
- Current architecture is optimal
- Implement relationship queries in application layer
- Use pgVector for semantic retrieval exclusively
## Summary
AGE provides powerful graph query capabilities but introduces deployment complexity in containerized environments. The decision to disable AGE in LightRAG's Docker deployment prioritizes **simplicity and startup speed** while maintaining **graceful error handling** for future re-enablement. The current architecture relies on pgVector for semantic retrieval and PostgreSQL for entity metadata, which covers the majority of RAG use cases without requiring a dedicated graph database.
---
**Last Updated:** November 20, 2025
**Status:** Implemented & Tested
**Related Files:**
- `lightrag/kg/postgres_impl.py` (exception handling)
- `starter/docker-compose.yml` (deployment config)
- `starter/init-postgres.sql` (schema initialization)