Merge remote-tracking branch 'upstream/main'
Commit bb4d8181d5
7 changed files with 378 additions and 4 deletions

.clinerules/01-basic.md (new file, 207 lines)
@@ -0,0 +1,207 @@

# LightRAG Project Intelligence (.clinerules)

## Project Overview

LightRAG is a mature, production-ready Retrieval-Augmented Generation (RAG) system with comprehensive knowledge graph capabilities. It has grown from an experimental prototype into a stable system with extensive functionality across all major components.

## Current System State (August 15, 2025)

- **Status**: Production Ready - Stable and Mature
- **Configuration**: Gemini 2.5 Flash + BAAI/bge-m3 embeddings via custom endpoints
- **Storage**: Default in-memory with file persistence (JsonKVStorage, NetworkXStorage, NanoVectorDBStorage)
- **Language**: Chinese for summaries
- **Workspace**: `space1` for data isolation
- **Authentication**: JWT-based with admin/user accounts

## Critical Implementation Patterns

### 1. Embedding Format Compatibility (CRITICAL)

**Pattern**: Always handle both base64 and raw array embedding formats
**Location**: `lightrag/llm/openai.py` - `openai_embed` function
**Issue**: Custom OpenAI-compatible endpoints return embeddings as raw arrays, not base64 strings
**Solution**:

```python
np.array(dp.embedding, dtype=np.float32) if isinstance(dp.embedding, list)
else np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32)
```

**Impact**: Document processing fails completely without this dual-format support

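For context, a minimal sketch of how this conditional is applied to each returned data point when building the result array (the helper name is illustrative; in LightRAG the logic sits inline in `openai_embed`):

```python
import base64

import numpy as np


def decode_embeddings(data) -> np.ndarray:
    """Decode one batch of embedding data points into a 2-D float32 array."""
    return np.array(
        [
            # Custom OpenAI-compatible endpoints: raw list of floats
            np.array(dp.embedding, dtype=np.float32)
            if isinstance(dp.embedding, list)
            # Official endpoint with encoding_format="base64": packed float32 bytes
            else np.frombuffer(base64.b64decode(dp.embedding), dtype=np.float32)
            for dp in data
        ]
    )
```
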
### 2. Async Pattern Consistency (CRITICAL)

**Pattern**: Always await coroutines before calling methods on the result
**Common Error**: `coroutine.method()` instead of `(await coroutine).method()`
**Locations**: MongoDB implementations, Neo4j operations
**Example**: `await self._data.list_indexes()` then `await cursor.to_list()`

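A short illustration of the pattern, following the MongoDB example above (class name and exact driver method signatures are assumptions):

```python
class MongoIndexReader:                           # illustrative class name
    def __init__(self, collection):
        self._data = collection                   # async MongoDB collection

    async def index_names(self) -> list[str]:
        # Wrong: self._data.list_indexes().to_list() would call .to_list()
        # on the coroutine object itself and raise AttributeError.
        cursor = await self._data.list_indexes()  # await the coroutine first
        indexes = await cursor.to_list()          # then use the resulting cursor
        return [idx["name"] for idx in indexes]
```
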
### 3. Storage Layer Data Compatibility (CRITICAL)

**Pattern**: Always filter deprecated/incompatible fields during deserialization
**Common Fields to Remove**: `content`, `_id` (MongoDB), database-specific fields
**Implementation**: `data.pop('field_name', None)` before creating dataclass objects
**Locations**: All storage implementations (JSON, Redis, MongoDB, PostgreSQL)

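A minimal sketch of the filtering step (the dataclass here is illustrative, not the actual LightRAG model):

```python
from dataclasses import dataclass


@dataclass
class TextChunk:          # illustrative model
    chunk_id: str
    tokens: int


def chunk_from_record(record: dict) -> TextChunk:
    data = dict(record)
    # Drop fields that older records or specific backends still carry
    # but the current dataclass no longer accepts.
    data.pop("content", None)
    data.pop("_id", None)  # MongoDB adds this automatically
    return TextChunk(**data)
```
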
### 4. Lock Key Generation (CRITICAL)

**Pattern**: Always sort relationship pairs for consistent lock keys
**Implementation**: `sorted_key_parts = sorted([src, tgt])` then `f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"`
**Impact**: Prevents deadlocks in concurrent relationship processing

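A minimal sketch of the pattern (the function name is illustrative):

```python
def relationship_lock_key(src: str, tgt: str) -> str:
    """Build a direction-independent lock key for an entity pair."""
    sorted_key_parts = sorted([src, tgt])
    # (A, B) and (B, A) now map to the same key, so two workers can
    # never acquire the pair's locks in opposite orders.
    return f"{sorted_key_parts[0]}-{sorted_key_parts[1]}"
```
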
### 5. Event Loop Management (CRITICAL)

**Pattern**: Handle event loop mismatches during shutdown gracefully
**Implementation**: Timeout plus specific RuntimeError handling for "attached to a different loop"
**Location**: Neo4j storage finalization
**Impact**: Prevents application shutdown failures

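A minimal sketch of the guard, assuming an async driver with a `close()` coroutine (the function name and timeout are illustrative):

```python
import asyncio


async def finalize_storage(driver, timeout: float = 5.0) -> None:
    """Illustrative shutdown guard for an async driver (e.g. Neo4j)."""
    try:
        await asyncio.wait_for(driver.close(), timeout=timeout)
    except asyncio.TimeoutError:
        pass  # prefer an unclean close over hanging the whole shutdown
    except RuntimeError as exc:
        # Swallow only the loop-mismatch error raised when finalization
        # runs on a different event loop than the one the driver used.
        if "attached to a different loop" not in str(exc):
            raise
```
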
## Architecture Patterns

### 1. Dependency Injection

**Pattern**: Pass configuration through object constructors, not direct imports
**Example**: OllamaAPI receives configuration through the LightRAG object
**Benefit**: Better testability and modularity

### 2. Memory Bank Documentation

**Pattern**: Maintain a comprehensive memory bank for development continuity
**Structure**: Core files (projectbrief.md, activeContext.md, progress.md, etc.)
**Purpose**: Essential for context preservation across development sessions

### 3. Configuration Management

**Pattern**: Centralize defaults in constants.py, use environment variables for runtime config
**Implementation**: Default values in constants, override via .env file (see the sketch below)
**Benefit**: Consistent configuration across components

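A minimal sketch of the pattern; the constant and variable names below are illustrative, not the actual LightRAG settings:

```python
import os

# constants.py - centralised defaults (illustrative names)
DEFAULT_SUMMARY_LANGUAGE = "English"
DEFAULT_TOP_K = 60

# elsewhere - a value from .env / the environment overrides the default
summary_language = os.environ.get("SUMMARY_LANGUAGE", DEFAULT_SUMMARY_LANGUAGE)
top_k = int(os.environ.get("TOP_K", DEFAULT_TOP_K))
```
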
## Development Workflow Patterns

### 1. Frontend Development (CRITICAL)

**Package Manager**: **ALWAYS USE BUN** - Never use npm or yarn unless Bun is unavailable
**Commands**:
- `bun install` - Install dependencies
- `bun run dev` - Start development server
- `bun run build` - Build for production
- `bun run lint` - Run linting
- `bun test` - Run tests
- `bun run preview` - Preview production build

**Pattern**: All frontend operations must use Bun commands
**Fallback**: Only use npm/yarn if Bun installation fails
**Testing**: Use `bun test` for all frontend testing

### 2. Bug Fix Approach

1. **Identify root cause** - Don't just fix symptoms
2. **Implement robust solution** - Handle edge cases and format variations
3. **Maintain backward compatibility** - Preserve existing functionality
4. **Add comprehensive error handling** - Graceful degradation
5. **Document the fix** - Update memory bank with technical details

### 3. Feature Implementation

1. **Follow existing patterns** - Maintain architectural consistency
2. **Use dependency injection** - Avoid direct imports between modules
3. **Implement comprehensive error handling** - Handle all failure modes
4. **Add proper logging** - Debug and warning messages
5. **Update documentation** - Memory bank and code comments
6. **Comment Language** - Use English for comments and documentation

### 4. Performance Optimization

1. **Profile before optimizing** - Identify actual bottlenecks
2. **Maintain algorithmic correctness** - Don't sacrifice functionality for speed
3. **Use appropriate data structures** - Match structure to access patterns
4. **Implement caching strategically** - Cache expensive operations
5. **Monitor memory usage** - Prevent memory leaks

## Technology Stack Intelligence

### 1. LLM Integration

- **Primary**: Gemini 2.5 Flash via custom endpoint
- **Embedding**: BAAI/bge-m3 via custom endpoint
- **Reranking**: BAAI/bge-reranker-v2-m3
- **Pattern**: Always handle multiple provider formats

### 2. Storage Backends

- **Default**: In-memory with file persistence
- **Production Options**: PostgreSQL, MongoDB, Redis, Neo4j
- **Pattern**: Abstract storage interface with multiple implementations

### 3. API Architecture

- **Framework**: FastAPI with Gunicorn for production
- **Authentication**: JWT-based with role support
- **Compatibility**: Ollama-compatible endpoints for easy integration

### 4. Frontend

- **Framework**: React with TypeScript
- **Package Manager**: **BUN (REQUIRED)** - Always use Bun for all frontend operations
- **Build Tool**: Vite with Bun runtime
- **Visualization**: Sigma.js for graph rendering
- **State Management**: React hooks with context
- **Internationalization**: i18next for multi-language support

## Common Pitfalls and Solutions

### 1. Embedding Format Issues

**Pitfall**: Assuming all endpoints return base64-encoded embeddings
**Solution**: Always check format and handle both base64 and raw arrays

### 2. Async/Await Patterns

**Pitfall**: Calling methods on coroutines instead of awaited results
**Solution**: Always await coroutines before accessing their methods

### 3. Data Model Evolution

**Pitfall**: Breaking changes when removing fields from dataclasses
**Solution**: Filter deprecated fields during deserialization instead of breaking existing stored data

### 4. Concurrency Issues

**Pitfall**: Inconsistent lock key generation causing deadlocks
**Solution**: Always sort keys for deterministic lock ordering

### 5. Event Loop Management

**Pitfall**: Event loop mismatches during shutdown
**Solution**: Implement timeout and specific error handling for loop issues

## Performance Considerations

### 1. Query Context Building

- **Algorithm**: Linear gradient weighted polling for fair resource allocation
- **Optimization**: Round-robin merging to eliminate mode bias
- **Pattern**: Smart chunk selection based on cross-entity occurrence

### 2. Graph Operations

- **Optimization**: Batch operations where possible (NetworkX sketch below)
- **Pattern**: Use appropriate indexing for large datasets
- **Consideration**: Memory usage with large graphs

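For the default NetworkX backend, batching looks roughly like this (entity names and attributes are illustrative):

```python
import networkx as nx

graph = nx.Graph()
# One call over an iterable is cheaper than many single add_node/add_edge calls.
graph.add_nodes_from(
    [("entity_a", {"type": "person"}), ("entity_b", {"type": "organization"})]
)
graph.add_edges_from([("entity_a", "entity_b", {"relation": "works_for"})])
```
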
### 3. LLM Request Management

- **Pattern**: Priority-based queue for request ordering
- **Optimization**: Connection pooling and retry mechanisms
- **Consideration**: Rate limiting and cost management

## Security Patterns

### 1. Authentication

- **Implementation**: JWT tokens with role-based access (see the sketch below)
- **Pattern**: Stateless authentication with configurable expiration
- **Security**: Proper token validation and refresh mechanisms

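A minimal issue/verify sketch of this pattern, using PyJWT purely for illustration; LightRAG's actual auth helpers, secret handling, and refresh flow live in the API layer and may differ:

```python
import datetime

import jwt  # PyJWT, illustrative only

SECRET_KEY = "change-me"  # load from configuration in real deployments


def issue_token(username: str, role: str, expire_minutes: int = 60) -> str:
    payload = {
        "sub": username,
        "role": role,
        "exp": datetime.datetime.now(datetime.timezone.utc)
        + datetime.timedelta(minutes=expire_minutes),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm="HS256")


def verify_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on failure.
    return jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
```
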
### 2. API Security

- **Pattern**: Input validation and sanitization
- **Implementation**: FastAPI dependency injection for auth (see the sketch below)
- **Consideration**: Rate limiting and abuse prevention

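A minimal sketch of auth via FastAPI dependency injection; the route, token URL, and reuse of the `verify_token` helper sketched above are assumptions, not LightRAG's actual endpoints:

```python
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="login")  # illustrative token URL


async def get_current_user(token: str = Depends(oauth2_scheme)) -> dict:
    try:
        return verify_token(token)  # e.g. the JWT verify helper sketched above
    except Exception:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid or expired token",
        )


@app.get("/health")
async def health(user: dict = Depends(get_current_user)) -> dict:
    return {"status": "ok", "user": user.get("sub")}
```
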
## Maintenance Guidelines

### 1. Memory Bank Updates

- **Trigger**: After significant changes or bug fixes
- **Pattern**: Update activeContext.md and progress.md
- **Purpose**: Maintain development continuity

### 2. Configuration Management

- **Pattern**: Environment-based configuration with sensible defaults
- **Implementation**: .env files with example templates
- **Consideration**: Security for production deployments

### 3. Error Handling

- **Pattern**: Comprehensive logging with appropriate levels
- **Implementation**: Graceful degradation where possible
- **Consideration**: User-friendly error messages

## Project Evolution Notes

The project has evolved from experimental to production-ready status. Key milestones:

- **Early 2025**: Basic RAG implementation
- **Mid 2025**: Multiple storage backends and LLM providers
- **July 2025**: Major query optimization and algorithm improvements
- **August 2025**: Production-ready stable state

The system now supports enterprise-level deployments with comprehensive functionality across all components.

.gitignore (vendored, 1 line changed)

@@ -73,4 +73,3 @@ test_*
 # Cline files
 memory-bank
 memory-bank/
-.clinerules

Agments.md (new file, 108 lines)
@@ -0,0 +1,108 @@

# Project Guide for AI Agents

This Agments.md file provides operational guidance for AI assistants collaborating on the LightRAG codebase. Use it to understand the repository layout, preferred tooling, and expectations for adding or modifying functionality.

## Core Purpose

LightRAG is an advanced Retrieval-Augmented Generation (RAG) framework designed to enhance information retrieval and generation through graph-based knowledge representation. The project aims to provide a more intelligent and efficient way to process and retrieve information from documents by leveraging both graph structures and vector embeddings.

## Project Structure for Navigation

- `/lightrag`: Core Python package (ingestion, querying, storage abstractions, utilities). Key modules include `lightrag/lightrag.py` orchestration, `operate.py` pipeline helpers, `kg/` storage backends, `llm/` bindings, and `utils*.py`.
- `/lightrag/api`: FastAPI service for LightRAG, run with Gunicorn in production. The application, auth, and WebUI assets live in `lightrag_server.py`; routers live in `routers/`, shared helpers in `utils_api.py`, and Gunicorn startup logic in `run_with_gunicorn.py`.
- `/lightrag_webui`: React 19 + TypeScript + Tailwind front-end built with Vite/Bun. Uses component folders under `src/` and configuration via `env.*.sample`.
- `/inputs`, `/rag_storage`, `/dickens`, `/temp`: Data directories. Treat contents as mutable working data; avoid committing generated artefacts.
- `/tests` and root-level `test_*.py`: Integration and smoke-test scripts (graph storage, API endpoints, behaviour regressions). Many expect specific environment variables or services.
- `/docs`, `/k8s-deploy`, `docker-compose.yml`: Deployment notes, Kubernetes manifests, and container orchestration helpers.
- Configuration templates: `.env.example`, `config.ini.example`, `lightrag.service.example`. Copy and adapt for local runs without committing secrets.

## Environment Setup and Tooling

- Python 3.10 is required. Recommended bootstrap:

```bash
# Development installation
python -m venv .venv
source .venv/bin/activate
pip install -e .
pip install -e .[api]

# Start API server
lightrag-server

# Production deployment
lightrag-gunicorn --workers 3
```

- Duplicate `.env.example` to `.env` and adjust storage, LLM, and reranker bindings. Mirror `config.ini.example` when customising pipeline defaults.
- Storage backends (PostgreSQL, Redis, Neo4j, Milvus, etc.) are selected via `LIGHTRAG_*` environment variables. Ensure connection URLs and credentials are in place before running ingestion or tests.
- CLI entry points: `python -m lightrag` for package usage, `lightrag-server` (or `uvicorn lightrag.api.lightrag_server:app --reload`) for the API, `lightrag-gunicorn` for production Gunicorn runs.
- Front-end work: install dependencies with `bun install` (preferred) or `npm install`, then use the `bunx --bun vite` commands defined in `package.json`.

## Frontend Development

- **Package Manager**: **ALWAYS USE BUN** - Never use npm or yarn unless Bun is unavailable

**Commands**:

- `bun install` - Install dependencies
- `bun run dev` - Start development server
- `bun run build` - Build for production
- `bun run lint` - Run linting
- `bun test` - Run tests
- `bun run preview` - Preview production build

- **Pattern**: All frontend operations must use Bun commands
- **Testing**: Use `bun test` for all frontend testing

## Coding Conventions

- Embrace type hints, dataclasses, and asynchronous patterns already present in `lightrag/lightrag.py` and the storage implementations. Keep long-running jobs within `asyncio` flows and reuse helpers from `lightrag.operate`.
- Honour abstraction boundaries: new storage providers should inherit from the relevant base classes in `lightrag.base`; reusable logic belongs in `utils.py`/`utils_graph.py`.
- Use `lightrag.utils.logger` (not bare `print`) and let environment toggles (`VERBOSE`, `LOG_LEVEL`) control verbosity.
- Respect configuration defaults in `lightrag/constants.py`, extending with care and synchronising related documentation when behaviour changes.
- API additions should live under `lightrag/api/routers`, leverage dependency injection from `utils_api.py`, and return structured responses consistent with existing handlers.
- Front-end code should remain in TypeScript, rely on functional React components with hooks, and follow Tailwind utility style. Co-locate component-specific styles; reserve custom CSS for cases Tailwind cannot cover.
- Storage Backends
  - **Default**: In-memory with file persistence
  - **Production Options**: PostgreSQL, MongoDB, Redis, Neo4j
  - **Pattern**: Abstract storage interface with multiple implementations
- Lock Key Generation Consistency
  - **Critical Pattern**: Always sort parameters for lock key generation to prevent deadlocks
  - **Example**: `sorted_key_parts = sorted([src, tgt])` before creating the lock key
  - **Why**: Prevents different lock keys for the same relationship pair processed in different orders
  - **Apply to**: Any function that uses locks with multiple parameters
- Priority Queue Implementation (a minimal sketch follows this list)
  - **Pattern**: Use priority-based task queuing for LLM requests
  - **Benefits**: Critical operations get higher priority
  - **Implementation**: Lower priority values = higher priority

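A minimal sketch of the priority-queue idea with `asyncio.PriorityQueue`; LightRAG's real implementation wraps its LLM request functions and differs in detail:

```python
import asyncio
import itertools

_order = itertools.count()  # tie-breaker so equal priorities never compare callables


async def llm_worker(queue: asyncio.PriorityQueue) -> None:
    while True:
        priority, _, request = await queue.get()  # lowest priority value served first
        try:
            await request()                        # request is an async callable
        finally:
            queue.task_done()


# Usage sketch: critical extraction calls outrank background re-ranking.
# queue = asyncio.PriorityQueue()
# await queue.put((0, next(_order), run_entity_extraction))
# await queue.put((5, next(_order), run_rerank))
```
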
## Testing and Quality Gates

- Run Python tests with `python -m pytest tests` for the FastAPI suite, and execute targeted scripts (for example `python tests/test_graph_storage.py`, `python test_lock_fix.py`) when touching related functionality. Many scripts require running backing services; check `.env` for prerequisites.
- Perform linting via `ruff check .` (configured in `pyproject.toml`) and address warnings. For formatting, match the existing style rather than introducing new tools.
- Front-end validation: `bun test`, `bunx --bun vite build`, and `bunx --bun vite lint`. The `*-no-bun` scripts exist if Bun is unavailable.
- When touching deployment assets, ensure `docker-compose config` or the relevant `kubectl` dry-runs succeed before submitting changes.

## Runtime and Operational Notes

- Knowledge ingestion expects documents inside `inputs/` and writes intermediate state to `rag_storage/`. Keep these directories gitignored; never check in private data or large artefacts.
- Use `operate.py` helpers (e.g., `chunking_by_token_size`, `extract_entities`) to keep ingestion behaviour consistent. If extending the pipeline, document new steps in `docs/` and update any affected CLI usage.
- The API and core package rely on `.env`/`config.ini` being co-located with the current working directory. Scripts such as `tests/test_graph_storage.py` dynamically read these files; ensure they are in sync.

## Contribution Checklist

1. Run `pre-commit run --all-files` before submitting a PR.
2. Describe the change, affected modules, and operational impact in your PR. Mention any new environment knobs or storage dependencies.
3. Link related issues or discussions when available.
4. Confirm all applicable checks pass (`ruff`, pytest suite, targeted integration scripts, front-end build/tests when touched).
5. Capture screenshots or GIFs for front-end or API changes that affect user-visible behaviour.
6. Keep each PR focused on a single concern and update documentation (`README.md`, `docs/`, `.env.example`) when behaviour or configuration changes.

Follow this playbook to keep LightRAG contributions predictable, testable, and production-ready.

@@ -310,6 +310,14 @@ POSTGRES_IVFFLAT_LISTS=100
 # POSTGRES_SSL_ROOT_CERT=/path/to/ca-cert.pem
 # POSTGRES_SSL_CRL=/path/to/crl.pem
+
+### PostgreSQL Server Settings (for Supabase Supavisor)
+# Use this to pass extra options to the PostgreSQL connection string.
+# For Supabase, you might need to set it like this:
+# POSTGRES_SERVER_SETTINGS="options=reference%3D[project-ref]"
+
+# Default is 100 set to 0 to disable
+# POSTGRES_STATEMENT_CACHE_SIZE=100
+
 ### Neo4j Configuration
 NEO4J_URI=neo4j+s://xxxxxxxx.databases.neo4j.io
 NEO4J_USERNAME=neo4j

@@ -65,7 +65,7 @@ class NetworkXStorage(BaseGraphStorage):
             )
         else:
             logger.info(
-                f"[{self.workspace}] Created new empty graph fiel: {self._graphml_xml_file}"
+                f"[{self.workspace}] Created new empty graph file: {self._graphml_xml_file}"
             )
         self._graph = preloaded_graph or nx.Graph()

@@ -79,6 +79,12 @@ class PostgreSQLDB:
         self.hnsw_ef = config.get("hnsw_ef")
         self.ivfflat_lists = config.get("ivfflat_lists")
+
+        # Server settings
+        self.server_settings = config.get("server_settings")
+
+        # Statement LRU cache size (keep as-is, allow None for optional configuration)
+        self.statement_cache_size = config.get("statement_cache_size")

         if self.user is None or self.password is None or self.database is None:
             raise ValueError("Missing database user, password, or database")

@@ -165,6 +171,15 @@ class PostgreSQLDB:
             "max_size": self.max,
         }
+
+        # Only add statement_cache_size if it's configured
+        if self.statement_cache_size is not None:
+            connection_params["statement_cache_size"] = int(
+                self.statement_cache_size
+            )
+            logger.info(
+                f"PostgreSQL, statement LRU cache size set as: {self.statement_cache_size}"
+            )

         # Add SSL configuration if provided
         ssl_context = self._create_ssl_context()
         if ssl_context is not None:

@@ -178,6 +193,24 @@ class PostgreSQLDB:
             connection_params["ssl"] = False
             logger.info(f"PostgreSQL, SSL mode set to: {self.ssl_mode}")

+        # Add server settings if provided
+        if self.server_settings:
+            try:
+                settings = {}
+                # The format is expected to be a query string, e.g., "key1=value1&key2=value2"
+                pairs = self.server_settings.split("&")
+                for pair in pairs:
+                    if "=" in pair:
+                        key, value = pair.split("=", 1)
+                        settings[key] = value
+                if settings:
+                    connection_params["server_settings"] = settings
+                    logger.info(f"PostgreSQL, Server settings applied: {settings}")
+            except Exception as e:
+                logger.warning(
+                    f"PostgreSQL, Failed to parse server_settings: {self.server_settings}, error: {e}"
+                )
+
         self.pool = await asyncpg.create_pool(**connection_params)  # type: ignore

         # Ensure VECTOR extension is available

@@ -833,8 +866,8 @@ class PostgreSQLDB:
                 # Execute the migration
                 alter_sql = f"""
-                ALTER TABLE {migration['table']}
-                ALTER COLUMN {migration['column']} TYPE {migration['new_type']}
+                ALTER TABLE {migration["table"]}
+                ALTER COLUMN {migration["column"]} TYPE {migration["new_type"]}
                 """

                 await self.execute(alter_sql)

@@ -1464,6 +1497,15 @@ class ClientManager:
                     config.get("postgres", "ivfflat_lists", fallback="100"),
                 )
             ),
+            # Server settings for Supabase
+            "server_settings": os.environ.get(
+                "POSTGRES_SERVER_SETTINGS",
+                config.get("postgres", "server_options", fallback=None),
+            ),
+            "statement_cache_size": os.environ.get(
+                "POSTGRES_STATEMENT_CACHE_SIZE",
+                config.get("postgres", "statement_cache_size", fallback=None),
+            ),
         }

     @classmethod

@@ -579,6 +579,7 @@ async def openai_embed(
     base_url: str | None = None,
     api_key: str | None = None,
     client_configs: dict[str, Any] | None = None,
+    token_tracker: Any | None = None,
 ) -> np.ndarray:
     """Generate embeddings for a list of texts using OpenAI's API.

@@ -590,6 +591,7 @@ async def openai_embed(
         client_configs: Additional configuration options for the AsyncOpenAI client.
             These will override any default configurations but will be overridden by
             explicit parameters (api_key, base_url).
+        token_tracker: Optional token usage tracker for monitoring API usage.

     Returns:
         A numpy array of embeddings, one per input text.

@@ -608,6 +610,14 @@ async def openai_embed(
     response = await openai_async_client.embeddings.create(
         model=model, input=texts, encoding_format="base64"
     )
+
+    if token_tracker and hasattr(response, "usage"):
+        token_counts = {
+            "prompt_tokens": getattr(response.usage, "prompt_tokens", 0),
+            "total_tokens": getattr(response.usage, "total_tokens", 0),
+        }
+        token_tracker.add_usage(token_counts)
+
     return np.array(
         [
             np.array(dp.embedding, dtype=np.float32)