style: fix lint errors (trailing whitespace and formatting)

This commit is contained in:
BukeLy 2025-11-20 01:41:23 +08:00
parent 65ff9b32bd
commit 6bef40766d
8 changed files with 671 additions and 231 deletions

376
MIGRATION_ANALYSIS.md Normal file
View file

@@ -0,0 +1,376 @@
# Vector Model Isolation - Migration Scenario Coverage Analysis
## Date
2025-11-20
## Key Findings
### ⚠️ Critical Issue: Qdrant Legacy Naming Incompatibility
#### Problem Description
**Qdrant naming in the old version (main branch):**
```python
# Get legacy namespace for data migration from old version
if effective_workspace:
    self.legacy_namespace = f"{effective_workspace}_{self.namespace}"
else:
    self.legacy_namespace = self.namespace
self.final_namespace = f"lightrag_vdb_{self.namespace}"
```
Example:
- workspace="my_workspace", namespace="chunks"
- legacy_namespace = "my_workspace_chunks"
- final_namespace = "lightrag_vdb_chunks"
**Qdrant naming in the new version (feature branch):**
```python
# Legacy collection name (without model suffix, for migration)
self.legacy_namespace = f"lightrag_vdb_{self.namespace}"
# New naming scheme with model isolation
self.final_namespace = f"lightrag_vdb_{self.namespace}_{model_suffix}"
```
Example:
- workspace="my_workspace", namespace="chunks"
- legacy_namespace = "lightrag_vdb_chunks" ❌ does not match the old version!
- final_namespace = "lightrag_vdb_chunks_text_embedding_ada_002_1536d"
#### Impact Analysis
1. **Migration fails when upgrading from an old version**
   - An old-version user's collection name may be `my_workspace_chunks` or `chunks`
   - The new version attempts to migrate from `lightrag_vdb_chunks`
   - Result: the legacy collection is not found, so automatic migration is impossible
2. **Risk of data loss**
   - After upgrading, users may no longer see their old data
   - Data must be migrated manually
### ✅ PostgreSQL Migration Logic Is Correct
PostgreSQL's migration logic is straightforward:
**Old version:**
- Table names are fixed, e.g. `lightrag_vdb_chunks`
**New version:**
- legacy_table_name = `lightrag_vdb_chunks`
- table_name = `lightrag_vdb_chunks_{model}_{dim}d`
This logic is correct because the old PostgreSQL implementation already used fixed table names.
---
## Test Coverage Analysis
### Scenarios Covered by Current E2E Tests
| Test Name | Database | Scenario | Covered Case |
|---------|--------|---------|---------|
| `test_legacy_migration_postgres` | PostgreSQL | Migration from legacy table | Case 4: Legacy→New |
| `test_legacy_migration_qdrant` | Qdrant | Migration from legacy collection | Case 4: Legacy→New |
| `test_multi_instance_postgres` | PostgreSQL | Multi-model coexistence | Case 3: Create new table |
| `test_multi_instance_qdrant` | Qdrant | Multi-model coexistence | Case 3: Create new collection |
### Missing Test Scenarios
#### Uncovered Cases
1. ❌ **Case 1: Legacy and new coexist (warning only)**
   - Scenario: both legacy and new exist
   - Expected: only a warning is logged, no migration
   - Status: not tested
2. ❌ **Case 2: Already migrated**
   - Scenario: only new exists, legacy already deleted
   - Expected: check indexes, operate normally
   - Status: not tested
3. ❌ **Upgrade from a real old version**
   - Scenario: user upgrades from an old LightRAG release
   - Qdrant: the legacy name is `{workspace}_{namespace}` or `{namespace}`
   - Expected: correctly detected and migrated
   - Status: **not covered; a compatibility problem exists!**
#### Uncovered Edge Cases
1. ❌ **Empty-data migration**
   - Scenario: legacy exists but is empty
   - Expected: skip migration, create the new table/collection
   - Status: the logic exists in code but is untested
2. ❌ **Rollback on migration failure**
   - Scenario: migration is interrupted midway
   - Expected: an exception is raised and data consistency is preserved
   - Status: not tested
3. ❌ **Workspace isolation verification**
   - Scenario: multiple workspaces within the same collection/table
   - Expected: data is fully isolated
   - Status: not explicitly tested
4. ❌ **Model switch scenario**
   - Scenario: user switches the embedding model
   - Expected: a new table/collection is created; old data is preserved
   - Status: not tested
---
## Backward Compatibility Analysis
### ✅ PostgreSQL - Fully Compatible
- Old-version table name: `lightrag_vdb_chunks`
- New-version detection: `legacy_table_name = "lightrag_vdb_chunks"`
- Conclusion: **fully compatible**
### ❌ Qdrant - Incompatible!
#### Compatibility Problem Details
**Scenario 1: old-version users with a workspace**
```python
# Old version (main)
workspace = "prod"
legacy_namespace = "prod_chunks"  # name generated by the old version
final_namespace = "lightrag_vdb_chunks"

# New version (feature)
legacy_namespace = "lightrag_vdb_chunks"  # legacy name the new version expects
final_namespace = "lightrag_vdb_chunks_text_embedding_ada_002_1536d"

# Result: the "prod_chunks" collection is never found; migration fails
```
**Scenario 2: old-version users without a workspace**
```python
# Old version (main)
workspace = None
legacy_namespace = "chunks"  # name generated by the old version
final_namespace = "lightrag_vdb_chunks"

# New version (feature)
legacy_namespace = "lightrag_vdb_chunks"  # legacy name the new version expects
final_namespace = "lightrag_vdb_chunks_text_embedding_ada_002_1536d"

# Result: the "chunks" collection is never found; migration fails
```
#### Affected Scope
1. **All Qdrant users with a workspace** - data inaccessible after upgrade
2. **All Qdrant users without a workspace** - data inaccessible after upgrade
3. **Only users whose old version already used `lightrag_vdb_{namespace}` as the collection name are unaffected**
---
## Code Style Consistency Check
### ✅ Overall Code Style
1. **Unified migration-logic pattern**
   - PostgreSQL and Qdrant use the same 4-case logic
   - Both provide `setup_table`/`setup_collection` static methods
   - ✅ Consistent
2. **Naming conventions**
   - Both use `legacy_*` and `final_*` / `table_name` names
   - Both use the same `model_suffix` generation logic
   - ✅ Consistent
3. **Log format**
   - Both use the same log format and levels
   - Both report clear migration progress
   - ✅ Consistent
4. **Error handling**
   - Both define dedicated migration exception classes
   - Both include migration verification logic
   - ✅ Consistent
5. **Batch size**
   - PostgreSQL: 500 records/batch
   - Qdrant: 500 records/batch
   - ✅ Consistent
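The shared batching pattern can be sketched storage-agnostically (an illustrative sketch; `scroll_fn` mimics the `(points, next_offset)` shape of Qdrant's scroll API, and `upsert_fn` stands in for the destination write):

```python
def migrate_in_batches(scroll_fn, upsert_fn, batch_size=500):
    """Copy all records from a legacy store to a new one, batch by batch."""
    offset = None
    migrated = 0
    while True:
        points, offset = scroll_fn(offset, batch_size)
        if not points:
            break
        upsert_fn(points)
        migrated += len(points)
        if offset is None:  # no more pages
            break
    return migrated
```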
### ⚠️ Areas for Improvement
1. **Inconsistent comment style**
   - Some comments are in Chinese
   - Some comments are in English
   - Recommendation: standardize on English
2. **Test naming**
   - Some tests have Chinese docstrings
   - Recommendation: keep bilingual (Chinese/English) docstrings
---
## Suggested Fixes
### 1. Fix the Qdrant Compatibility Problem
#### Option A: Support multiple legacy naming patterns (recommended)
```python
async def _find_legacy_collection(
    client: QdrantClient,
    workspace: str,
    namespace: str,
) -> str | None:
    """
    Try to find legacy collection with various naming patterns
    for backward compatibility.

    Returns:
        Collection name if found, None otherwise
    """
    # Pattern 1: New legacy format (from previous feature branch)
    candidate1 = f"lightrag_vdb_{namespace}"
    # Pattern 2: Old format with workspace
    candidate2 = f"{workspace}_{namespace}" if workspace else None
    # Pattern 3: Old format without workspace
    candidate3 = namespace

    # Try each pattern
    for candidate in [candidate1, candidate2, candidate3]:
        if candidate and client.collection_exists(candidate):
            logger.info(f"Found legacy collection: {candidate}")
            return candidate
    return None
```
Then, in `setup_collection`:
```python
# Find legacy collection with backward compatibility
legacy_collection = await _find_legacy_collection(
    client, workspace, namespace
)
legacy_exists = legacy_collection is not None

# Case 4: Only legacy exists - Migrate data
if legacy_exists and not new_collection_exists:
    logger.info(
        f"Qdrant: Migrating data from legacy collection '{legacy_collection}'"
    )
    # ... migration logic uses legacy_collection
```
#### Option B: Document manual migration steps
If automatic detection is not desired, at minimum provide clear manual migration documentation.
### 2. Add the Missing Tests
#### High-priority tests
```python
@pytest.mark.asyncio
async def test_qdrant_legacy_workspace_migration():
    """Test migration from old workspace-based naming"""
    # Create an old-format collection: "workspace_chunks"
    # Verify the new code detects and migrates it
    pass


@pytest.mark.asyncio
async def test_case1_both_exist_warning():
    """Test Case 1: Both legacy and new exist"""
    # Verify only a warning is logged, no migration happens
    pass


@pytest.mark.asyncio
async def test_case2_only_new_exists():
    """Test Case 2: Only new table/collection exists"""
    # Verify migration is skipped and indexes are checked
    pass


@pytest.mark.asyncio
async def test_empty_legacy_migration():
    """Test migration when legacy is empty"""
    # Verify data migration is skipped; only the new table/collection is created
    pass


@pytest.mark.asyncio
async def test_workspace_isolation():
    """Test workspace isolation within same collection/table"""
    # Verify data from different workspaces is fully isolated
    pass
```
#### Medium-priority tests
```python
@pytest.mark.asyncio
async def test_model_switch_scenario():
    """Test switching embedding models"""
    # Verify a new table/collection is created after switching models
    pass


@pytest.mark.asyncio
async def test_migration_failure_handling():
    """Test migration error handling"""
    # Verify exception handling when migration fails
    pass
```
### 3. Improve Documentation
The Migration Guide must clearly cover:
1. **Special considerations for Qdrant users**
2. **How to migrate old collections manually**
3. **Backup recommendations before upgrading**
4. **Steps to verify a successful migration**
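For step 4, a minimal check is to compare exact record counts before dropping the legacy collection (a sketch; `client` is assumed to be a `qdrant_client.QdrantClient`, whose `count(..., exact=True)` call is the same one the migration code uses):

```python
def counts_match(legacy_count: int, new_count: int) -> bool:
    # Every legacy record must have landed in the new collection
    return new_count >= legacy_count

def verify_qdrant_migration(client, legacy_name: str, new_name: str) -> bool:
    # exact=True forces a precise count instead of an estimate
    legacy_count = client.count(collection_name=legacy_name, exact=True).count
    new_count = client.count(collection_name=new_name, exact=True).count
    return counts_match(legacy_count, new_count)
```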
---
## Summary
### Key Issues
1. ❌ **Severe Qdrant backward-compatibility problem** - must be fixed!
2. ❌ **Insufficient test coverage** - key scenarios are untested
3. ✅ **PostgreSQL migration logic is correct**
4. ✅ **Code style is largely consistent**
### Suggested Priorities
1. **P0 - Fix immediately**
   - Fix the Qdrant backward-compatibility problem
   - Add compatibility tests
2. **P1 - Before merging the PR**
   - Add Case 1 and Case 2 tests
   - Add a workspace isolation test
   - Update the Migration Guide
3. **P2 - Follow-up improvements**
   - Add edge-case tests
   - Standardize comment language
   - Add more detailed error messages
### Risk Assessment
- **Not fixing Qdrant compatibility**: 🔴 High risk - users lose access to data after upgrading
- **Insufficient test coverage**: 🟡 Medium risk - unexpected issues may surface in production
- **Incomplete documentation**: 🟡 Medium risk - users do not know how to upgrade correctly
---
## 下一步行动
1. 与用户确认是否接受方案A推荐或方案B
2. 实施选定的修复方案
3. 补充关键测试
4. 更新文档
5. 重新运行所有E2E测试
6. 准备发布

View file

@@ -15,11 +15,10 @@ Requirements:
- Qdrant or PostgreSQL for vector storage (optional, defaults to NanoVectorDB)
"""
import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed
from lightrag.utils import EmbeddingFunc, logger
from lightrag.utils import EmbeddingFunc
# Set your API key
# os.environ["OPENAI_API_KEY"] = "your-api-key-here"
@@ -33,22 +32,19 @@ async def scenario_1_new_workspace_with_explicit_model():
- Qdrant: lightrag_vdb_chunks_text_embedding_3_large_3072d
- PostgreSQL: LIGHTRAG_VDB_CHUNKS_text_embedding_3_large_3072d
"""
print("\n" + "="*80)
print("\n" + "=" * 80)
print("Scenario 1: New Workspace with Explicit Model Name")
print("="*80)
print("=" * 80)
# Define custom embedding function with explicit model name
async def my_embedding_func(texts: list[str]):
return await openai_embed(
texts,
model="text-embedding-3-large"
)
return await openai_embed(texts, model="text-embedding-3-large")
# Create EmbeddingFunc with model_name specified
embedding_func = EmbeddingFunc(
embedding_dim=3072,
func=my_embedding_func,
model_name="text-embedding-3-large" # Explicit model name
model_name="text-embedding-3-large", # Explicit model name
)
rag = LightRAG(
@@ -64,8 +60,7 @@ async def scenario_1_new_workspace_with_explicit_model():
# Query
result = await rag.aquery(
"What does LightRAG support?",
param=QueryParam(mode="hybrid")
"What does LightRAG support?", param=QueryParam(mode="hybrid")
)
print(f"\nQuery Result: {result[:200]}...")
@@ -85,9 +80,9 @@ async def scenario_2_legacy_migration():
- Old: lightrag_vdb_chunks (no suffix)
- New: lightrag_vdb_chunks_text_embedding_ada_002_1536d (with suffix)
"""
print("\n" + "="*80)
print("\n" + "=" * 80)
print("Scenario 2: Automatic Migration from Legacy Format")
print("="*80)
print("=" * 80)
# Step 1: Simulate legacy workspace (no model_name)
print("\n[Step 1] Creating legacy workspace without model_name...")
@@ -98,7 +93,7 @@ async def scenario_2_legacy_migration():
# Legacy: No model_name specified
legacy_embedding = EmbeddingFunc(
embedding_dim=1536,
func=legacy_embedding_func
func=legacy_embedding_func,
# model_name not specified → uses "unknown" as fallback
)
@@ -121,7 +116,7 @@ async def scenario_2_legacy_migration():
new_embedding = EmbeddingFunc(
embedding_dim=1536,
func=legacy_embedding_func,
model_name="text-embedding-ada-002" # Now explicitly specified
model_name="text-embedding-ada-002", # Now explicitly specified
)
rag_new = LightRAG(
@@ -138,8 +133,7 @@ async def scenario_2_legacy_migration():
# Verify data is still accessible
result = await rag_new.aquery(
"What is the legacy data?",
param=QueryParam(mode="hybrid")
"What is the legacy data?", param=QueryParam(mode="hybrid")
)
print(f"\nQuery Result: {result[:200] if result else 'No results'}...")
@@ -160,9 +154,9 @@ async def scenario_3_multiple_models_coexistence():
- Workspace A: lightrag_vdb_chunks_bge_small_768d
- Workspace B: lightrag_vdb_chunks_bge_large_1024d
"""
print("\n" + "="*80)
print("\n" + "=" * 80)
print("Scenario 3: Multiple Models Coexistence")
print("="*80)
print("=" * 80)
# Workspace A: Small embedding model (768 dimensions)
print("\n[Workspace A] Using bge-small model (768d)...")
@@ -175,7 +169,7 @@ async def scenario_3_multiple_models_coexistence():
embedding_a = EmbeddingFunc(
embedding_dim=1536, # text-embedding-3-small dimension
func=embedding_func_small,
model_name="text-embedding-3-small"
model_name="text-embedding-3-small",
)
rag_a = LightRAG(
@@ -199,7 +193,7 @@ async def scenario_3_multiple_models_coexistence():
embedding_b = EmbeddingFunc(
embedding_dim=3072, # text-embedding-3-large dimension
func=embedding_func_large,
model_name="text-embedding-3-large"
model_name="text-embedding-3-large",
)
rag_b = LightRAG(
@@ -217,12 +211,10 @@ async def scenario_3_multiple_models_coexistence():
print("\n[Verification] Querying both workspaces...")
result_a = await rag_a.aquery(
"What model does workspace use?",
param=QueryParam(mode="hybrid")
"What model does workspace use?", param=QueryParam(mode="hybrid")
)
result_b = await rag_b.aquery(
"What model does workspace use?",
param=QueryParam(mode="hybrid")
"What model does workspace use?", param=QueryParam(mode="hybrid")
)
print(f"\nWorkspace A Result: {result_a[:100] if result_a else 'No results'}...")
@@ -238,9 +230,9 @@ async def main():
"""
Run all scenarios to demonstrate model isolation features
"""
print("\n" + "="*80)
print("\n" + "=" * 80)
print("LightRAG Multi-Model Vector Storage Isolation Demo")
print("="*80)
print("=" * 80)
print("\nThis demo shows how LightRAG automatically handles:")
print("1. ✅ Automatic model suffix generation")
print("2. ✅ Seamless data migration from legacy format")
@@ -256,9 +248,9 @@ async def main():
# Scenario 3: Multiple models coexistence
await scenario_3_multiple_models_coexistence()
print("\n" + "="*80)
print("\n" + "=" * 80)
print("✅ All scenarios completed successfully!")
print("="*80)
print("=" * 80)
print("\n📝 Key Takeaways:")
print("- Always specify `model_name` in EmbeddingFunc for clear model tracking")
@@ -271,6 +263,7 @@ async def main():
except Exception as e:
print(f"\n❌ Error: {e}")
import traceback
traceback.print_exc()

View file

@@ -66,6 +66,45 @@ def workspace_filter_condition(workspace: str) -> models.FieldCondition:
)
def _find_legacy_collection(
client: QdrantClient, namespace: str, workspace: str = None
) -> str | None:
"""
Find legacy collection with backward compatibility support.
This function tries multiple naming patterns to locate legacy collections
created by older versions of LightRAG:
1. lightrag_vdb_{namespace} - Current legacy format
2. {workspace}_{namespace} - Old format with workspace (pre-model-isolation)
3. {namespace} - Old format without workspace (pre-model-isolation)
Args:
client: QdrantClient instance
namespace: Base namespace (e.g., "chunks", "entities")
workspace: Optional workspace identifier
Returns:
Collection name if found, None otherwise
"""
# Try multiple naming patterns for backward compatibility
candidates = [
f"lightrag_vdb_{namespace}", # New legacy format
f"{workspace}_{namespace}" if workspace else None, # Old format with workspace
namespace, # Old format without workspace
]
for candidate in candidates:
if candidate and client.collection_exists(candidate):
logger.info(
f"Qdrant: Found legacy collection '{candidate}' "
f"(namespace={namespace}, workspace={workspace or 'none'})"
)
return candidate
return None
@final
@dataclass
class QdrantVectorDBStorage(BaseVectorStorage):
@@ -85,27 +124,37 @@ class QdrantVectorDBStorage(BaseVectorStorage):
def setup_collection(
client: QdrantClient,
collection_name: str,
legacy_namespace: str = None,
namespace: str = None,
workspace: str = None,
**kwargs,
):
"""
Setup Qdrant collection with migration support from legacy collections.
This method now supports backward compatibility by automatically detecting
legacy collections created by older versions of LightRAG using multiple
naming patterns.
Args:
client: QdrantClient instance
collection_name: Name of the new collection
legacy_namespace: Name of the legacy collection (if exists)
namespace: Base namespace (e.g., "chunks", "entities")
workspace: Workspace identifier for data isolation
**kwargs: Additional arguments for collection creation (vectors_config, hnsw_config, etc.)
"""
new_collection_exists = client.collection_exists(collection_name)
legacy_exists = legacy_namespace and client.collection_exists(legacy_namespace)
# Try to find legacy collection with backward compatibility
legacy_collection = (
_find_legacy_collection(client, namespace, workspace) if namespace else None
)
legacy_exists = legacy_collection is not None
# Case 1: Both new and legacy collections exist - Warning only (no migration)
if new_collection_exists and legacy_exists:
logger.warning(
f"Qdrant: Legacy collection '{legacy_namespace}' still exist. Remove it if migration is complete."
f"Qdrant: Legacy collection '{legacy_collection}' still exists. "
f"Remove it if migration is complete."
)
return
@@ -149,13 +198,13 @@ class QdrantVectorDBStorage(BaseVectorStorage):
# Case 4: Only legacy exists - Migrate data
logger.info(
f"Qdrant: Migrating data from legacy collection '{legacy_namespace}'"
f"Qdrant: Migrating data from legacy collection '{legacy_collection}'"
)
try:
# Get legacy collection count
legacy_count = client.count(
collection_name=legacy_namespace, exact=True
collection_name=legacy_collection, exact=True
).count
logger.info(f"Qdrant: Found {legacy_count} records in legacy collection")
@@ -185,7 +234,7 @@ class QdrantVectorDBStorage(BaseVectorStorage):
while True:
# Scroll through legacy data
result = client.scroll(
collection_name=legacy_namespace,
collection_name=legacy_collection,
limit=batch_size,
offset=offset,
with_vectors=True,
@@ -258,7 +307,7 @@ class QdrantVectorDBStorage(BaseVectorStorage):
),
)
logger.info(
f"Qdrant: Migration from '{legacy_namespace}' to '{collection_name}' completed successfully"
f"Qdrant: Migration from '{legacy_collection}' to '{collection_name}' completed successfully"
)
except QdrantMigrationError:
@@ -350,11 +399,11 @@ class QdrantVectorDBStorage(BaseVectorStorage):
)
# Setup collection (create if not exists and configure indexes)
# Pass legacy_namespace and workspace for migration support
# Pass namespace and workspace for backward-compatible migration support
QdrantVectorDBStorage.setup_collection(
self._client,
self.final_namespace,
legacy_namespace=self.legacy_namespace,
namespace=self.namespace,
workspace=self.effective_workspace,
vectors_config=models.VectorParams(
size=self.embedding_func.embedding_dim,

View file

@@ -176,7 +176,7 @@ class UnifiedLock(Generic[T]):
enable_output=self._enable_logging,
)
else:
direct_log(
direct_log(
f"== Lock == Process {self._pid}: Main lock {self._name} is None (async={self._is_async})",
level="WARNING",
enable_output=self._enable_logging,

View file

@@ -2,33 +2,54 @@ import pytest
from lightrag.base import BaseVectorStorage
from lightrag.utils import EmbeddingFunc
def test_base_vector_storage_integrity():
# Just checking if we can import and inspect the class
assert hasattr(BaseVectorStorage, '_generate_collection_suffix')
assert hasattr(BaseVectorStorage, '_get_legacy_collection_name')
assert hasattr(BaseVectorStorage, '_get_new_collection_name')
assert hasattr(BaseVectorStorage, "_generate_collection_suffix")
assert hasattr(BaseVectorStorage, "_get_legacy_collection_name")
assert hasattr(BaseVectorStorage, "_get_new_collection_name")
# Verify methods raise NotImplementedError
class ConcreteStorage(BaseVectorStorage):
async def query(self, *args, **kwargs): pass
async def upsert(self, *args, **kwargs): pass
async def delete_entity(self, *args, **kwargs): pass
async def delete_entity_relation(self, *args, **kwargs): pass
async def get_by_id(self, *args, **kwargs): pass
async def get_by_ids(self, *args, **kwargs): pass
async def delete(self, *args, **kwargs): pass
async def get_vectors_by_ids(self, *args, **kwargs): pass
async def index_done_callback(self): pass
async def drop(self): pass
async def query(self, *args, **kwargs):
pass
async def upsert(self, *args, **kwargs):
pass
async def delete_entity(self, *args, **kwargs):
pass
async def delete_entity_relation(self, *args, **kwargs):
pass
async def get_by_id(self, *args, **kwargs):
pass
async def get_by_ids(self, *args, **kwargs):
pass
async def delete(self, *args, **kwargs):
pass
async def get_vectors_by_ids(self, *args, **kwargs):
pass
async def index_done_callback(self):
pass
async def drop(self):
pass
func = EmbeddingFunc(embedding_dim=128, func=lambda x: x)
storage = ConcreteStorage(namespace="test", workspace="test", global_config={}, embedding_func=func)
storage = ConcreteStorage(
namespace="test", workspace="test", global_config={}, embedding_func=func
)
assert storage._generate_collection_suffix() == "unknown_128d"
with pytest.raises(NotImplementedError):
storage._get_legacy_collection_name()
with pytest.raises(NotImplementedError):
storage._get_new_collection_name()

View file

@@ -1,37 +1,31 @@
import pytest
from lightrag.utils import EmbeddingFunc
def dummy_func(*args, **kwargs):
pass
def test_embedding_func_with_model_name():
func = EmbeddingFunc(
embedding_dim=1536,
func=dummy_func,
model_name="text-embedding-ada-002"
embedding_dim=1536, func=dummy_func, model_name="text-embedding-ada-002"
)
assert func.get_model_identifier() == "text_embedding_ada_002_1536d"
def test_embedding_func_without_model_name():
func = EmbeddingFunc(
embedding_dim=768,
func=dummy_func
)
func = EmbeddingFunc(embedding_dim=768, func=dummy_func)
assert func.get_model_identifier() == "unknown_768d"
def test_model_name_sanitization():
func = EmbeddingFunc(
embedding_dim=1024,
func=dummy_func,
model_name="models/text-embedding-004" # Contains special chars
model_name="models/text-embedding-004", # Contains special chars
)
assert func.get_model_identifier() == "models_text_embedding_004_1024d"
def test_model_name_with_uppercase():
func = EmbeddingFunc(
embedding_dim=512,
func=dummy_func,
model_name="My-Model-V1"
)
assert func.get_model_identifier() == "my_model_v1_512d"
def test_model_name_with_uppercase():
func = EmbeddingFunc(embedding_dim=512, func=dummy_func, model_name="My-Model-V1")
assert func.get_model_identifier() == "my_model_v1_512d"

View file

@@ -1,13 +1,9 @@
import os
import pytest
from unittest.mock import MagicMock, patch, AsyncMock, call
from unittest.mock import patch, AsyncMock
import numpy as np
from lightrag.utils import EmbeddingFunc
from lightrag.kg.postgres_impl import (
PGVectorStorage,
_pg_table_exists,
_pg_create_table,
PostgreSQLMigrationError,
)
from lightrag.namespace import NameSpace
@@ -56,29 +52,25 @@ def mock_embedding_func():
async def embed_func(texts, **kwargs):
return np.array([[0.1] * 768 for _ in texts])
func = EmbeddingFunc(
embedding_dim=768,
func=embed_func,
model_name="test_model"
)
func = EmbeddingFunc(embedding_dim=768, func=embed_func, model_name="test_model")
return func
@pytest.mark.asyncio
async def test_postgres_table_naming(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_postgres_table_naming(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""Test if table name is correctly generated with model suffix"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# Verify table name contains model suffix
@@ -91,20 +83,20 @@ async def test_postgres_table_naming(mock_client_manager, mock_pg_db, mock_embed
@pytest.mark.asyncio
async def test_postgres_migration_trigger(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_postgres_migration_trigger(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""Test if migration logic is triggered correctly"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# Setup mocks for migration scenario
@@ -132,9 +124,12 @@ async def test_postgres_migration_trigger(mock_client_manager, mock_pg_db, mock_
mock_pg_db.query = AsyncMock(side_effect=mock_query)
with patch("lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists), \
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()):
with (
patch(
"lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists
),
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()),
):
# Initialize storage (should trigger migration)
await storage.initialize()
@@ -144,29 +139,32 @@ async def test_postgres_migration_trigger(mock_client_manager, mock_pg_db, mock_
@pytest.mark.asyncio
async def test_postgres_no_migration_needed(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_postgres_no_migration_needed(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""Test scenario where new table already exists (no migration needed)"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# Mock: new table already exists
async def mock_table_exists(db, table_name):
return table_name == storage.table_name
with patch("lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists), \
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create:
with (
patch(
"lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists
),
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create,
):
await storage.initialize()
# Verify no table creation was attempted
@@ -174,7 +172,9 @@ async def test_postgres_no_migration_needed(mock_client_manager, mock_pg_db, moc
@pytest.mark.asyncio
async def test_scenario_1_new_workspace_creation(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_scenario_1_new_workspace_creation(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""
Scenario 1: New workspace creation
@@ -185,31 +185,32 @@ async def test_scenario_1_new_workspace_creation(mock_client_manager, mock_pg_db
"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
embedding_func = EmbeddingFunc(
embedding_dim=3072,
func=mock_embedding_func.func,
model_name="text-embedding-3-large"
model_name="text-embedding-3-large",
)
storage = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=embedding_func,
workspace="new_workspace"
workspace="new_workspace",
)
# Mock: neither table exists
async def mock_table_exists(db, table_name):
return False
with patch("lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists), \
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create:
with (
patch(
"lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists
),
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create,
):
await storage.initialize()
# Verify table name format
@@ -218,11 +219,15 @@ async def test_scenario_1_new_workspace_creation(mock_client_manager, mock_pg_db
# Verify new table creation was called
mock_create.assert_called_once()
call_args = mock_create.call_args
assert call_args[0][1] == storage.table_name # table_name is second positional arg
assert (
call_args[0][1] == storage.table_name
) # table_name is second positional arg
@pytest.mark.asyncio
async def test_scenario_2_legacy_upgrade_migration(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_scenario_2_legacy_upgrade_migration(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""
Scenario 2: Upgrade from legacy version
@@ -233,22 +238,20 @@ async def test_scenario_2_legacy_upgrade_migration(mock_client_manager, mock_pg_
"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
embedding_func = EmbeddingFunc(
embedding_dim=1536,
func=mock_embedding_func.func,
model_name="text-embedding-ada-002"
model_name="text-embedding-ada-002",
)
storage = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=embedding_func,
workspace="legacy_workspace"
workspace="legacy_workspace",
)
# Mock: only legacy table exists
@@ -257,7 +260,11 @@ async def test_scenario_2_legacy_upgrade_migration(mock_client_manager, mock_pg_
# Mock: legacy table has 50 records
mock_rows = [
{"id": f"legacy_id_{i}", "content": f"legacy_content_{i}", "workspace": "legacy_workspace"}
{
"id": f"legacy_id_{i}",
"content": f"legacy_content_{i}",
"workspace": "legacy_workspace",
}
for i in range(50)
]
@@ -279,9 +286,12 @@ async def test_scenario_2_legacy_upgrade_migration(mock_client_manager, mock_pg_
mock_pg_db.query = AsyncMock(side_effect=mock_query)
with patch("lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists), \
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create:
with (
patch(
"lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists
),
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create,
):
await storage.initialize()
# Verify table name contains ada-002
@@ -293,7 +303,9 @@ async def test_scenario_2_legacy_upgrade_migration(mock_client_manager, mock_pg_
@pytest.mark.asyncio
async def test_scenario_3_multi_model_coexistence(mock_client_manager, mock_pg_db, mock_embedding_func):
async def test_scenario_3_multi_model_coexistence(
mock_client_manager, mock_pg_db, mock_embedding_func
):
"""
Scenario 3: Multiple embedding models coexist
@@ -304,23 +316,19 @@ async def test_scenario_3_multi_model_coexistence(mock_client_manager, mock_pg_d
"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
# Workspace A: uses bge-small (768d)
embedding_func_a = EmbeddingFunc(
embedding_dim=768,
func=mock_embedding_func.func,
model_name="bge-small"
embedding_dim=768, func=mock_embedding_func.func, model_name="bge-small"
)
storage_a = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=embedding_func_a,
workspace="workspace_a"
workspace="workspace_a",
)
# Workspace B: uses bge-large (1024d)
@@ -328,16 +336,14 @@ async def test_scenario_3_multi_model_coexistence(mock_client_manager, mock_pg_d
return np.array([[0.1] * 1024 for _ in texts])
embedding_func_b = EmbeddingFunc(
embedding_dim=1024,
func=embed_func_b,
model_name="bge-large"
embedding_dim=1024, func=embed_func_b, model_name="bge-large"
)
storage_b = PGVectorStorage(
namespace=NameSpace.VECTOR_STORE_CHUNKS,
global_config=config,
embedding_func=embedding_func_b,
workspace="workspace_b"
workspace="workspace_b",
)
# Verify different table names
@@ -349,9 +355,12 @@ async def test_scenario_3_multi_model_coexistence(mock_client_manager, mock_pg_d
async def mock_table_exists(db, table_name):
return False
with patch("lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists), \
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create:
with (
patch(
"lightrag.kg.postgres_impl._pg_table_exists", side_effect=mock_table_exists
),
patch("lightrag.kg.postgres_impl._pg_create_table", AsyncMock()) as mock_create,
):
# Initialize both storages
await storage_a.initialize()
await storage_b.initialize()

View file

@@ -1,9 +1,9 @@
import os
import pytest
from unittest.mock import MagicMock, patch, AsyncMock, call
from unittest.mock import MagicMock, patch, AsyncMock
import numpy as np
from lightrag.utils import EmbeddingFunc
from lightrag.kg.qdrant_impl import QdrantVectorDBStorage, compute_mdhash_id_for_qdrant
from lightrag.kg.qdrant_impl import QdrantVectorDBStorage
# Mock QdrantClient
@pytest.fixture
@@ -18,6 +18,7 @@ def mock_qdrant_client():
client.get_collection.return_value = collection_info
yield client
# Mock get_data_init_lock to avoid async lock issues in tests
@pytest.fixture(autouse=True)
def mock_data_init_lock():
@@ -26,36 +27,32 @@ def mock_data_init_lock():
mock_lock.return_value = mock_lock_ctx
yield mock_lock
# Mock Embedding function
@pytest.fixture
def mock_embedding_func():
async def embed_func(texts, **kwargs):
return np.array([[0.1] * 768 for _ in texts])
func = EmbeddingFunc(
embedding_dim=768,
func=embed_func,
model_name="test-model"
)
func = EmbeddingFunc(embedding_dim=768, func=embed_func, model_name="test-model")
return func
@pytest.mark.asyncio
async def test_qdrant_collection_naming(mock_qdrant_client, mock_embedding_func):
"""Test if collection name is correctly generated with model suffix"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = QdrantVectorDBStorage(
namespace="chunks",
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# Verify collection name contains model suffix
expected_suffix = "test_model_768d"
assert expected_suffix in storage.final_namespace
@@ -64,99 +61,97 @@ async def test_qdrant_collection_naming(mock_qdrant_client, mock_embedding_func)
# Verify legacy namespace (should not include workspace, just the base collection name)
assert storage.legacy_namespace == "lightrag_vdb_chunks"
@pytest.mark.asyncio
async def test_qdrant_migration_trigger(mock_qdrant_client, mock_embedding_func):
"""Test if migration logic is triggered correctly"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = QdrantVectorDBStorage(
namespace="chunks",
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# Setup mocks for migration scenario
# 1. New collection does not exist
mock_qdrant_client.collection_exists.side_effect = lambda name: name == storage.legacy_namespace
mock_qdrant_client.collection_exists.side_effect = (
lambda name: name == storage.legacy_namespace
)
# 2. Legacy collection exists and has data
mock_qdrant_client.count.return_value.count = 100
# 3. Mock scroll for data migration
from qdrant_client import models
mock_point = MagicMock()
mock_point.id = "old_id"
mock_point.vector = [0.1] * 768
mock_point.payload = {"content": "test"}
# First call returns points, second call returns empty (end of scroll)
mock_qdrant_client.scroll.side_effect = [
([mock_point], "next_offset"),
([], None)
]
mock_qdrant_client.scroll.side_effect = [([mock_point], "next_offset"), ([], None)]
# Initialize storage (triggers migration)
await storage.initialize()
# Verify migration steps
# 1. Legacy count checked
mock_qdrant_client.count.assert_any_call(
collection_name=storage.legacy_namespace,
exact=True
collection_name=storage.legacy_namespace, exact=True
)
# 2. New collection created
mock_qdrant_client.create_collection.assert_called()
# 3. Data scrolled from legacy
assert mock_qdrant_client.scroll.call_count >= 1
call_args = mock_qdrant_client.scroll.call_args_list[0]
assert call_args.kwargs['collection_name'] == storage.legacy_namespace
assert call_args.kwargs['limit'] == 500
assert call_args.kwargs["collection_name"] == storage.legacy_namespace
assert call_args.kwargs["limit"] == 500
# 4. Data upserted to new
mock_qdrant_client.upsert.assert_called()
# 5. Payload index created
mock_qdrant_client.create_payload_index.assert_called()
@pytest.mark.asyncio
async def test_qdrant_no_migration_needed(mock_qdrant_client, mock_embedding_func):
"""Test scenario where new collection already exists"""
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = QdrantVectorDBStorage(
namespace="chunks",
global_config=config,
embedding_func=mock_embedding_func,
workspace="test_ws"
workspace="test_ws",
)
# New collection exists and Legacy exists (warning case)
# or New collection exists and Legacy does not exist (normal case)
# Mocking case where both exist to test logic flow but without migration
# Logic in code:
# Logic in code:
# Case 1: Both exist -> Warning only
# Case 2: Only new exists -> Ensure index
# Let's test Case 2: Only new collection exists
mock_qdrant_client.collection_exists.side_effect = lambda name: name == storage.final_namespace
mock_qdrant_client.collection_exists.side_effect = (
lambda name: name == storage.final_namespace
)
# Initialize
await storage.initialize()
# Should check index but NOT migrate
# In Qdrant implementation, Case 2 calls get_collection
mock_qdrant_client.get_collection.assert_called_with(storage.final_namespace)
@@ -167,8 +162,11 @@ async def test_qdrant_no_migration_needed(mock_qdrant_client, mock_embedding_fun
# Tests for scenarios described in design document (Lines 606-649)
# ============================================================================
@pytest.mark.asyncio
async def test_scenario_1_new_workspace_creation(mock_qdrant_client, mock_embedding_func):
async def test_scenario_1_new_workspace_creation(
mock_qdrant_client, mock_embedding_func
):
"""
场景1新建workspace
预期直接创建lightrag_vdb_chunks_text_embedding_3_large_3072d
@@ -177,21 +175,19 @@ async def test_scenario_1_new_workspace_creation(mock_qdrant_client, mock_embedd
large_model_func = EmbeddingFunc(
embedding_dim=3072,
func=mock_embedding_func.func,
model_name="text-embedding-3-large"
model_name="text-embedding-3-large",
)
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = QdrantVectorDBStorage(
namespace="chunks",
global_config=config,
embedding_func=large_model_func,
workspace="test_new"
workspace="test_new",
)
# Case 3: Neither legacy nor new collection exists
@@ -205,18 +201,27 @@ async def test_scenario_1_new_workspace_creation(mock_qdrant_client, mock_embedd
assert storage.final_namespace == expected_collection
# Verify create_collection was called with correct name
create_calls = [call for call in mock_qdrant_client.create_collection.call_args_list]
create_calls = [
call for call in mock_qdrant_client.create_collection.call_args_list
]
assert len(create_calls) > 0
assert create_calls[0][0][0] == expected_collection or create_calls[0].kwargs.get('collection_name') == expected_collection
assert (
create_calls[0][0][0] == expected_collection
or create_calls[0].kwargs.get("collection_name") == expected_collection
)
# Verify no migration was attempted
mock_qdrant_client.scroll.assert_not_called()
print(f"✅ Scenario 1: New workspace created with collection '{expected_collection}'")
print(
f"✅ Scenario 1: New workspace created with collection '{expected_collection}'"
)
@pytest.mark.asyncio
async def test_scenario_2_legacy_upgrade_migration(mock_qdrant_client, mock_embedding_func):
async def test_scenario_2_legacy_upgrade_migration(
mock_qdrant_client, mock_embedding_func
):
"""
场景2从旧版本升级
已存在lightrag_vdb_chunks无后缀
@@ -226,34 +231,34 @@ async def test_scenario_2_legacy_upgrade_migration(mock_qdrant_client, mock_embe
ada_func = EmbeddingFunc(
embedding_dim=1536,
func=mock_embedding_func.func,
model_name="text-embedding-ada-002"
model_name="text-embedding-ada-002",
)
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
storage = QdrantVectorDBStorage(
namespace="chunks",
global_config=config,
embedding_func=ada_func,
workspace="test_legacy"
workspace="test_legacy",
)
legacy_collection = storage.legacy_namespace
new_collection = storage.final_namespace
# Case 4: Only legacy collection exists
mock_qdrant_client.collection_exists.side_effect = lambda name: name == legacy_collection
mock_qdrant_client.collection_exists.side_effect = (
lambda name: name == legacy_collection
)
# Mock legacy data
mock_qdrant_client.count.return_value.count = 150
# Mock scroll results (simulate migration in batches)
from qdrant_client import models
mock_points = []
for i in range(10):
point = MagicMock()
@@ -263,10 +268,7 @@ async def test_scenario_2_legacy_upgrade_migration(mock_qdrant_client, mock_embe
mock_points.append(point)
# First batch returns points, second batch returns empty
mock_qdrant_client.scroll.side_effect = [
(mock_points, "offset1"),
([], None)
]
mock_qdrant_client.scroll.side_effect = [(mock_points, "offset1"), ([], None)]
# Initialize (triggers migration)
await storage.initialize()
@@ -278,8 +280,7 @@ async def test_scenario_2_legacy_upgrade_migration(mock_qdrant_client, mock_embe
# Verify migration steps
# 1. Check legacy count
mock_qdrant_client.count.assert_any_call(
collection_name=legacy_collection,
exact=True
collection_name=legacy_collection, exact=True
)
# 2. Create new collection
@@ -288,14 +289,16 @@ async def test_scenario_2_legacy_upgrade_migration(mock_qdrant_client, mock_embe
# 3. Scroll legacy data
scroll_calls = [call for call in mock_qdrant_client.scroll.call_args_list]
assert len(scroll_calls) >= 1
assert scroll_calls[0].kwargs['collection_name'] == legacy_collection
assert scroll_calls[0].kwargs["collection_name"] == legacy_collection
# 4. Upsert to new collection
upsert_calls = [call for call in mock_qdrant_client.upsert.call_args_list]
assert len(upsert_calls) >= 1
assert upsert_calls[0].kwargs['collection_name'] == new_collection
assert upsert_calls[0].kwargs["collection_name"] == new_collection
print(f"✅ Scenario 2: Legacy data migrated from '{legacy_collection}' to '{expected_new_collection}'")
print(
f"✅ Scenario 2: Legacy data migrated from '{legacy_collection}' to '{expected_new_collection}'"
)
@pytest.mark.asyncio
@@ -304,14 +307,13 @@ async def test_scenario_3_multi_model_coexistence(mock_qdrant_client):
场景3多模型并存
预期两个独立的collection互不干扰
"""
# Model A: bge-small with 768d
async def embed_func_a(texts, **kwargs):
return np.array([[0.1] * 768 for _ in texts])
model_a_func = EmbeddingFunc(
embedding_dim=768,
func=embed_func_a,
model_name="bge-small"
embedding_dim=768, func=embed_func_a, model_name="bge-small"
)
# Model B: bge-large with 1024d
@@ -319,16 +321,12 @@ async def test_scenario_3_multi_model_coexistence(mock_qdrant_client):
return np.array([[0.2] * 1024 for _ in texts])
model_b_func = EmbeddingFunc(
embedding_dim=1024,
func=embed_func_b,
model_name="bge-large"
embedding_dim=1024, func=embed_func_b, model_name="bge-large"
)
config = {
"embedding_batch_num": 10,
"vector_db_storage_cls_kwargs": {
"cosine_better_than_threshold": 0.8
}
"vector_db_storage_cls_kwargs": {"cosine_better_than_threshold": 0.8},
}
# Create storage for workspace A with model A
@@ -336,7 +334,7 @@ async def test_scenario_3_multi_model_coexistence(mock_qdrant_client):
namespace="chunks",
global_config=config,
embedding_func=model_a_func,
workspace="workspace_a"
workspace="workspace_a",
)
# Create storage for workspace B with model B
@@ -344,7 +342,7 @@ async def test_scenario_3_multi_model_coexistence(mock_qdrant_client):
namespace="chunks",
global_config=config,
embedding_func=model_b_func,
workspace="workspace_b"
workspace="workspace_b",
)
# Verify: Collection names are different
@@ -362,7 +360,7 @@ async def test_scenario_3_multi_model_coexistence(mock_qdrant_client):
assert storage_a.embedding_func.embedding_dim == 768
assert storage_b.embedding_func.embedding_dim == 1024
print(f"✅ Scenario 3: Multi-model coexistence verified")
print("✅ Scenario 3: Multi-model coexistence verified")
print(f" - Workspace A: {expected_collection_a} (768d)")
print(f" - Workspace B: {expected_collection_b} (1024d)")
print(f" - Collections are independent")
print(" - Collections are independent")
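For reference, the collection-naming convention these tests exercise can be sketched as below. The helper names (`model_suffix`, `collection_name`) and the exact sanitization rule (non-alphanumeric characters replaced with underscores) are assumptions inferred from the expected names in the tests, such as `test_model_768d` and `lightrag_vdb_chunks_text_embedding_3_large_3072d`, not the actual implementation:

```python
import re


def model_suffix(model_name: str, dim: int) -> str:
    # Assumed sanitization: replace non-alphanumeric chars with "_",
    # then append the embedding dimension with a "d" suffix.
    safe = re.sub(r"[^a-zA-Z0-9]", "_", model_name)
    return f"{safe}_{dim}d"


def collection_name(namespace: str, model_name: str, dim: int) -> str:
    # New naming scheme with model isolation; the legacy name is just
    # f"lightrag_vdb_{namespace}" without the model suffix.
    return f"lightrag_vdb_{namespace}_{model_suffix(model_name, dim)}"


print(collection_name("chunks", "text-embedding-3-large", 3072))
# → lightrag_vdb_chunks_text_embedding_3_large_3072d
```

Because the suffix encodes both model name and dimension, the two storages in Scenario 3 (`bge-small`/768d and `bge-large`/1024d) necessarily resolve to distinct collections.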