Commit graph

15 commits

Author SHA1 Message Date
BukeLy
ad68624d02 feat: PostgreSQL model isolation and auto-migration
Why this change is needed:
PostgreSQL vector storage needs model isolation to prevent dimension
conflicts when different workspaces use different embedding models.
Without this, the first workspace locks the vector dimension for all
subsequent workspaces, causing failures.

How it solves it:
- Implements dynamic table naming with model suffix: {table}_{model}_{dim}d
- Adds setup_table() method mirroring Qdrant's approach for consistency
- Implements 4-branch migration logic: both exist -> warn, only new -> use,
  neither -> create, only legacy -> migrate
- Batch migration: 500 records/batch (same as Qdrant)
- No automatic rollback to support idempotent re-runs

Impact:
- PostgreSQL tables now isolated by embedding model and dimension
- Automatic data migration from legacy tables on startup
- Backward compatible: model_name=None defaults to "unknown"
- All SQL operations use dynamic table names

Testing:
- 6 new tests for PostgreSQL migration (100% pass)
- Tests cover: naming, migration trigger, scenarios 1-3
- 3 additional scenario tests added for Qdrant completeness

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-19 22:54:37 +08:00
yangdx
5da82bb096 Add pre-commit to pytest dependencies and format test code
• Add pre-commit to pytest extra deps
• Update lock file dependencies
2025-11-18 00:42:04 +08:00
yangdx
b7b8d15632 Refactor pytest dependencies into separate optional group
- Extract pytest deps to own group
- Reference pytest group in evaluation
- Add pytest config to pyproject.toml
- Update uv.lock with new structure
2025-11-17 23:52:13 +08:00
yangdx
c246eff725 Improve docling integration with macOS compatibility and CLI flag
- Add --docling CLI flag for easier setup
- Add numpy version constraints
- Exclude docling on macOS (fork-safety)
2025-11-17 12:54:32 +08:00
yangdx
fa9206d69a Update uv.lock 2025-11-17 12:54:32 +08:00
yangdx
c434879c7a Replace PyPDF2 with pypdf for PDF processing
- Update import from PyPDF2 to pypdf
- Change dependency to pypdf>=6.1.0
- Update all requirements files
- Remove PyPDF2 from lock file
- Use modern pypdf library
2025-11-17 12:54:32 +08:00
yangdx
e8f5f57ec7 Update qdrant-client minimum version from 1.7.0 to 1.11.0
• Bump qdrant-client to >=1.11.0
• Update pyproject.toml dependency
• Update requirements files
• Sync uv.lock with new version
• Maintain <2.0.0 upper bound
2025-11-10 11:54:48 +08:00
yangdx
1334b3d896 Update uv.lock 2025-11-09 02:32:30 +08:00
yangdx
c46c1b26a9 Add pycryptodome dependency for PDF encryption support 2025-10-31 01:49:42 +08:00
yangdx
783e2f3b1f Update uv.lock 2025-10-30 11:18:10 +08:00
Anush008
8584980e3a
refactor: Qdrant Multi-tenancy (Include staged)
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-10-26 09:58:24 +05:30
yangdx
1642710fd4 Remove dotenv dependency from project 2025-10-17 15:09:29 +08:00
yangdx
c61b7bd4f8 Remove torch and transformers from offline dependency groups 2025-10-16 15:14:25 +08:00
yangdx
f2b6a068e3 Remove docling dependency and related packages from project
* Remove docling from pyproject.toml
* Update requirements files
* Clean up uv.lock dependencies
* Reduce offline docker image size
2025-10-16 05:15:29 +08:00
yangdx
466de2070d Migrate from pip to uv package manager for faster builds
• Replace pip with uv in Dockerfile
• Remove constraints-offline.txt
• Add uv.lock for dependency pinning
• Use uv sync --frozen for builds
2025-10-16 01:21:03 +08:00