- Documented major changes after pulling from upstream (HKUDS/LightRAG), focusing on multi-tenant support, security hardening, and RLS/RBAC. - Created concise documentation under docs/diff_hku, including migration guides and security audits. - Enumerated unmerged upstream commits and summarized substantive features and fixes. - Outlined next steps for DB migrations, CI tests, and potential cherry-picking of upstream fixes.
2 KiB
Top-level summary (1 page)
Scope: compare HEAD (this copy / local branch) against upstream HKUDS/LightRAG upstream/main.
What changed — short bullet map:
- Multi-tenant framework introduced across the stack (API dependencies, middleware, models, service layer, tenant-aware RAG manager, DB helper scripts).
- Security hardening (strict multi-tenant mode, role checks, validations, postgres RLS script, explicit permission checks).
- New fastapi dependencies, tenant/membership APIs, and many documentation and e2e test additions.
- Behavioral and defaults changes: LLM & embedding bindings defaults changed (ollama -> openai), parsing improvements for test frameworks, auth logging.
- Large public asset removal from webui bundle — indicates a release tidy-up or rework of front-end assets.
High-level impact (quick):
-
Functionality: This version adds full multi-tenant support (tenant + KB lifecycle + RBAC) and tenant-scoped RAG instances.
-
Security: Adds critical fixes (RLS, permission checks, tenant isolation). Good progress — but requires DB migrations to be present and tests that ensure RLS + function-based access validation are in place.
-
Backwards compatibility: The code attempts compatibility (optional modes, fallback legacy flows) but DEFAULTS changed (LLM binding), and some endpoints remain public — review before production.
Recommended immediate priorities:
-
Add/verify DB migrations for has_tenant_access, user_tenant_memberships, tenants, knowledge_bases; test RLS policies on a staging DB before production. (High)
-
Rotate default secrets and remove the documented default token secret — ensure environment variables are required or fail-safe. (High)
-
Add automated multi-tenant isolation tests that run against Postgres with RLS enabled. (High)
-
Confirm non-tenant endpoints remain safe (public listing endpoints, login flows) and log audit events. (Medium)
Next: see technical_diffs.md for file-level notes, security_audit.md for prioritized security items, and migration_guide.md for concrete deploy steps.