diff --git a/docs/diff_hku/index.md b/docs/diff_hku/index.md
new file mode 100644
index 00000000..9b99fb0d
--- /dev/null
+++ b/docs/diff_hku/index.md
@@ -0,0 +1,13 @@
+# HKU diff audit — this version vs upstream HKUDS/main
+
+This folder contains a focused, concise multi-document audit comparing the current repo state (local branch) against the original upstream HKUDS/LightRAG `upstream/main`.
+
+Files included:
+
+- summary.md — one-page top-level summary of the most important behavioral, security, and compatibility changes
+- technical_diffs.md — file-by-file highlights and rationale (engineer-friendly)
+- security_audit.md — prioritized security observations and fixes
+- migration_guide.md — precise steps to deploy safely (DB migrations, env flags, testing)
+- tests_needed.md — minimal E2E/unit tests required before release
+
+Use these to quickly evaluate risk and plan next work.
diff --git a/docs/diff_hku/migration_guide.md b/docs/diff_hku/migration_guide.md
new file mode 100644
index 00000000..928cbd3a
--- /dev/null
+++ b/docs/diff_hku/migration_guide.md
@@ -0,0 +1,62 @@
+
+# Migration & deployment guide — required steps (concise)
+
+This document lists the EXACT deployment steps to move from upstream/main (HKUDS) to this multi-tenant version.
+
+Prerequisites
+
+- Back up database and take a snapshot. (Always.)
+
+- Ensure a staging environment exists that mirrors production, with Postgres and identical config.
+
+Database migrations (required)
+
+1) Create the following core objects/tables if not present:
+
+   - tenants (tenant_id PK, name, description, metadata jsonb, created_at, updated_at)
+
+   - knowledge_bases (kb_id, tenant_id FK, name, description, created_at, updated_at)
+
+   - user_tenant_memberships (id, user_id, tenant_id, role, created_at, created_by, updated_at)
+
+2) Create function has_tenant_access(user_id text, tenant_id text, required_role text) RETURNS boolean. (This function is used by TenantService.verify_user_access). Implement role hierarchy checks and super-admin bypass.
+
+3) Install RLS policies from lightrag/kg/postgres_rls.sql and confirm session variable set_config('app.current_tenant', tenant_id, false) is used by application before running queries.
+
+Application config
+
+- Required environment variables (ensure no defaults used in production):
+
+  - TOKEN_SECRET
+
+  - LIGHTRAG_MULTI_TENANT_STRICT (recommended true)
+
+  - LIGHTRAG_REQUIRE_USER_AUTH (recommended true)
+
+  - LIGHTRAG_SUPER_ADMIN_USERS (explicit comma separated admins)
+
+  - LIGHTRAG_API_KEY (if API key flows used)
+
+Runtime wiring
+
+- Ensure database client sets session var on every connection or at least at the start of each request. Example (asyncpg):
+
+    await connection.execute("SELECT set_config('app.current_tenant', $1, false)", tenant_id)
+
+- Middleware: TenantMiddleware will set request.state.tenant_id from token/subdomain — ensure it’s added to the FastAPI middleware list before routers.
+
+- Inject TenantService and TenantRAGManager into app.state or DI container so dependencies.py can access rag_manager. The code expects request.app.state.rag_manager/tenant_service.
+
+Testing (staging) — essential checks before rollout
+
+1) RLS check: create two tenants, insert a doc under tenant A, set session var to tenant B, verify SELECT does not return tenant A rows.
+
+2) Token vs subdomain: test mismatch handling (token tenant != subdomain tenant) — behavior: dependencies.py raises 400 Tenant ID mismatch.
+
+3) Membership: test add/update/remove membership flows and verify has_tenant_access returns expected results.
+
+4) Eviction: test TenantRAGManager cache eviction under concurrent load.
+
+Rollback plan
+
+- If any test fails either disable multi-tenant features (set LIGHTRAG_MULTI_TENANT_STRICT=false and run in single-tenant fallback) or restore DB snapshot and roll back the application to the previous tag.
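Reviewer note on "Runtime wiring": the following is a minimal sketch of the per-request session binding, assuming an asyncpg-style connection whose `execute(query, *args)` is a coroutine. `tenant_session` and `RecordingConnection` are hypothetical names used only for illustration and are not part of this branch; the fake connection stands in for a real one so the sketch runs without a database. Because the third argument to `set_config` is `false`, the value lasts for the whole session, so the sketch clears it before a pooled connection is reused. Whether an empty value means "no tenant" depends on how the policies in postgres_rls.sql read the setting, which this diff does not show.

```python
import asyncio
from contextlib import asynccontextmanager

SET_TENANT_SQL = "SELECT set_config('app.current_tenant', $1, false)"


@asynccontextmanager
async def tenant_session(conn, tenant_id: str):
    """Bind app.current_tenant for the duration of the block (hypothetical helper)."""
    if not tenant_id:
        raise ValueError("tenant_id must be a non-empty string")
    await conn.execute(SET_TENANT_SQL, tenant_id)
    try:
        yield conn
    finally:
        # Clear the session-level value so the next user of a pooled
        # connection does not inherit this tenant.
        await conn.execute(SET_TENANT_SQL, "")


class RecordingConnection:
    """Stand-in for a real connection; it only records what was executed."""

    def __init__(self):
        self.calls = []

    async def execute(self, query, *args):
        self.calls.append((query, args))


async def demo():
    conn = RecordingConnection()
    async with tenant_session(conn, "tenant-a"):
        pass  # tenant-scoped queries would run here
    return conn.calls


if __name__ == "__main__":
    for query, args in asyncio.run(demo()):
        print(query, args)
```

With a real pool, the same context manager would wrap the connection obtained for each request, so every query inside the block runs with the tenant bound.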
diff --git a/docs/diff_hku/security_audit.md b/docs/diff_hku/security_audit.md
new file mode 100644
index 00000000..786e657a
--- /dev/null
+++ b/docs/diff_hku/security_audit.md
@@ -0,0 +1,38 @@
+
+# Security audit — prioritized, concise (1 page)
+
+Priority mapping: P0 (must fix before release) | P1 (fix ASAP) | P2 (recommended)
+
+P0 — Critical blockers
+
+- DB migrations & session: The RLS policies (lightrag/kg/postgres_rls.sql) require DB session variable set for EVERY DB connection. Without application instrumentation to set app.current_tenant per request/connection, RLS will either not be applied (if not set) or break queries. Add DB migrations to create membership and access-check functions (has_tenant_access, has_tenant_membership) used by tenant_service. Test RLS end-to-end before release.
+
+- Secrets: token_secret default is 'lightrag-jwt-default-secret' — this is insecure for production. Require production deployments to set TOKEN_SECRET env var or fail-start. Rotate/replace default secret in examples to a placeholder that forces change.
+
+P1 — High risk, serious but not immediately blocking
+
+- Sensitive fallback modes: get_tenant_context_optional will allow fallback to global RAG unless MULTI_TENANT_STRICT_MODE is true. Be explicit in config and document the leakage risk. Default is to allow fallback; recommend setting default to strict=true for enterprise builds and document consequences.
+
+- Super-admin handling: SUPER_ADMIN_USERS is read from env/config and defaults to "admin". Empty env var means no super-admins — but code treats absence as allowing a default 'admin'. Clarify and make explicit: empty => no super-admins; set to '*' for wide admin (explicit opt-in).
+
+P2 — Improvements and hygiene
+
+- Authorization logging: dependencies.py and auth.py add debug/warning logs. Avoid logging full JWTs or long token prefixes; ensure logs are safe and don't leak PII.
+
+- Middleware subdomain parsing is naive and can accidentally interpret hosts. Add allowlist of trusted domains and thorough unit tests for host parsing edge-cases (IP, localhost, proxies, forwarded headers).
+
+- validate_identifier / validate_working_directory — good protections. Ensure storage systems that accept slugs vs UUIDs are consistent and documented.
+
+Testing required (minimum):
+
+- RLS + session binding tests (P0)
+
+- Permission/role matrix tests (get_tenant_context, check_permission, get_admin_context) for admin/editor/viewer/owner (P1)
+
+- Tenant lifecycle and membership flows (create_tenant add_user_to_tenant remove/update membership) (P1)
+
+Deployment notes:
+
+- During deployment, provision a migrations step to install tenants/knowledge_bases tables and membership function. Tie the DB connection pool to a per-request session variable setter.
+
+- Run an RLS smoke test immediately after deploying (create two tenants, insert records with different tenant_id, verify queries cannot cross-access when app.current_tenant is set accordingly).
diff --git a/docs/diff_hku/summary.md b/docs/diff_hku/summary.md
new file mode 100644
index 00000000..83a92927
--- /dev/null
+++ b/docs/diff_hku/summary.md
@@ -0,0 +1,31 @@
+# Top-level summary (1 page)
+
+Scope: compare HEAD (this copy / local branch) against upstream HKUDS/LightRAG `upstream/main`.
+
+What changed — short bullet map:
+
+- Multi-tenant framework introduced across the stack (API dependencies, middleware, models, service layer, tenant-aware RAG manager, DB helper scripts).
+- Security hardening (strict multi-tenant mode, role checks, validations, postgres RLS script, explicit permission checks).
+- New fastapi dependencies, tenant/membership APIs, and many documentation and e2e test additions.
+- Behavioral and defaults changes: LLM & embedding bindings defaults changed (ollama -> openai), parsing improvements for test frameworks, auth logging.
+- Large public asset removal from webui bundle — indicates a release tidy-up or rework of front-end assets.
+
+High-level impact (quick):
+
+- Functionality: This version adds full multi-tenant support (tenant + KB lifecycle + RBAC) and tenant-scoped RAG instances.
+
+- Security: Adds critical fixes (RLS, permission checks, tenant isolation). Good progress — but requires DB migrations to be present and tests that ensure RLS + function-based access validation are in place.
+
+- Backwards compatibility: The code attempts compatibility (optional modes, fallback legacy flows) but DEFAULTS changed (LLM binding), and some endpoints remain public — review before production.
+
+Recommended immediate priorities:
+
+1. Add/verify DB migrations for has_tenant_access, user_tenant_memberships, tenants, knowledge_bases; test RLS policies on a staging DB before production. (High)
+
+2. Rotate default secrets and remove the documented default token secret — ensure environment variables are required or fail-safe. (High)
+
+3. Add automated multi-tenant isolation tests that run against Postgres with RLS enabled. (High)
+
+4. Confirm non-tenant endpoints remain safe (public listing endpoints, login flows) and log audit events. (Medium)
+
+Next: see technical_diffs.md for file-level notes, security_audit.md for prioritized security items, and migration_guide.md for concrete deploy steps.
diff --git a/docs/diff_hku/technical_diffs.md b/docs/diff_hku/technical_diffs.md
new file mode 100644
index 00000000..f2beeaa5
--- /dev/null
+++ b/docs/diff_hku/technical_diffs.md
@@ -0,0 +1,52 @@
+# Technical diffs — concise file-level audit
+
+Below are the most impactful changes and why they matter. Each entry has: file, what changed, immediate risk/impact and recommended next action.
+
+
+- lightrag/api/dependencies.py (NEW)
+  - Adds tenant extraction logic, token+API-key handling, default resolution and a set of dependencies: get_tenant_context, get_tenant_context_optional, get_tenant_context_no_kb, check_permission, get_admin_context.
+  - Impact: central piece enabling tenant isolation. Critical path for auth and RBAC.
+  - Note: heavy use of request.app.state.rag_manager and tenant_service — the presence and shape must match runtime wiring.
+  - Action: add unit tests for each dependency (happy and failure paths), and document API headers (X-Tenant-ID, X-KB-ID, X-API-Key).
+
+- lightrag/api/middleware/tenant.py (NEW)
+  - Sets request.state.tenant_id early (subdomain or JWT reading). Non-blocking on invalid token (delegates to dependencies).
+  - Impact: helpful early-set context; but missing domain allowlist and safe checks for public domain names; risks if subdomain parsing is naive.
+  - Action: add config-driven allowed domains and test for IP/localhost cases.
+
+- lightrag/models/tenant.py (NEW)
+  - Domain models for Tenant, KnowledgeBase, configs, role/permission mapping.
+  - Impact: canonicalizes multi-tenant metadata and default configuration; important for governance.
+  - Action: add JSON schema/pydantic view and unit tests for to_dict and default values.
+
+- lightrag/services/tenant_service.py (NEW)
+  - Implements tenant/KB lifecycle, membership, RBAC checks, and Postgres-backed queries when available.
+  - Impact: critical authorization and multi-tenant CRUD. Calls to DB functions (has_tenant_access) must exist in DB.
+  - Risk: error handling assumes query shapes; must test against asyncpg row shapes (Record vs arrays).
+  - Action: add DB migrations and tests verifying has_tenant_access and membership queries; add strong integration tests for permissions.
+
+- lightrag/tenant_rag_manager.py (NEW)
+  - Tenant-scoped LightRAG instance manager with LRU caching, creation, initialization and eviction.
+  - Impact: performance & isolation — per-tenant storage paths, careful eviction required to avoid resource leaks.
+  - Action: add unit tests for LRU eviction and concurrency (race conditions), measure memory and file handle usage under load.
+
+- lightrag/kg/postgres_rls.sql (NEW)
+  - Adds RLS policies to core tables and helper to set tenant context using set_config('app.current_tenant',..).
+  - Impact: strong DB-level defense-in-depth preventing cross-tenant reads if used end-to-end.
+  - Risk: RLS will break existing queries unless session variable is set for each DB connection; migrations and DB driver instrumentation required.
+  - Action: create a migration and instrument DB connection setup to set session variable immediately per request. Add integration tests.
+
+- lightrag/api/config.py (modified)
+  - New flags: MULTI_TENANT_STRICT_MODE, REQUIRE_USER_AUTH, SUPER_ADMIN_USERS; default LLM/embedding binding changed to openai.
+  - Impact: default behaviour changes, risk of breaking pre-existing deployments expecting ollama; security features toggled by env.
+  - Action: update README and examples; emphasize env vars and warn about default model changes.
+
+- lightrag/api/auth.py (modified)
+  - Adds logging for token validation problems, but still uses default token secret if not changed.
+  - Impact: better debugging but still risky defaults.
+  - Action: require secure token secrets in prod and document default is temporary only for local dev.
+
+- Other: many docs, e2e tests, routers and API surface expanded (tenant_routes.py, membership_routes.py, admin_routes.py), asset removals from webui
+  - Impact: many new surfaces — need a test matrix for public vs tenant-limited endpoints.
+
+Summary: The local branch adds a complete multi-tenant subsystem with DB-level RLS and RBAC. The changes are substantive and beneficial for enterprise deployment, but require database migrations, configuration, runtime wiring and robust tests before release.
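Reviewer note on the lightrag/api/middleware/tenant.py entry: the recommended allowlist could look roughly like the sketch below. It is not the middleware's actual code, which this diff does not show; `tenant_from_host` and the `allowed_domains` parameter are hypothetical names. The idea is to accept a tenant label only when the host is exactly `<label>.<allowed domain>`, and to reject IP literals, bare domains, localhost and nested subdomains.

```python
import ipaddress
import re
from typing import Optional

# Conservative DNS-label pattern: lowercase letters, digits, inner hyphens.
_LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")


def tenant_from_host(host: str, allowed_domains: frozenset) -> Optional[str]:
    """Return the tenant label for '<tenant>.<allowed domain>', else None."""
    hostname = host.strip().lower()
    if hostname.startswith("["):
        return None  # bracketed IPv6 literal, never a tenant host
    if ":" in hostname:
        hostname = hostname.rsplit(":", 1)[0]  # drop the port
    hostname = hostname.rstrip(".")
    try:
        ipaddress.ip_address(hostname)
        return None  # plain IP address
    except ValueError:
        pass
    for domain in allowed_domains:
        suffix = "." + domain
        if hostname.endswith(suffix):
            label = hostname[: -len(suffix)]
            # Exactly one well-formed label in front of the trusted domain.
            if _LABEL_RE.fullmatch(label):
                return label
    return None


if __name__ == "__main__":
    allowed = frozenset({"example.com"})
    for candidate in ("acme.example.com:8000", "127.0.0.1:9621", "acme.evil.com"):
        print(candidate, "->", tenant_from_host(candidate, allowed))
```

Forwarded headers (for example a proxy-supplied host) are deliberately out of scope here; they would need their own trust decision before being passed to a function like this.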
diff --git a/docs/diff_hku/tests_needed.md b/docs/diff_hku/tests_needed.md
new file mode 100644
index 00000000..4681a969
--- /dev/null
+++ b/docs/diff_hku/tests_needed.md
@@ -0,0 +1,30 @@
+
+# Tests & acceptance checklist (minimal but actionable)
+
+Security & isolation
+
+- [ ] RLS enforcement smoke-test (PG): set session app.current_tenant -> verify cross-tenant reads are blocked
+
+- [ ] DB function has_tenant_access unit tests + integration tests using expected query shapes
+
+- [ ] get_tenant_context / get_tenant_context_optional tests verifying header precedence, missing token behavior, and strict-mode behavior
+
+Functional
+
+- [ ] Tenant lifecycle: create tenant, add user as owner, create KB, add documents, delete KB, delete tenant
+
+- [ ] Membership actions: add/remove/update membership and verify effects on access
+
+- [ ] TenantRAGManager: concurrency test that spawns multiple tasks requesting same tenant/kb and verifies a single instance created then evicted properly.
+
+E2E
+
+- [ ] Run provided e2e tests under e2e/ with a Postgres instance that includes the migrations and RLS applied. Verify all pass.
+
+- [ ] Negative tests: attempt to access KB from different tenant using crafted token and assert HTTP 403.
+
+Operational / CI
+
+- [ ] Add CI jobs that run the above tests using a disposable Postgres service in GitHub Actions or local Docker Compose.
+
+If tests above pass -> safe to roll out to staging. If any P0 test fails, block release and revert until fixed.
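Reviewer note on the TenantRAGManager checklist item: the concurrency test could take roughly the shape below. `TenantInstanceCache` is a hypothetical stand-in written for this note, not the real TenantRAGManager, whose interface this diff does not show. The sketch assumes an async factory per (tenant, kb) key, an LRU order kept in an `OrderedDict`, and a single `asyncio.Lock` so concurrent requests for the same key create only one instance.

```python
import asyncio
from collections import OrderedDict


class TenantInstanceCache:
    """Hypothetical LRU cache of per-(tenant, kb) instances."""

    def __init__(self, factory, max_size: int = 2):
        self._factory = factory
        self._max_size = max_size
        self._items = OrderedDict()
        self._lock = asyncio.Lock()

    async def get(self, tenant_id: str, kb_id: str):
        key = (tenant_id, kb_id)
        async with self._lock:
            if key in self._items:
                self._items.move_to_end(key)  # mark as most recently used
                return self._items[key]
            instance = await self._factory(tenant_id, kb_id)
            self._items[key] = instance
            while len(self._items) > self._max_size:
                self._items.popitem(last=False)  # evict least recently used
            return instance

    def keys(self):
        return list(self._items)


async def demo():
    created = []

    async def factory(tenant_id, kb_id):
        await asyncio.sleep(0)  # yield so concurrent callers can interleave
        created.append((tenant_id, kb_id))
        return object()

    cache = TenantInstanceCache(factory, max_size=2)
    results = await asyncio.gather(*(cache.get("t1", "kb1") for _ in range(10)))
    same = all(r is results[0] for r in results)
    await cache.get("t2", "kb1")
    await cache.get("t3", "kb1")  # pushes the cache past max_size
    return same, created, cache.keys()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A single lock keeps the sketch short but serializes creation across all tenants; a per-key lock would reduce contention, and a real manager would also need to release resources held by evicted instances.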
diff --git a/docs/diff_hku/unmerged_upstream.md b/docs/diff_hku/unmerged_upstream.md
new file mode 100644
index 00000000..e1ccceb2
--- /dev/null
+++ b/docs/diff_hku/unmerged_upstream.md
@@ -0,0 +1,81 @@
+# Upstream (HKUDS/main @ f0d67f16) — features & fixes missing from this version
+
+Summary: this file lists new features, bug fixes, CI/docs/tooling changes and important refactors that exist in upstream `HKUDS/LightRAG` main (commit f0d67f16 and its ancestors) but are not merged into this local branch. I grouped changes into functional areas and included short remediation notes where appropriate.
+
+NOTE: upstream/main contains many small dependency bumps and documentation commits; this document focuses on substantive features and functional fixes that affect runtime behavior, storage, security, tooling and testing.
+
+1) Core storage, DB & Postgres improvements
+- Add PostgreSQL vchordrq vector index support and unify vector index creation logic (dev-postgres-vchordrq) — improves Postgres vector indexing semantics and config handling.
+- Add CASCADE to AGE extension creation in Postgres init scripts (avoid failures when recreating extension)
+- Add postgres_impl fixes, retry improvements and support for vchordrq epsilon config when probes empty
+- Postgres RLS-related and storage refinements (note: local branch already added postgres_rls.sql; upstream brings complementary DB/VECTOR engine improvements and fixes). Remediation: merge upstream postgres/vchordrq changes and ensure migration scripts align.
+
+2) Chunking, indexing and document ingestion fixes
+- Fix top_n behavior: limit by documents instead of chunks to avoid over-counting. (important for retrieval ranking)
+- Fix infinite loop when overlap_tokens >= max_tokens and edge-case handling for max_tokens == 1.
+- Add comprehensive tests for chunking logic (multi-token tokenizer, recursive split) and chunking parameters tuning.
+- Add content deduplication check for document insertion endpoints and fix duplicate document response handling to return original track_id. (prevents duplicates and preserves original IDs)
+
+3) Embeddings & LLM / cloud provider support improvements
+- Major improvements in OpenAI/OLLAMA/Azure/Bedrock embedding wrappers and clients:
+  - Allow embedding provider defaults when unspecified
+  - Add configurable embedding token limits and validation
+  - Fix Azure OpenAI compatibility and support various deployments, fallback to AZURE_OPENAI_API_VERSION
+  - Convert OpenAI client to use a stable API and bump minimum version (>=2.0.0)
+  - Add support for structured OpenAI outputs via parsed field
+  - Improve Bedrock error handling and add retry logic/custom exceptions
+  - Additional refactors for embedding function wrapping rules, model param handling and function attribute inheritance
+  - Add helper flags like configurable model parameter to jina_embed
+  - Support async chunking functions for large, async chunkers
+  - Add new LLM support, additions under lightrag/llm (e.g., gemini file added upstream)
+
+4) Document / file extraction improvements
+- DOCX/XLSX handling fixes (preserve table structure, whitespace, column alignment; optimize memory use)
+- Replace PyPDF2 with pypdf for PDF processing (faster, more reliable parsing)
+
+5) Workspace isolation, pipeline status, RAG lifecycle fixes
+- Fix document deletion concurrency control and auto-acquire pipeline when idle.
+- Auto-initialize pipeline status on LightRAG.initialize_storages() (reduces error-prone manual calls)
+- Namespace, workspace handling and locking fixes: improvements to NamespaceLock (ContextVar), default workspace handling, filtering logic, consistent empty workspace handling and many concurrency bug fixes.
+
+6) Web UI — upgrades, feature additions, fixes
+- Large set of dependency upgrades for `lightrag_webui` (vite, react-i18next, plugin-react-swc, syntax highlighter, etc.). Upstream also cleaned duplicate deps and improved build tooling.
+- Add new UI components / improvements (MergeDialog, graph features, translations updates, many components updated).
+- Handle missing WebUI assets gracefully so server startup is not blocked.
+- Add static swagger UI assets for API docs (swagger-ui files added upstream).
+
+7) CI, testing, and developer tooling
+- New/updated GitHub workflows and test runners: tests.yml, improved offline/integration CI markers, Copilot setup steps, docker-build* workflows and improved GitHub Actions versions.
+- Drop older Python versions in test matrices (3.10/3.11 removed; 3.13/3.14 added) — keep CI modern.
+- Add ruff to pytest extras, add pre-commit hooks and refine pytest fixtures and markers.
+- Add many new tests including workspace isolation, chunking tests, overlap validation, postgres retry integration tests, rerank chunking tests, and E2E test improvements.
+
+8) Tools & CLI
+- New helper tools: clean_llm_query_cache.py, migrate_llm_cache.py, download_cache.py and related README docs for cleaning/migrating LLM caches.
+- Add `lightrag-clean-llmqc` console script entrypoint.
+
+9) Docs & deployment support
+- Added docs: FrontendBuildGuide.md, OfflineDeployment.md, UV_LOCK_GUIDE.md and evaluation assets.
+- Added Dockerfile.lite and docker-build-push.sh to support smaller builds and multi-format distribution.
+
+10) KaTeX & math / feature parity
+- Upstream adds KaTeX copy‑tex extension support and mhchem extension for chemistry formulas (enables better formula copying and chemistry rendering). Also fixed KaTeX loading in startup.
+
+11) JSON, sanitization and performance
+- Multiple JSON write/sanitizer enhancements (specialized sanitizers to handle tuples/dict keys/UTF8 errors, optimize sanitization performance) and fixes to avoid memory corruption on migrations.
+
+12) Cloud model & misc improvements
+- Improve cloud model detection/safety, macOS fork-safety check for Gunicorn multiworker cases; many small fixes for cloud model defaults and config.
+
+13) Security / dependency hygiene
+- Remove future dependency and replace passlib usage with direct bcrypt (adopt modern libs)
+
+Actionable remediation checklist (priority):
+- Merge and test upstream changes that affect: chunking, embeddings/LLM wrappers, doc processing, and Postgres vector indexing + RLS compatibility (High).
+- Add or adapt DB migrations to incorporate any upstream schema changes required by tenant features and ensure no conflicts (High).
+- Update CI matrix and tests to incorporate upstream tests (esp. workspace isolation and chunking tests) to verify no regressions (High).
+- Merge Web UI updates separately behind feature flag/workflow (Medium) — major dependency churn.
+
+If you want, I can automatically generate:
+- a full commit-by-commit list (746 commits) in docs/diff_hku/unmerged_upstream_commits.txt (raw) — useful for exhaustive audit.
+- cherry-pick safe/high-priority upstream commits onto this branch and prepare a candidate PR with resolved conflicts.
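Reviewer note on section 2 (chunking): the infinite loop arises because a sliding window advances by `max_tokens - overlap_tokens`, which is zero or negative once the overlap reaches the window size. The sketch below illustrates one way to guard against that by clamping the overlap so the step is at least one token. It is an illustration only: `chunk_tokens` is a hypothetical function, and the upstream fix may validate or raise instead of clamping, since its code is not part of this diff.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[int], max_tokens: int, overlap_tokens: int) -> List[List[int]]:
    """Split tokens into windows of at most max_tokens with bounded overlap."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    if overlap_tokens < 0:
        raise ValueError("overlap_tokens must be >= 0")
    # Clamp the overlap so the window always advances by at least one token.
    # With max_tokens == 1 this forces overlap to 0.
    overlap = min(overlap_tokens, max_tokens - 1)
    step = max_tokens - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


if __name__ == "__main__":
    print(chunk_tokens(list(range(5)), max_tokens=3, overlap_tokens=1))
    print(chunk_tokens(list(range(3)), max_tokens=1, overlap_tokens=5))
```

Clamping keeps ingestion running with a misconfigured overlap, while raising would surface the misconfiguration earlier; which behaviour fits this branch is a design decision to settle when the upstream change is merged.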
diff --git a/docs/diff_hku/unmerged_upstream_commits.txt b/docs/diff_hku/unmerged_upstream_commits.txt
new file mode 100644
index 00000000..e633f8d4
--- /dev/null
+++ b/docs/diff_hku/unmerged_upstream_commits.txt
@@ -0,0 +1,747 @@
+f0d67f16 2025-12-03 yangdx Merge branch 'cohere-rerank'
+9009abed 2025-12-03 yangdx Fix top_n behavior with chunking to limit documents not chunks
+561ba4e4 2025-12-03 yangdx Fix trailing whitespace and update test mocking for rerank module
+8e50eef5 2025-12-02 yangdx Merge branch 'main' into cohere-rerank
+64760216 2025-12-02 yangdx Configure Dependabot schedule with specific times and timezone
+6e2f125a 2025-12-02 Daniel.y Merge pull request #2471 from HKUDS/dependabot/bun/lightrag_webui/frontend-minor-patch-172e1e6fcf
+ddd32f58 2025-12-02 Daniel.y Merge pull request #2470 from HKUDS/dependabot/bun/lightrag_webui/build-tools-939f50a5f3
+68bee74d 2025-12-02 dependabot[bot] Bump the frontend-minor-patch group in /lightrag_webui with 2 updates
+7545fa72 2025-12-02 dependabot[bot] Bump vite in /lightrag_webui in the build-tools group
+13fc9f33 2025-12-02 yangdx Reduce dependabot open pull request limits
+4c775ec5 2025-12-02 Daniel.y Merge pull request #2469 from danielaskdd/fix-track-id
+ed22e094 2025-12-02 yangdx Merge branch 'main' into fix-track-id
+19c16bc4 2025-12-02 yangdx Add content deduplication check for document insertion endpoints
+1f875122 2025-12-02 yangdx Drop Python 3.10 and 3.11 from CI test matrix
+8d28b959 2025-12-02 yangdx Fix duplicate document responses to return original track_id
+381ddfff 2025-12-02 yangdx Bump API version to 0259
+19cae272 2025-12-02 Daniel.y Merge pull request #2463 from HKUDS/dependabot/bun/lightrag_webui/types/node-24.9.2
+dd95813f 2025-12-02 Daniel.y Merge pull request #2465 from HKUDS/dependabot/bun/lightrag_webui/react-markdown-10.1.0
+d5e7b230 2025-12-02 dependabot[bot] Bump @types/node from 22.18.9 to 24.9.2 in /lightrag_webui
+0d89cd26 2025-12-02 Daniel.y Merge pull request #2462 from HKUDS/dependabot/bun/lightrag_webui/faker-js/faker-10.1.0
+a47414f7 2025-12-02 Daniel.y Merge pull request #2461 from HKUDS/dependabot/bun/lightrag_webui/sonner-2.0.7
+dd4c988b 2025-12-02 dependabot[bot] Bump react-markdown from 9.1.0 to 10.1.0 in /lightrag_webui
+67d9455c 2025-12-02 Daniel.y Merge pull request #2466 from HKUDS/dependabot/bun/lightrag_webui/react-i18next-16.2.3
+57d9cc8f 2025-12-02 Daniel.y Merge pull request #2464 from HKUDS/dependabot/bun/lightrag_webui/react-syntax-highlighter-16.1.0
+e20f86a0 2025-12-02 dependabot[bot] Bump react-i18next from 15.7.4 to 16.2.3 in /lightrag_webui
+b38f4dd7 2025-12-02 dependabot[bot] Bump react-syntax-highlighter from 15.6.6 to 16.1.0 in /lightrag_webui
+e547c003 2025-12-02 Daniel.y Merge pull request #2460 from HKUDS/dependabot/bun/lightrag_webui/vite-7.1.12
+b2b5f80b 2025-12-02 Daniel.y Merge pull request #2467 from HKUDS/dependabot/bun/lightrag_webui/build-tools-ecae90f21c
+e429d553 2025-12-02 Daniel.y Merge pull request #2459 from HKUDS/dependabot/bun/lightrag_webui/frontend-minor-patch-9aaf02af10
+7f7ce9d3 2025-12-02 dependabot[bot] Bump i18next in /lightrag_webui in the frontend-minor-patch group
+d3b5cb63 2025-12-02 dependabot[bot] Bump vite from 6.3.6 to 7.1.12 in /lightrag_webui
+964b53e7 2025-12-02 Daniel.y Merge pull request #2458 from HKUDS/dependabot/bun/lightrag_webui/eslint-plugin-react-hooks-7.0.1
+2bb9ec13 2025-12-02 dependabot[bot] Bump eslint-plugin-react-hooks from 5.2.0 to 7.0.1 in /lightrag_webui
+29bd027a 2025-12-02 dependabot[bot] Bump @vitejs/plugin-react-swc
+a8e79a8a 2025-12-02 yangdx Merge remote-tracking branch 'upstream/dependabot/bun/lightrag_webui/react-error-boundary-6.0.0'
+5ca4792c 2025-12-02 dependabot[bot] Bump @faker-js/faker from 9.9.0 to 10.1.0 in /lightrag_webui
+0ca71a57 2025-12-02 dependabot[bot] Bump sonner from 1.7.4 to 2.0.7 in /lightrag_webui
+ea826a38 2025-12-02 yangdx Merge branch 'dependabot/bun/lightrag_webui/vitejs/plugin-react-swc-4.2.0'
+0f045a52 2025-12-02 yangdx Merge branch 'dependabot/bun/lightrag_webui/vitejs/plugin-react-swc-4.2.0' of github.com:HKUDS/LightRAG into dependabot/bun/lightrag_webui/vitejs/plugin-react-swc-4.2.0
+0c2a653c 2025-12-02 yangdx Merge branch 'main' into dependabot/bun/lightrag_webui/vitejs/plugin-react-swc-4.2.0
+bd487a45 2025-12-02 dependabot[bot] Bump @vitejs/plugin-react-swc from 3.11.0 to 4.2.0 in /lightrag_webui
+59b1b58f 2025-12-02 Daniel.y Merge pull request #2456 from HKUDS/dependabot/bun/lightrag_webui/globals-16.5.0
+8cdf8a12 2025-12-02 Daniel.y Merge pull request #2455 from HKUDS/dependabot/bun/lightrag_webui/i18next-25.6.0
+09aa8483 2025-12-02 yangdx Merge branch 'dependabot/bun/lightrag_webui/stylistic/eslint-plugin-js-4.4.1'
+42b09b10 2025-12-02 dependabot[bot] Bump globals from 15.15.0 to 16.5.0 in /lightrag_webui
+883c5dc0 2025-12-02 yangdx Update dependabot config with new groupings and patterns
+1d12f497 2025-12-02 dependabot[bot] Bump i18next from 24.2.3 to 25.6.0 in /lightrag_webui
+459e4ddc 2025-12-02 yangdx Clean up duplicate dependencies in package.json and lock file
+e7966712 2025-12-02 Daniel.y Merge pull request #2452 from HKUDS/dependabot/bun/lightrag_webui/frontend-minor-patch-a28ecac770
+c6c201d7 2025-12-02 Daniel.y Merge branch 'main' into dependabot/bun/lightrag_webui/frontend-minor-patch-a28ecac770
+13a285d4 2025-12-02 Daniel.y Merge pull request #2451 from HKUDS/dependabot/bun/lightrag_webui/build-tools-0944ec6cea
+35c79341 2025-12-02 Daniel.y Merge pull request #2450 from HKUDS/dependabot/bun/lightrag_webui/ui-components-018be29f1c
+ab718218 2025-12-02 Daniel.y Merge pull request #2449 from HKUDS/dependabot/bun/lightrag_webui/react-b0cb288b9e
+9425277f 2025-12-02 yangdx Improve dependabot config with better docs and numpy ignore rule
+9ae1c7fc 2025-12-01 dependabot[bot] Bump react-error-boundary from 5.0.0 to 6.0.0 in /lightrag_webui
+e2431b67 2025-12-01 dependabot[bot] Bump @vitejs/plugin-react-swc from 3.11.0 to 4.2.0 in /lightrag_webui
+1f3d7006 2025-12-01 dependabot[bot] Bump @stylistic/eslint-plugin-js from 3.1.0 to 4.4.1 in /lightrag_webui
+f4acb25c 2025-12-01 dependabot[bot] Bump the frontend-minor-patch group in /lightrag_webui with 6 updates
+245c0c32 2025-12-01 dependabot[bot] Bump the build-tools group in /lightrag_webui with 4 updates
+15bfd9fa 2025-12-01 dependabot[bot] Bump the ui-components group in /lightrag_webui with 7 updates
+587a930b 2025-12-01 dependabot[bot] Bump the react group in /lightrag_webui with 3 updates
+445adfc9 2025-12-02 yangdx Add name to lint-and-format job in GitHub workflow
+d0509d6f 2025-12-02 Daniel.y Merge pull request #2448 from HKUDS/dependabot/github_actions/github-actions-b6ffb444c9
+b2f1de4a 2025-12-02 Daniel.y Merge pull request #2447 from danielaskdd/dependabot
+f93bda58 2025-12-02 yangdx Enable numpy updates in dependabot configuration
+88357675 2025-12-01 dependabot[bot] Bump the github-actions group with 7 updates
+0f19f80f 2025-12-02 yangdx Configure comprehensive Dependabot for Python and frontend dependencies
+ecef842c 2025-12-02 yangdx Update GitHub Actions to use latest versions (v6)
+6fee81f5 2025-12-02 Daniel.y Merge pull request #2435 from cclauss/patch-1
+27805b9a 2025-12-02 Daniel.y Merge pull request #2436 from cclauss/patch-2
+268e4ff6 2025-12-02 yangdx Refactor dependencies and add test extra in pyproject.toml
+2ecf77ef 2025-12-02 yangdx Update help text to use correct gunicorn command with workers flag
+fc44f113 2025-12-02 yangdx Remove future dependency and replace passlib with direct bcrypt
+48b6a6df 2025-12-02 Daniel.y Merge pull request #2446 from danielaskdd/fix-postgres
+d6019c82 2025-12-02 yangdx Add CASCADE to AGE extension creation in PostgreSQL implementation
+607c11c0 2025-12-01 Daniel.y Merge pull request #2443 from danielaskdd/fix-ktax
+3f6423df 2025-12-01 yangdx Fix KaTeX extension loading by moving imports to app startup
+112ed234 2025-12-01 yangdx Bump API version to 0258
+8f4bfbf1 2025-12-01 yangdx Add KaTeX copy-tex extension support for formula copying
+aeaa0b32 2025-12-01 yangdx Add mhchem extension support for chemistry formulas in ChatMessage
+0aa77fdb 2025-11-30 chaohuang-ai Merge pull request #2439 from HKUDS/chaohuang-ai-patch-1
+5c964267 2025-11-30 chaohuang-ai Update README.md
+d2ab7fb2 2025-11-28 Christian Clauss Add Python 3.13 and 3.14 to the testing
+90e38c20 2025-11-28 Christian Clauss Keep GitHub Actions up to date with GitHub's Dependabot
+8eb63d9b 2025-11-28 Daniel.y Merge pull request #2434 from cclauss/patch-1
+b6705449 2025-11-28 Daniel.y Merge pull request #2433 from danielaskdd/fix-jina-embedding
+ea8d55ab 2025-11-28 yangdx Add documentation for embedding provider configuration rules
+90f341d6 2025-11-28 Christian Clauss Fix typos discovered by codespell
+4ab4a7ac 2025-11-28 yangdx Allow embedding models to use provider defaults when unspecified
+881b8d3a 2025-11-28 yangdx Bump API version to 0257
+56e0365c 2025-11-28 yangdx Add configurable model parameter to jina_embed function
+1b02684e 2025-11-28 Daniel.y Merge pull request #2432 from danielaskdd/embedding-example
+97a9dfca 2025-11-28 yangdx Add important note about embedding function wrapping restrictions
+1d07ff7f 2025-11-28 yangdx Update OpenAI and Ollama embedding func examples in README
+6e2946e7 2025-11-28 yangdx Add max_token_size parameter to azure_openai_embed wrapper
+4f12fe12 2025-11-27 yangdx Change entity extraction logging from warning to info level
+a898f054 2025-11-25 palanisd Merge branch 'HKUDS:main' into cohere-rerank
+93d445df 2025-11-25 yangdx Add pipeline status lock function for legacy compatibility
+d2cd1c07 2025-11-25 Daniel.y Merge pull request #2421 from EightyOliveira/fix_catch_order
+777c9179 2025-11-25 yangdx Add Langfuse observability configuration to env.example
+8994c70f 2025-11-25 EightyOliveira fix:exception handling order error
+2539b4e2 2025-11-25 Daniel.y Merge pull request #2418 from danielaskdd/start-without-webui
+48b67d30 2025-11-25 yangdx Handle missing WebUI assets gracefully without blocking server startup
+2832a2ca 2025-11-25 Daniel.y Merge pull request #2417 from danielaskdd/neo4j-retry
+5f91063c 2025-11-25 yangdx Add ruff as dependency to pytest and evaluation extras
+8c4d7a00 2025-11-25 yangdx Refactor: Extract retry decorator to reduce code duplication in Neo4J storage
+5b81ef00 2025-11-24 Daniel.y Merge pull request #2410 from netbrah/create-copilot-setup-steps
+7aaa51cd 2025-11-24 yangdx Add retry decorators to Neo4j read operations for resilience
+dd18eb5b 2025-11-24 palanisd Merge pull request #3 from netbrah/copilot/fix-overlap-tokens-validation
+8835fc24 2025-11-24 copilot-swe-agent[bot] Improve edge case handling for max_tokens=1
+1d6ea0c5 2025-11-24 copilot-swe-agent[bot] Fix chunking infinite loop when overlap_tokens >= max_tokens
+e136da96 2025-11-24 copilot-swe-agent[bot] Initial plan
+c233da63 2025-11-23 palanisd Update copilot-setup-steps.yml
+a05bbf10 2025-11-22 netbrah Add Cohere reranker config, chunking, and tests
+1b0413ee 2025-11-22 palanisd Create copilot-setup-steps.yml
+16eb0d5b 2025-11-23 chaohuang-ai Merge pull request #2409 from HKUDS/chaohuang-ai-patch-3
+37178462 2025-11-23 chaohuang-ai Update README.md
+6d3bfe46 2025-11-23 chaohuang-ai Merge pull request #2408 from HKUDS/chaohuang-ai-patch-2
+babbcb56 2025-11-23 chaohuang-ai Update README.md
+5f53de88 2025-11-22 yangdx Fix Azure configuration examples and correct typos in env.example
+fa6797f2 2025-11-22 yangdx Update env.example
+49fb11e2 2025-11-22 yangdx Update Azure OpenAI configuration examples
+7b762110 2025-11-22 yangdx Add fallback to AZURE_OPENAI_API_VERSION for embedding API version
+ffd8da51 2025-11-21 yangdx Improve Azure OpenAI compatibility and error handling
+fafa1791 2025-11-21 yangdx Fix Azure OpenAI model parameter to use deployment name consistently
+021b637d 2025-11-21 Daniel.y Merge pull request #2403 from danielaskdd/azure-cot-handling
+ac9f2574 2025-11-21 yangdx Improve Azure OpenAI wrapper functions with full parameter support
+45f4f823 2025-11-21 yangdx Refactor Azure OpenAI client creation to support client_configs merging
+0c4cba38 2025-11-21 yangdx Fix double decoration in azure_openai_embed and document decorator usage
+b46c1523 2025-11-21 yangdx Fix linting
+b709f8f8 2025-11-21 yangdx Consolidate Azure OpenAI implementation into main OpenAI module
+66d6c7dd 2025-11-21 yangdx Refactor main function to provide sync CLI entry point
+8777895e 2025-11-21 Daniel.y Merge pull request #2401 from danielaskdd/fix-openai-keyword-extraction
+1e477e95 2025-11-21 yangdx Add lightrag-clean-llmqc console script entry point
+02fdceb9 2025-11-21 yangdx Update OpenAI client to use stable API and bump minimum version to 2.0.0
+9f69c5bf 2025-11-21 yangdx feat: Support structured output `parsed` from OpenAI
+c9e1c86e 2025-11-21 yangdx Refactor keyword extraction handling to centralize response format logic
+46ce6d9a 2025-11-20 yangdx Fix Azure OpenAI embedding model parameter fallback
+cc78e2df 2025-11-20 Daniel.y Merge pull request #2395 from Amrit75/issue-2394
+30e86fa3 2025-11-20 Amritpal Singh use deployment variable which extracted value from .env file or have default value
+ecea9399 2025-11-20 yangdx Fix lingting
+1d2f534f 2025-11-20 yangdx Fix linting
+72ece734 2025-11-20 yangdx Remove obsolete config file and paging design doc
+1e415cff 2025-11-20 yangdx Update postgreSQL docker image link
+3c85e488 2025-11-20 yangdx Update README
+d52adb64 2025-11-19 Daniel.y Merge pull request #2390 from danielaskdd/fix-pytest-logging-error
+b7de694f 2025-11-19 yangdx Add comprehensive error logging across API routes
+0fb2925c 2025-11-19 yangdx Remove ascii_colors dependency and fix stream handling errors
+f72f435c 2025-11-19 Daniel.y Merge pull request #2389 from danielaskdd/fix-chunk-size
+fec7c67f 2025-11-19 yangdx Add comprehensive chunking tests with multi-token tokenizer edge cases
+57332925 2025-11-19 yangdx Add comprehensive tests for chunking with recursive splitting
+6fea68bf 2025-11-19 yangdx Fix ChunkTokenLimitExceededError message formatting
+f988a226 2025-11-19 yangdx Add token limit validation for character-only chunking
+5cc91686 2025-11-19 yangdx Expand AGENTS.md with testing controls and automation guidelines
+af4d2a3d 2025-11-19 Daniel.y Merge pull request #2386 from danielaskdd/excel-optimization
+95cd0ece 2025-11-19 yangdx Fix DOCX table extraction by escaping special characters in cells
+87de2b3e 2025-11-19 yangdx Update XLSX extraction documentation to reflect current implementation
+0244699d 2025-11-19 yangdx Optimize XLSX extraction by using sheet.max_column instead of two-pass scan
+2b160163 2025-11-19 yangdx Optimize XLSX extraction to avoid storing all rows in memory
+ef659a1e 2025-11-19 yangdx Preserve column alignment in XLSX extraction with two-pass processing
+3efb1716 2025-11-19 yangdx Enhance XLSX extraction with structured tab-delimited format and escaping
+efbbaaf7 2025-11-19 Daniel.y Merge pull request #2383 from danielaskdd/doc-table
+e7d2803a 2025-11-19 yangdx Remove text stripping in DOCX extraction to preserve whitespace
+186c8f0e 2025-11-19 yangdx Preserve blank paragraphs in DOCX extraction to maintain spacing
+fa887d81 2025-11-19 yangdx Fix table column structure preservation in DOCX extraction
+4438ba41 2025-11-19 yangdx Enhance DOCX extraction to preserve document order with tables
+d16c7840 2025-11-18 yangdx Bump API version to 0256
+e77340d4 2025-11-18 yangdx Adjust chunking parameters to match the default environment variable settings
+24423c92 2025-11-18 yangdx Merge branch 'fix_chunk_comment'
+1bfa1f81 2025-11-18 yangdx Merge branch 'main' into fix_chunk_comment
+9c10c875 2025-11-18 yangdx Fix linting
+9109509b 2025-11-18 yangdx Merge branch 'dev-postgres-vchordrq'
+dbae327a 2025-11-18 yangdx Merge branch 'main' into dev-postgres-vchordrq
+b583b8a5 2025-11-18 yangdx Merge branch 'feature/postgres-vchordrq-indexes' into dev-postgres-vchordrq
+3096f844 2025-11-18 yangdx fix(postgres): allow vchordrq.epsilon
config when probes is empty +dacca334 2025-11-18 EightyOliveira refactor(chunking): rename params and improve docstring for chunking_by_token_size +f4bf5d27 2025-11-18 wmsnp fix: add logger to configure_vchordrq() and format code +dfbc9736 2025-11-18 Daniel.y Merge pull request #2369 from HKUDS/workspace-isolation +702cfd29 2025-11-18 yangdx Fix document deletion concurrency control and validation logic +656025b7 2025-11-18 yangdx Rename GitHub workflow from "Tests" to "Offline Unit Tests" +7e9c8ed1 2025-11-18 yangdx Rename test classes to prevent warning from pytest +4048fc4b 2025-11-18 yangdx Fix: auto-acquire pipeline when idle in document deletion +1745b30a 2025-11-18 yangdx Fix missing workspace parameter in update flags status call +f8dd2e07 2025-11-18 yangdx Fix namespace parsing when workspace contains colons +472b498a 2025-11-18 yangdx Replace pytest group reference with explicit dependencies in evaluation +a11912ff 2025-11-18 yangdx Add testing workflow guidelines to basic development rules +41bf6d02 2025-11-18 yangdx Fix test to use default workspace parameter behavior +d07023c9 2025-11-18 wmsnp feat(postgres_impl): add vchordrq vector index support and unify vector index creation logic +4ea21240 2025-11-18 yangdx Add GitHub CI workflow and test markers for offline/integration tests +4fef731f 2025-11-18 yangdx Standardize test directory creation and remove tempfile dependency +1fe05df2 2025-11-18 yangdx Refactor test configuration to use pytest fixtures and CLI options +6ae0c144 2025-11-18 yangdx test: add concurrent execution to workspace isolation test +6cef8df1 2025-11-18 yangdx Reduce log level and improve workspace mismatch message clarity +fc9f7c70 2025-11-18 yangdx Fix linting +f83b475a 2025-11-18 yangdx Remove Dependabot configuration file +21ad990e 2025-11-18 yangdx Improve workspace isolation tests with better parallelism checks and cleanup +5da82bb0 2025-11-18 yangdx Add pre-commit to pytest dependencies and format test code +99262ada 
2025-11-18 yangdx Enhance workspace isolation test with distinct mock data and persistence +b7b8d156 2025-11-17 yangdx Refactor pytest dependencies into separate optional group +1874cfaf 2025-11-17 yangdx Fix linting +3806892a 2025-11-17 Daniel.y Merge pull request #2371 from BukeLy/pytest-style-conversion +1a183702 2025-11-17 BukeLy docs: Update test file docstring to reflect all 11 test scenarios +3ec73693 2025-11-17 BukeLy test: Enhance E2E workspace isolation detection with content verification +a990c1d4 2025-11-17 BukeLy fix: Correct Mock LLM output format in E2E test +288498cc 2025-11-17 BukeLy test: Convert test_workspace_isolation.py to pytest style +ddc76f0c 2025-11-17 yangdx Merge branch 'main' into workspace-isolation +9262f66d 2025-11-17 yangdx Bump API version to 0255 +393f8803 2025-11-17 yangdx Improve LightRAG initialization checker tool with better usage docs +9d7b7981 2025-11-17 yangdx Add pipeline status validation before document deletion +98e964df 2025-11-17 yangdx Fix initialization instructions in check_lightrag_setup function +6d6716e9 2025-11-17 yangdx Add _default_workspace to shared storage finalization +cf73cb4d 2025-11-17 yangdx Remove unused variables from workspace isolation test +c1ec657c 2025-11-17 yangdx Fix linting +f1d8f18c 2025-11-17 yangdx Merge branch 'main' into workspace-isolation +3e759f46 2025-11-17 BukeLy test: Add real integration and E2E tests for workspace isolation +436e4143 2025-11-17 BukeLy test: Enhance workspace isolation test suite to 100% coverage +4742fc8e 2025-11-17 BukeLy test: Add comprehensive workspace isolation test suite for PR #2366 +cdd53ee8 2025-11-17 yangdx Remove manual initialize_pipeline_status() calls across codebase +e22ac52e 2025-11-17 yangdx Auto-initialize pipeline status in LightRAG.initialize_storages() +e8383df3 2025-11-17 yangdx Fix NamespaceLock context variable timing to prevent lock bricking +95e1fb16 2025-11-17 yangdx Remove final_namespace attribute for in-memory storage and use 
namespace in clean_llm_query_cache.py +7ed0eac4 2025-11-17 yangdx Fix workspace filtering logic in get_all_update_flags_status +78689e88 2025-11-17 yangdx Fix pipeline status namespace check to handle root case +d54d0d55 2025-11-17 yangdx Standardize empty workspace handling from "_" to "" across storage +b6a5a90e 2025-11-17 yangdx Fix NamespaceLock concurrent coroutine safety with ContextVar +fd486bc9 2025-11-17 yangdx Refactor storage classes to use namespace instead of final_namespace +01814bfc 2025-11-17 yangdx Fix missing function call parentheses in get_all_update_flags_status +7deb9a64 2025-11-17 yangdx Refactor namespace lock to support reusable async context manager +52c812b9 2025-11-17 yangdx Fix workspace isolation for pipeline status across all operations +926960e9 2025-11-17 yangdx Refactor workspace handling to use default workspace and namespace locks +acae404f 2025-11-15 yangdx Update env.example +ec05d89c 2025-11-15 yangdx Add macOS fork safety check for Gunicorn multi-worker mode +8abc2ac1 2025-11-13 Sleeep Update edge keywords extraction in graph visualization +e5addf4d 2025-11-14 yangdx Improve embedding config priority and add debug logging +2fb57e76 2025-11-14 yangdx Fix embedding token limit initialization order +6b2af2b5 2025-11-14 yangdx Refactor embedding function creation with proper attribute inheritance +f0254773 2025-11-14 yangdx Convert embedding_token_limit from property to field with __post_init__ +14a6c24e 2025-11-14 yangdx Add configurable embedding token limit with validation +f5b48587 2025-11-14 yangdx Improve Bedrock error handling with retry logic and custom exceptions +77221564 2025-11-14 yangdx Add max_token_size parameter to embedding function decorators +8283c86b 2025-11-14 yangdx Refactor exception handling in MemgraphStorage label methods +423e4e92 2025-11-14 yangdx Fix null reference errors in graph database error handling +2f2f35b8 2025-11-13 yangdx Add macOS compatibility check for DOCLING with multi-worker Gunicorn 
+c246eff7 2025-11-13 yangdx Improve docling integration with macOS compatibility and CLI flag +63510478 2025-11-13 yangdx Improve error handling and logging in cloud model detection +67dfd856 2025-11-13 LacombeLouis Add a better regex +5127bf20 2025-11-12 Louis Lacombe Add support for environment variable fallback for API key and default host for cloud models +fa9206d6 2025-11-13 yangdx Update uv.lock +7b7f93d7 2025-11-13 yangdx Implement lazy configuration initialization for API server +69a0b74c 2025-11-13 yangdx refactor: move document deps to api group, remove dynamic imports +7d394fb0 2025-11-13 yangdx Replace asyncio.iscoroutine with inspect.isawaitable for better detection +72f68c2a 2025-11-13 yangdx Update env.example +a08bc726 2025-11-12 yangdx Fix empty dict handling after JSON sanitization +cca0800e 2025-11-12 yangdx Fix migration to reload sanitized data and prevent memory corruption +7f54f470 2025-11-12 yangdx Optimize JSON string sanitization with precompiled regex and zero-copy +f289cf62 2025-11-12 yangdx Optimize JSON write with fast/slow path to reduce memory usage +93a3e471 2025-11-12 yangdx Remove deprecated response_type parameter from query settings +abeaac84 2025-11-12 yangdx Improve JSON data sanitization to handle tuples and dict keys +5885637e 2025-11-12 yangdx Add specialized JSON string sanitizer to prevent UTF-8 encoding errors +23cbb9c9 2025-11-12 yangdx Add data sanitization to JSON writing to prevent UTF-8 encoding errors +ff8f1588 2025-11-11 yangdx Update env.example +c434879c 2025-11-11 yangdx Replace PyPDF2 with pypdf for PDF processing +af542391 2025-11-13 yangdx Support async chunking functions in LightRAG processing pipeline +50160254 2025-11-10 Tong Da easier version: detect chunking_func result is coroutine or not +77405006 2025-11-09 Tong Da support async chunking func to improve processing performance when a heavy `chunking_func` is passed in by user +18a48702 2025-11-15 BukeLy fix: Add default workspace support for backward 
compatibility +eb52ec94 2025-11-13 BukeLy feat: Add workspace isolation support for pipeline status +8bb54833 2025-11-17 Daniel.y Merge pull request #2368 from danielaskdd/milvus-vector-batching +90f52acf 2025-11-17 yangdx Fix linting +c13f9116 2025-11-17 yangdx Add embedding dimension validation to EmbeddingFunc wrapper +3b76eea2 2025-11-15 Daniel.y Merge pull request #2359 from danielaskdd/embedding-limit +87221035 2025-11-15 yangdx Update env.example +b5589ce4 2025-11-15 yangdx Merge branch 'main' into embedding-limit +9a2ddcee 2025-11-15 Daniel.y Merge pull request #2360 from danielaskdd/macos-gunicorn-numpy +4343db75 2025-11-15 yangdx Add macOS fork safety check for Gunicorn multi-worker mode +c6850ac5 2025-11-14 Daniel.y Merge pull request #2358 from sleeepyin/main +5dec4dea 2025-11-14 yangdx Improve embedding config priority and add debug logging +de4412dd 2025-11-14 yangdx Fix embedding token limit initialization order +963a0a5d 2025-11-14 yangdx Refactor embedding function creation with proper attribute inheritance +39b49e92 2025-11-14 yangdx Convert embedding_token_limit from property to field with __post_init__ +ab4d7ac2 2025-11-14 yangdx Add configurable embedding token limit with validation +680e36c6 2025-11-14 yangdx Improve Bedrock error handling with retry logic and custom exceptions +05852e1a 2025-11-14 yangdx Add max_token_size parameter to embedding function decorators +b88d7854 2025-11-14 Sleeep Merge branch 'HKUDS:main' into main +399a23c3 2025-11-14 Daniel.y Merge pull request #2356 from danielaskdd/improve-error-handling +4401f86f 2025-11-14 yangdx Refactor exception handling in MemgraphStorage label methods +1ccef2b9 2025-11-14 yangdx Fix null reference errors in graph database error handling +c164c8f6 2025-11-13 yangdx Merge branch 'main' of github.com:HKUDS/LightRAG +18893015 2025-11-13 yangdx Merge branch 'feat/add_cloud_ollama_support' +77ad906d 2025-11-13 yangdx Improve error handling and logging in cloud model detection +28fba19b 
2025-11-13 Daniel.y Merge pull request #2352 from danielaskdd/docling-gunicorn-multi-worker +cc031a3d 2025-11-13 yangdx Add macOS compatibility check for DOCLING with multi-worker Gunicorn +844537e3 2025-11-13 LacombeLouis Add a better regex +a24d8181 2025-11-13 yangdx Improve docling integration with macOS compatibility and CLI flag +76adde38 2025-11-13 Daniel.y Merge pull request #2351 from danielaskdd/lazy-config-loading +89e63aa4 2025-11-13 Sleeep Update edge keywords extraction in graph visualization +e6588f91 2025-11-13 yangdx Update uv.lock +746c069a 2025-11-13 yangdx Implement lazy configuration initialization for API server +470e2fd1 2025-11-13 Daniel.y Merge pull request #2350 from danielaskdd/reduce-dynamic-import +4b31942e 2025-11-13 yangdx refactor: move document deps to api group, remove dynamic imports +87659744 2025-11-13 yangdx Merge branch 'tongda/main' +c230d1a2 2025-11-13 yangdx Replace asyncio.iscoroutine with inspect.isawaitable for better detection +297e4607 2025-11-13 yangdx Merge branch 'main' into tongda/main +940bec0b 2025-11-13 yangdx Support async chunking functions in LightRAG processing pipeline +343d3072 2025-11-13 yangdx Update env.example +f7432a26 2025-11-12 Louis Lacombe Add support for environment variable fallback for API key and default host for cloud models +075399ff 2025-11-12 Daniel.y Merge pull request #2346 from danielaskdd/optimize-json-sanitization +70cc2419 2025-11-12 yangdx Fix empty dict handling after JSON sanitization +dcf1d286 2025-11-12 yangdx Fix migration to reload sanitized data and prevent memory corruption +6de4123f 2025-11-12 yangdx Optimize JSON string sanitization with precompiled regex and zero-copy +777c9873 2025-11-12 yangdx Optimize JSON write with fast/slow path to reduce memory usage +477c3f54 2025-11-12 Daniel.y Merge pull request #2345 from danielaskdd/remove-response-type +8c07c918 2025-11-12 yangdx Remove deprecated response_type parameter from query settings +69ca3662 2025-11-12 Daniel.y Merge 
pull request #2344 from danielaskdd/fix-josn-serialization-error +f28a0c25 2025-11-12 yangdx Improve JSON data sanitization to handle tuples and dict keys +6918a88f 2025-11-12 yangdx Add specialized JSON string sanitizer to prevent UTF-8 encoding errors +d1f4b6e5 2025-11-12 yangdx Add data sanitization to JSON writing to prevent UTF-8 encoding errors +1ffb5338 2025-11-11 yangdx Update env.example +5a6bb658 2025-11-11 Daniel.y Merge pull request #2338 from danielaskdd/migrate-to-pypdf +fdcb4d0b 2025-11-11 yangdx Replace PyPDF2 with pypdf for PDF processing +245df75d 2025-11-10 Tong Da easier version: detect chunking_func result is coroutine or not +e8f5f57e 2025-11-10 yangdx Update qdrant-client minimum version from 1.7.0 to 1.11.0 +913fa1e4 2025-11-09 yangdx Add concurrency warning for JsonKVStorage in cleanup tool +d137ba58 2025-11-09 Tong Da support async chunking func to improve processing performance when a heavy `chunking_func` is passed in by user +1f9d0735 2025-11-09 yangdx Bump API version to 0253 +3110ca51 2025-11-09 Daniel.y Merge pull request #2335 from danielaskdd/llm-cache-cleanup +37b71189 2025-11-09 yangdx Fix table alignment and add validation for empty cleanup selections +1485cb82 2025-11-09 yangdx Add LLM query cache cleanup tool for KV storage backends +8859eaad 2025-11-09 Daniel.y Merge pull request #2334 from danielaskdd/hotfix-opena-streaming +2f160652 2025-11-09 yangdx Refactor keyword_extraction from kwargs to explicit parameter +88ab73f6 2025-11-09 yangdx HotFix: Restore streaming response in OpenAI LLM +c12bc372 2025-11-09 yangdx Update README +7bc6ccea 2025-11-09 yangdx Add uv package manager support to installation docs +80f2e691 2025-11-09 yangdx Remove redundant i18n import triggered the Vite “dynamic + static import” warning +1334b3d8 2025-11-09 yangdx Update uv.lock +754d2ad2 2025-11-09 yangdx Add documentation for LLM cache migration between storage types +8adf3180 2025-11-09 Daniel.y Merge pull request #2330 from 
danielaskdd/llm-cache-migrate +a75efb06 2025-11-09 yangdx Fix: prevent source data corruption by target upsert function +987bc09c 2025-11-08 yangdx Update LLM cache migration docs and improve UX prompts +1a91bcdb 2025-11-08 yangdx Improve storage config validation and add config.ini fallback support +57ee7d5a 2025-11-08 yangdx Merge branch 'main' into llm-cache-migrate +85bb98b3 2025-11-08 Daniel.y Merge pull request #2331 from danielaskdd/gemini-retry +3d9de5ed 2025-11-08 yangdx feat: improve Gemini client error handling and retry logic +1864b282 2025-11-08 yangdx Add colored output formatting to migration confirmation display +e95b02fb 2025-11-08 yangdx Refactor storage selection UI with dynamic numbering and inline prompts +b72632e4 2025-11-08 yangdx Add async generator lock management rule to cline extension +5be04263 2025-11-08 yangdx Fix deadlock in JSON cache migration and prevent same storage selection +6b9f13c7 2025-11-08 yangdx Enhance LLM cache migration tool with streaming and improved UX +d0d31e92 2025-11-08 yangdx Improve LLM cache migration tool configuration and messaging +6fc54d36 2025-11-08 yangdx Move LLM cache migration tool to lightrag.tools module +0f2c0de8 2025-11-08 yangdx Fix linting +55274dde 2025-11-08 yangdx Add LLM cache migration tool for KV storage backends +cf732dbf 2025-11-08 yangdx Bump core version to 1.4.9.9 and API to 0252 +29a349f2 2025-11-08 Daniel.y Merge pull request #2329 from danielaskdd/gemini-embedding +a624a950 2025-11-08 yangdx Add Gemini to APIs requiring embedding dimension parameter +de4ed736 2025-11-08 yangdx Add Gemini embedding support +f4492d48 2025-11-08 Daniel.y Merge pull request #2328 from HKUDS/apply-dim-to-embedding-call +f83ea339 2025-11-08 yangdx Add section header comment for Gemini binding options +0b2a15c4 2025-11-08 yangdx Centralize embedding_send_dim config through args instead of env var +03cc6262 2025-11-08 yangdx Prohibit direct access to internal functions of EmbeddingFunc. 
+ffeeae42 2025-11-07 yangdx refactor: simplify jina embedding dimension handling +9cee5a63 2025-11-07 yangdx Merge branch 'main' into apply-dim-to-embedding-call +01b07b2b 2025-11-07 yangdx Refactor Jina embedding dimension by changing param to optional with default +d5362573 2025-11-07 Daniel.y Merge pull request #2327 from huangbhan/patch-1 +d95efcb9 2025-11-07 yangdx Fix linting +ce28f30c 2025-11-07 yangdx Add embedding_dim parameter support to embedding functions +c14f25b7 2025-11-07 yangdx Add mandatory dimension parameter handling for Jina API compliance +d8a6355e 2025-11-07 yangdx Merge branch 'main' into apply-dim-to-embedding-call +33a1482f 2025-11-07 yangdx Add optional embedding dimension parameter control via env var +5c0ced6e 2025-11-07 domices Fix spelling errors in the "使用PostgreSQL存储" section of README-zh.md +73284623 2025-11-07 Daniel.y Merge pull request #2326 from danielaskdd/gemini-cot +c580874a 2025-11-07 yangdx Remove depreced sample code +924c8cb8 2025-11-07 yangdx Merge branch 'main' into gemini-cot +fc40a369 2025-11-07 yangdx Add timeout support to Gemini LLM and improve parameter handling +3cb4eae4 2025-11-07 yangdx Add Chain of Thought support to Gemini LLM integration +6686edfd 2025-11-07 yangdx Update Gemini LLM options: add seed and thinking config, remove MIME type +d94aae9c 2025-11-07 Yasiru Rangana Add dimensions parameter support to openai_embed() +8c275553 2025-11-07 yangdx Fix Gemini response parsing to avoid warnings from non-text parts +366a1e0f 2025-11-07 Daniel.y Merge pull request #2322 from danielaskdd/fix-delete +ea141e27 2025-11-07 yangdx Fix: Remove redundant entity/relation chunk deletions +5bcd2926 2025-11-06 yangdx Bump API version to 0251 +9d0012b0 2025-11-06 Daniel.y Merge pull request #2321 from danielaskdd/fix-doc-del-slow +04ed709b 2025-11-06 yangdx Optimize entity deletion by batching edge queries to avoid N+1 problem +678e17bb 2025-11-06 yangdx Revert "fix(ui): Remove dynamic import for i18n in settings store" 
+f6a0ea3a 2025-11-06 Daniel.y Merge pull request #2320 from danielaskdd/fix-postgres +3276b7a4 2025-11-06 yangdx Fix linting +155f5975 2025-11-06 yangdx Fix node ID normalization and improve batch operation consistency +edf48d79 2025-11-06 Daniel.y Merge pull request #2319 from danielaskdd/remove-deprecated-code +36501b82 2025-11-06 yangdx Initialize shared storage for all graph storage types in graph unit test +0c47d1a2 2025-11-06 yangdx Fix linting +f3b2ba81 2025-11-06 yangdx Translate graph storage test from Chinese to English +a790f081 2025-11-06 yangdx Refine gitignore to only exclude root-level test files +6b0f9795 2025-11-06 yangdx Add workspace parameter and remove chunk-based query unit tests +807d2461 2025-11-06 yangdx Remove unused chunk-based node/edge retrieval methods +831e658e 2025-11-06 yangdx Update readme +0216325e 2025-11-06 yangdx fix(ui): Remove dynamic import for i18n in settings store +6e36ff41 2025-11-06 yangdx Fix linting +775933aa 2025-11-06 yangdx Merge branch 'VOXWAVE-FOUNDRY/main' +5f49cee2 2025-11-06 yangdx Merge branch 'main' into VOXWAVE-FOUNDRY/main +b0d44d28 2025-11-06 yangdx Add Langfuse observability integration documentation +bd62bb30 2025-11-05 Daniel.y Merge pull request #2314 from danielaskdd/ragas +9c057060 2025-11-05 yangdx Add separate endpoint configuration for LLM and embeddings in evaluation +994a82dc 2025-11-05 yangdx Suppress token usage warnings for custom OpenAI-compatible endpoints +d803df94 2025-11-05 yangdx Fix linting +451257ae 2025-11-05 yangdx Doc: Update news with recent features +f490622b 2025-11-05 yangdx Doc: Refactor evaluation README to improve clarity and structure +a73314a4 2025-11-05 yangdx Refactor evaluation results display and logging format +06b91d00 2025-11-05 yangdx Improve RAG evaluation progress eval index display with zero padding +eb80771f 2025-11-05 Daniel.y Merge pull request #2311 from danielaskdd/evalueate-cli +2823f92f 2025-11-05 yangdx Fix tqdm progress bar conflicts in concurrent RAG 
evaluation +e5abe9dd 2025-11-05 yangdx Restructure semaphore control to manage entire evaluation pipeline +83715a3a 2025-11-05 yangdx Implement two-stage pipeline for RAG evaluation with separate semaphores +d36be1f4 2025-11-05 yangdx Improve RAGAS evaluation progress tracking and clean up output handling +c358f405 2025-11-04 yangdx Update evaluation defaults and expand sample dataset +41c26a36 2025-11-04 yangdx feat: add command-line args to RAG evaluation script +a618f837 2025-11-04 yangdx Merge branch 'new/ragas-evaluation' +d4b8a229 2025-11-04 yangdx Update RAGAS evaluation to use gpt-4o-mini and improve compatibility +6d61f70b 2025-11-04 yangdx Clean up RAG evaluator logging and remove excessive separator lines +4e4b8d7e 2025-11-04 yangdx Update RAG evaluation metrics to use class instances instead of objects +7abc6877 2025-11-04 yangdx Add comprehensive configuration and compatibility fixes for RAGAS +72db0426 2025-11-04 yangdx Update .env loading and add API authentication to RAG evaluator +ad2d3c2c 2025-11-03 anouarbm Merge remote-tracking branch 'origin/main' into feat/ragas-evaluation +2fdb5f5e 2025-11-03 anouarbm chore: trigger CI re-run 2 +36bffe22 2025-11-03 anouarbm chore: trigger CI re-run +debfa0ec 2025-11-03 anouarbm Merge branch 'feat/ragas-evaluation' of https://github.com/anouar-bm/LightRAG into feat/ragas-evaluation +a172cf89 2025-11-03 anouarbm feat(evaluation): Add sample documents for reproducible RAGAS testing +10f6e695 2025-11-03 yangdx Improve Langfuse integration and stream response cleanup handling +5da709b4 2025-11-03 ben moussa anouar Merge branch 'main' into feat/ragas-evaluation +36694eb9 2025-11-03 anouarbm fix(evaluation): Move import-time validation to runtime and improve documentation +6975e69e 2025-11-03 Daniel.y Merge pull request #2298 from anouar-bm/feat/langfuse-observability +e0966b65 2025-11-03 yangdx Add BuildKit cache mounts to optimize Docker build performance +9495778c 2025-11-03 anouarbm refactor: reorder Langfuse 
import logic for improved clarity +c9e1c6c1 2025-11-03 anouarbm fix(api): change content field to list in query responses +9d69e8d7 2025-11-03 anouarbm fix(api): Change content field from string to list in query responses +7b8223da 2025-11-03 yangdx Update env.example with host/endpoint clarifications for LLM/embedding +363f3051 2025-11-02 anouarbm eval using open ai +77db0803 2025-11-02 anouarbm Merge remote-tracking branch 'lightrag-fork/feat/ragas-evaluation' into feat/ragas-evaluation +0b5e3f9d 2025-11-02 anouarbm Use logger in RAG evaluation and optimize reference content joins +98f0464a 2025-11-02 ben moussa anouar Update lightrag/evaluation/eval_rag_quality.py for launguage +963ad4c6 2025-11-02 anouarbm docs: Add documentation and examples for include_chunk_content parameter +0bbef981 2025-11-02 anouarbm Optimize RAGAS evaluation with parallel execution and chunk content enrichment +026bca00 2025-11-02 anouarbm fix: Use actual retrieved contexts for RAGAS evaluation +b12b693a 2025-11-02 anouarbm fixed ruff format of csv path +5cdb4b0e 2025-11-02 anouarbm fix: Apply ruff formatting and rename test_dataset to sample_dataset +aa916f28 2025-11-01 anouarbm docs: add generic test_dataset.json for evaluation examples Test cases with generic examples about: - LightRAG framework features and capabilities - RAG system architecture and components - Vector database support (ChromaDB, Neo4j, Milvus, etc.) - LLM provider integrations (OpenAI, Anthropic, Ollama, etc.) 
- RAG evaluation metrics explanation - Deployment options (Docker, FastAPI, direct integration) - Knowledge graph-based retrieval concepts +626b42bc 2025-11-01 anouarbm feat: add optional Langfuse observability integration +1ad0bf82 2025-11-01 anouarbm feat: add RAGAS evaluation framework for RAG quality assessment +ece0398d 2025-11-01 Daniel.y Merge pull request #2296 from danielaskdd/pdf-decryption +61b57cbb 2025-11-01 yangdx Add PDF decryption support for password-protected files +728721b1 2025-11-01 yangdx Remove redundant separator lines in gunicorn shutdown handler +6d4a5510 2025-11-01 yangdx Remove redundant shutdown message from gunicorn +bc8a8842 2025-11-01 Daniel.y Merge pull request #2295 from danielaskdd/mix-query-without-kg +ec2ea4fd 2025-11-01 yangdx Rename function and variables for clarity in context building +9a8742da 2025-10-31 yangdx Improve entity merge logging by removing redundant message and fixing typo +6b4514c8 2025-10-31 yangdx Reduce logging verbosity in entity merge relation processing +2496d871 2025-10-31 yangdx Add data/ directory to .gitignore +7ccc1fdd 2025-10-31 yangdx Add frontend rebuild warning indicator to version display +e5414c61 2025-10-31 yangdx Bump core version to 1.4.9.8 and API version to 0250 +08b0283b 2025-10-31 Daniel.y Merge pull request #2291 from danielaskdd/reload-popular-labels +58c83f9d 2025-10-31 yangdx Add auto-refresh of popular labels when pipeline completes +94cdbe77 2025-10-31 Daniel.y Merge pull request #2290 from danielaskdd/delete-residual-edges +afb5e5c1 2025-10-31 yangdx Fix edge cleanup when deleting entities to prevent orphaned relationships +3b48cf16 2025-10-31 Daniel.y Merge pull request #2289 from danielaskdd/fix-pycrptodome-missing +c46c1b26 2025-10-31 yangdx Add pycryptodome dependency for PDF encryption support +bda52a87 2025-10-31 Daniel.y Merge pull request #2287 from danielaskdd/fix-ui +71b27ec4 2025-10-31 yangdx Optimize property edit dialog to use trimmed value consistently +4cbd8761 
2025-10-31 yangdx feat: Update node color and legent after entity_type changed +79a17c3f 2025-10-30 yangdx Fix graph value handling for entity_id updates +c36afecb 2025-10-30 yangdx Remove redundant await call in file extraction pipeline +c9e73bb4 2025-10-30 yangdx Bump core version to 1.4.9.7 and API version to 0249 +042cbad0 2025-10-30 yangdx Merge branch 'qdrant-multi-tenancy' +5f4a2804 2025-10-30 yangdx Add Qdrant legacy collection migration with workspace support +0498e80a 2025-10-30 yangdx Merge branch 'main' into qdrant-multi-tenancy +78ccc4f6 2025-10-30 yangdx Refactor .gitignore +783e2f3b 2025-10-30 yangdx Update uv.lock +f610fdaf 2025-10-30 yangdx Merge branch 'main' into Anush008/main +8145201d 2025-10-30 Daniel.y Merge pull request #2284 from danielaskdd/fix-static-missiing +16d3d82a 2025-10-30 yangdx Include static files in package distribution +8af8bd80 2025-10-29 yangdx docs: add frontend build steps to server installation guide +0fa2fc9c 2025-10-29 yangdx Refactor systemd service config to use environment variables +6dc027cb 2025-10-29 yangdx Merge branch 'fix-exit-handler' +a1cf01dc 2025-10-29 Daniel.y Merge pull request #2280 from danielaskdd/fix-exit-handler +c5ad9982 2025-10-29 Daniel.y Merge pull request #2281 from danielaskdd/restore-query-example +14a015d4 2025-10-29 yangdx Restore query generation example and fix README path reference +3a7f7535 2025-10-29 yangdx Bump core version to 1.4.9.6 and API version to 0248 +d5bcd14c 2025-10-29 yangdx Refactor service deployment to use direct process execution +6489aaa7 2025-10-29 yangdx Remove worker_exit hook and improve cleanup logging +4a46d39c 2025-10-29 yangdx Replace GUNICORN_CMD_ARGS with custom LIGHTRAG_GUNICORN_MODE flag +816feefd 2025-10-29 yangdx Fix cleanup coordination between Gunicorn and UvicornWorker lifecycles +72b29659 2025-10-29 yangdx Fix worker process cleanup to prevent shared resource conflicts +0692175c 2025-10-29 yangdx Remove enable_logging parameter from get_data_init_lock 
call in MilvusVectorDBStorage +ec797276 2025-10-29 Daniel.y Merge pull request #2279 from danielaskdd/fix-edge-merge-stage +ee7c683f 2025-10-29 yangdx Fix swagger docs page problem in dev mode +54c48dce 2025-10-29 yangdx Fix z-index layering for GraphViewer UI panels +da2e9efd 2025-10-29 yangdx Bump API version to 0247 +3fa79026 2025-10-29 yangdx Fix Entity Source IDs Tracking Problem +29c4a91d 2025-10-28 yangdx Move relationship ID sorting to before vector DB operations +c81a56a1 2025-10-28 yangdx Fix entity and relationship deletion when no chunk references remain +4bf41abe 2025-10-28 Daniel.y Merge pull request #2272 from HKUDS/dependabot/pip/redis-gte-5.0.0-and-lt-8.0.0 +d0be68c8 2025-10-28 Daniel.y Merge pull request #2273 from danielaskdd/static-docs +af6aff33 2025-10-28 Daniel.y Merge pull request #2266 from danielaskdd/merge-entity +f81dd4e7 2025-10-27 dependabot[bot] Update redis requirement from <7.0.0,>=5.0.0 to >=5.0.0,<8.0.0 +88d12bea 2025-10-28 yangdx Add offline Swagger UI support with custom static file serving +b32b2e8b 2025-10-28 yangdx Refactor merge dialog and improve search history sync +ea006bd3 2025-10-28 yangdx Fix entity update logic to handle renaming operations +5155edd8 2025-10-27 yangdx feat: Improve entity merge and edit UX +97034f06 2025-10-27 yangdx Add allow_merge parameter to entity update API endpoint +11a1631d 2025-10-27 yangdx Refactor entity edit and merge functions to support merge-on-rename +411e92e6 2025-10-27 yangdx Fix vector deletion logging to show actual deleted count +94f24a66 2025-10-27 yangdx Bump API version to 0246 +25f829ef 2025-10-27 yangdx Enable editing of entity_type field in node properties +8dfd3bf4 2025-10-27 yangdx Replace global graph DB lock with fine-grained keyed locking +2c09adb8 2025-10-27 yangdx Add chunk tracking support to entity merge functionality +a25003c3 2025-10-27 yangdx Fix relation deduplication logic and standardize log message prefixes +ab32456a 2025-10-27 yangdx Refactor entity merging 
with unified attribute merge function +38559373 2025-10-26 yangdx Fix entity merging to include target entity relationships +69b4cda2 2025-10-26 Daniel.y Merge pull request #2265 from danielaskdd/edit-kg-new +6015e8bc 2025-10-26 yangdx Refactor graph utils to use unified persistence callback +a3370b02 2025-10-26 yangdx Add chunk tracking cleanup to entity/relation deletion and creation +bf1897a6 2025-10-26 yangdx Normalize entity order for undirected graph consistency +3fbd704b 2025-10-26 yangdx Enhance entity/relation editing with chunk tracking synchronization +8584980e 2025-10-26 Anush008 refactor: Qdrant Multi-tenancy (Include staged) +11f1f366 2025-10-25 Daniel.y Merge pull request #2262 from danielaskdd/sort-edge +3ad4f12f 2025-10-25 Daniel.y Merge pull request #2259 from danielaskdd/data-migration-problem +29bf5936 2025-10-25 yangdx Fix entity and relation chunk cleanup in deletion pipeline +5ee9a2f8 2025-10-25 yangdx Fix entity consistency in knowledge graph rebuilding and merging +a97e5dad 2025-10-25 yangdx Optimize PostgreSQL graph queries to avoid Cypher overhead and complexity +a9bc3484 2025-10-25 yangdx Remove enable_logging parameter from data init lock call +c82485d9 2025-10-25 Daniel.y Merge pull request #2253 from Mobious/main +97a2ee4e 2025-10-25 yangdx Rename rebuild function name and improve relationship logging format +083b163c 2025-10-25 yangdx Improve lock logging with consistent messaging and debug levels +e2ec1cdc 2025-10-25 Daniel.y Merge pull request #2258 from danielaskdd/pipeline-cancelllation +3eb3a075 2025-10-25 yangdx Bump core version to 1.4.9.5 and API version to 0245 +9ed19695 2025-10-25 yangdx Remove separate retry button and merge functionality into scan button +81e3496a 2025-10-25 yangdx Add confirmation dialog for pipeline cancellation +2476d6b7 2025-10-25 yangdx Simplify pipeline status dialog by consolidating message sections +a9ec15e6 2025-10-25 yangdx Resolve lock leakage issue during user cancellation handling +77336e50 
2025-10-24 yangdx Improve error handling and add cancellation checks in pipeline +f89b5ab1 2025-10-24 yangdx Add pipeline cancellation feature with UI and i18n support +78ad8873 2025-10-24 yangdx Add cancellation check in delete loop +743aefc6 2025-10-24 yangdx Add pipeline cancellation feature for graceful processing termination +f24a2616 2025-10-23 Mobious Allow users to provide keywords with QueryRequest +6a29b5da 2025-10-23 yangdx Update Docker deployment comments for LLM and embedding hosts +fdf0fe04 2025-10-22 yangdx Bump API version to 0244 +0fa9a2ee 2025-10-22 yangdx Fix dimension type comparison in Milvus vector field validation +06533fdb 2025-10-22 Daniel.y Merge pull request #2248 from danielaskdd/preprocess-rayanything +8dc23eef 2025-10-22 yangdx Fix RayAnything compatible problem +00aa5e53 2025-10-22 yangdx Improve entity identifier truncation warning message format +cf2174b9 2025-10-22 Daniel.y Merge pull request #2245 from danielaskdd/entity-name-len +c92ab837 2025-10-22 yangdx Fix linting +3ba1d75c 2025-10-22 Daniel.y Merge pull request #2243 from xiaojunxiang2023/main +904b1f46 2025-10-22 yangdx Add entity name length truncation with configurable limit +20edd329 2025-10-22 Daniel.y Merge pull request #2244 from danielaskdd/del-doc-cache +b76350a3 2025-10-22 yangdx Fix linting +d7e2527e 2025-10-22 yangdx Handle cache deletion errors gracefully instead of raising exceptions +1101562e 2025-10-22 yangdx Bump API version to 0243 +162370b6 2025-10-22 yangdx Add optional LLM cache deletion when deleting documents +d392db7b 2025-10-22 Daniel.y Fix typo in 'equipment' in prompt.py +04d9fe02 2025-10-22 xiaojunxiang Merge branch 'HKUDS:main' into main +9e5004e2 2025-10-22 xiaojunxiang fix(docs): correct typo "acivate" → "activate" +90720471 2025-10-21 Daniel.y Merge pull request #2237 from yrangana/feat/optimize-postgres-initialization +ef4acf53 2025-10-21 dependabot[bot] Update pandas requirement from <2.3.0,>=2.0.0 to >=2.0.0,<2.4.0 +175ef459 2025-10-21 
Daniel.y Merge pull request #2238 from HKUDS/dependabot/pip/openai-gte-1.0.0-and-lt-3.0.0 +aee0afdd 2025-10-21 Daniel.y Merge pull request #2240 from danielaskdd/limit-vdb-metadata-size +a809245a 2025-10-21 yangdx Preserve file path order by using lists instead of sets +fe890fca 2025-10-21 yangdx Improve formatting of limit method info in rebuild functions +88a45523 2025-10-21 yangdx Increase default max file paths from 30 to 100 and improve documentation +e5e16b7b 2025-10-21 yangdx Fix Redis data migration error +3ed2abd8 2025-10-21 yangdx Improve logging to show source ID ratios when skipping entities/edges +3ad616be 2025-10-21 yangdx Change default source IDs limit method from KEEP to FIFO +80668aae 2025-10-21 yangdx Improve file path truncation labels and UI consistency +be3d274a 2025-10-21 yangdx Refactor node and edge merging logic with improved code structure +a5253244 2025-10-21 yangdx Simplify skip logging and reduce pipeline status updates +1248b3ab 2025-10-21 yangdx Increase default limits for source IDs and file paths in metadata +cd1c48be 2025-10-21 yangdx Standardize placeholder format to use colon separator consistently +019dff52 2025-10-21 yangdx Update truncation message format in properties tooltip +1154c568 2025-10-21 yangdx Refactor deduplication calculation and remove unused variables +665f60b9 2025-10-21 yangdx Refactor entity/relation merge to consolidate VDB operations within functions +74694214 2025-10-20 dependabot[bot] Update openai requirement from <2.0.0,>=1.0.0 to >=1.0.0,<3.0.0 +e01c998e 2025-10-20 yangdx Track placeholders in file paths for accurate source count display +637b850e 2025-10-20 yangdx Add truncation indicator and update property labels in graph view +2f22336a 2025-10-21 Yasiru Rangana Optimize PostgreSQL initialization performance +e0fd31a6 2025-10-20 yangdx Fix logging message formatting +a9fec267 2025-10-20 yangdx Add file path limit configuration for entities and relations +0b3d3150 2025-10-20 Humphry extended to use 
gemini, sswitched to use gemini-flash-latest +dc62c78f 2025-10-20 yangdx Add entity/relation chunk tracking with configurable source ID limits +bdadaa67 2025-10-18 yangdx Merge branch 'main' into limit-vdb-metadata-size +c0f69395 2025-10-18 yangdx Merge branch 'security/fix-sql-injection-postgres' +813f4af9 2025-10-18 yangdx Fix linting +012aaada 2025-10-18 yangdx Update Swagger API key status description text +917e41aa 2025-10-17 Lucky Verma Refactor SQL queries and improve input handling in PGKVStorage and PGDocStatusStorage +03333d63 2025-10-17 yangdx Merge branch 'main' into limit-vdb-metadata-size +6b37d3ca 2025-10-17 yangdx Merge branch 'feat-entity-size-caps' into limit-vdb-metadata-size +8070d0cf 2025-10-17 Daniel.y Merge pull request #2234 from danielaskdd/fix-webui +7bf9d1e8 2025-10-17 yangdx Bump API version to 0241 +dab1c358 2025-10-17 yangdx Optimize chat performance by reducing animations in inactive tabs +04d23671 2025-10-17 yangdx Fix redoc access problem in front-end dev mode +4c3ab584 2025-10-17 yangdx Improve AsyncSelect layout and text overflow handling +f5558240 2025-10-17 yangdx Fix tuple delimiter corruption handling in regex patterns +17c2a929 2025-10-15 DivinesLight Get max source Id config from .env and lightRAG init +4e740af7 2025-10-14 haseebuchiha Import from env and use default if none and removed useless import +7871600d 2025-10-14 DivinesLight Quick fix to limit source_id ballooning while inserting nodes +46ac5dac 2025-10-17 yangdx Improve API description formatting and add ReDoc link +9f49e56a 2025-10-17 yangdx Merge branch 'main' into feat-entity-size-caps +c0a87ca7 2025-10-17 yangdx Merge branch 'remove-dotenv' +1642710f 2025-10-17 yangdx Remove dotenv dependency from project +06ed2d06 2025-10-17 yangdx Merge branch 'main' into remove-dotenv +c18762e3 2025-10-17 yangdx Simplify Docker deployment documentation and improve clarity +f45dce34 2025-10-17 yangdx Fix cache control error of index.html +53240041 2025-10-17 Won-Kyu Park 
remove deprecated dotenv package. +35cd567c 2025-10-17 yangdx Allow related chunks missing in knowledge graph queries +0e0b4a94 2025-10-16 yangdx Improve Docker build workflow with automated multi-arch script and docs +efd50064 2025-10-16 yangdx docs: improve Docker build documentation with clearer notes +daeca17f 2025-10-16 yangdx Change default docker image to offline version +c61b7bd4 2025-10-16 yangdx Remove torch and transformers from offline dependency groups +8e3497dc 2025-10-16 yangdx Update comments +ef6ed429 2025-10-16 yangdx Optimize Docker builds with layer caching and add pip for runtime installs +8cc8bbf4 2025-10-16 yangdx Change Docker build cache mode from max to min +e6332ce5 2025-10-16 yangdx Add reminder note to manual Docker build workflow +91b8722b 2025-10-16 yangdx Bump core version to 1.4.9.4 +388dce2e 2025-10-16 yangdx docs: clarify docling exclusion in offline Docker image +f2b6a068 2025-10-16 yangdx Remove docling dependency and related packages from project +ef79821f 2025-10-16 yangdx Add build script for multi-platform images +65c2eb9f 2025-10-16 yangdx Migrate Dockerfile from pip to uv package manager for faster builds +466de207 2025-10-16 yangdx Migrate from pip to uv package manager for faster builds +7f223d5a 2025-10-15 yangdx Fix linting +433ec813 2025-10-15 yangdx Improve offline installation with constraints and version bounds +c06522b9 2025-10-15 DivinesLight Get max source Id config from .env and lightRAG init +1fd02b18 2025-10-15 Daniel.y Merge pull request #2222 from danielaskdd/offline-docker-image +19c05f9e 2025-10-15 yangdx Add static 'offline' tag to Docker image metadata +6d1ae404 2025-10-15 yangdx Add offline Docker build support with embedded models and cache +83b10a52 2025-10-15 Daniel.y Merge pull request #2218 from danielaskdd/issue-2215 +29bac49f 2025-10-15 yangdx Handle empty query results by returning None instead of fail responses +d52c3377 2025-10-14 haseebuchiha Import from env and use default if none and 
removed useless import +54f0a7d1 2025-10-14 DivinesLight Quick fix to limit source_id ballooning while inserting nodes +965d8b16 2025-10-14 yangdx Merge branch 'add-preprocessed-status' +e5cbc593 2025-10-14 yangdx Optimize Docker build with multi-stage frontend compilation +92a66565 2025-10-14 Daniel.y Merge pull request #2211 from HKUDS/add-preprocessed-status +a81c122f 2025-10-14 yangdx Bump API version to 0240 +130b4959 2025-10-14 yangdx Add PREPROCESSED (multimodal_processed) status for multimodal document processing +64900b54 2025-10-14 yangdx Add frontend source code update warning +5ace200b 2025-10-14 Daniel.y Merge pull request #2208 from danielaskdd/remove-fontend-artifact +f3740d82 2025-10-14 yangdx Bump API version to 0239 +df52ce98 2025-10-14 yangdx Revert vite.config.ts to origin version +a8bbce3a 2025-10-14 yangdx Use frozen lockfile for consistent frontend builds +c0b1552e 2025-10-14 yangdx Remove .gitkeep file by ensuring webui dir exists on bun build +50210e25 2025-10-14 yangdx Add @tailwindcss/typography plugin and fix Tailwind config +8bf41131 2025-10-14 yangdx Standardize build commands and remove --emptyOutDir flag +ee45ab51 2025-10-14 yangdx Move frontend build check from setup.py to runtime server startup +6c05f0f8 2025-10-13 yangdx Fix linting +be9e6d16 2025-10-13 yangdx Exclude Frontend Build Artifacts from Git Repository +cc436910 2025-10-13 yangdx Merge branch 'kevinnkansah/main' +a93c1661 2025-10-13 yangdx Fix list formatting in README installation steps +4fcae985 2025-10-12 Kevin Update README.md +074f0c8b 2025-10-12 yangdx Update docstring for adelete_by_doc_id method clarity +8a009899 2025-10-12 yangdx Update webui assets +5290b60e 2025-10-12 Daniel.y Merge pull request #2196 from zl7261/main +5734f51e 2025-10-12 Daniel.y Merge pull request #2198 from danielaskdd/webui +79f623a2 2025-10-12 yangdx Bump core version to 1.4.9.3 +e1af1c6d 2025-10-12 yangdx Update webui assets +8eb0f83e 2025-10-12 yangdx Simplify Vite build config by 
removing manual chunking strategy +f402ad27 2025-10-12 yangdx Bump API version to 0238 +f2fb1202 2025-10-12 yangdx Move accordion keyframes from CSS to Tailwind config and add fallback 'auto' value +44f51f88 2025-10-12 yangdx Add fallback value for accordion content height CSS variable +2d9334d3 2025-10-12 yangdx Simplify Root component by removing async i18n initialization +bc1a70ba 2025-10-11 yangdx Remove explicit protobuf dependency from offline storage requirements +baab9924 2025-10-11 yangdx Update pymilvus dependency from 2.5.2 to >=2.6.2 +1a4d6775 2025-10-11 杨广 i18n: fix mustache brackets +fbcc35bb 2025-10-11 yangdx Merge branch 'hotfix-postgres' +289337b2 2025-10-11 yangdx Bump API version to 0237 +766f27da 2025-10-11 Daniel.y Merge pull request #2193 from kevinnkansah/main +82397834 2025-10-11 Daniel.y Merge pull request #2195 from danielaskdd/hotfix-postgres +e1e4f1b0 2025-10-11 yangdx Fix get_by_ids to return None for missing records consistently +b7216ede 2025-10-11 yangdx Fix linting +49197fbf 2025-10-11 yangdx Update pymilvus to >=2.6.2 and add protobuf compatibility constraint +7cddd564 2025-10-11 yangdx Revert core version to 1.4.9..2 +9be22dd6 2025-10-11 yangdx Preserve ordering in get_by_ids methods across all storage implementations +49326f2b 2025-10-11 Daniel.y Merge pull request #2194 from danielaskdd/offline +a5c05f1b 2025-10-11 yangdx Add offline deployment support with cache management and layered deps +b81b8620 2025-10-10 kevinnkansah chore: update deps +fea10cd0 2025-10-10 yangdx Merge branch 'chart-enchancment' +648d7bb1 2025-10-10 yangdx Refactor Helm template to handle optional envFrom values safely +8d3b53ce 2025-10-10 yangdx Condensed AGENTS.md to focus on essential development guidelines +12facac5 2025-10-10 yangdx Enhance graph API endpoints with detailed docs and field validation +85d1a563 2025-10-10 yangdx Merge branch 'adminunblinded/main' +1bf802ee 2025-10-10 yangdx Add AGENTS.md documentation section for AI coding agent 
guidance +d0ae7a67 2025-10-10 yangdx Fix typos and grammar in env.example configuration comments +6e39c0c0 2025-10-10 yangdx Rename Agments.md to AGENTS.md and standardize formatting +b7c77396 2025-10-09 NeelM0906 Fix entity/relation creation endpoints to properly update vector stores +f6d1fb98 2025-10-09 NeelM0906 Fix Linting errors +b4d61eb8 2025-10-10 Daniel.y Merge pull request #2192 from danielaskdd/postgres-network-retry +b3ed2647 2025-10-10 yangdx Refactor PostgreSQL retry config to use centralized configuration +bd535e3e 2025-10-10 yangdx Add PostgreSQL connection retry configuration options +e758204a 2025-10-10 yangdx Add PostgreSQL connection retry mechanism with comprehensive error handling +577b9e68 2025-10-09 yangdx Add project intelligence files for AI agent collaboration +0f15fdc3 2025-10-09 Daniel.y Merge pull request #2181 from yrangana/feat/openai-embedding-token-tracking +ae9f4ae7 2025-10-09 Yasiru Rangana fix: Remove trailing whitespace for pre-commit linting +9f44e89d 2025-10-08 NeelM0906 Add knowledge graph manipulation endpoints +ec40b17e 2025-10-08 Yasiru Rangana feat: Add token tracking support to openai_embed function +f1e01107 2025-10-07 yangdx Merge branch 'kevinnkansah/main' +f2c0b41e 2025-10-07 yangdx Make PostgreSQL statement_cache_size configuration optional +ea5e390b 2025-10-07 Daniel.y Merge pull request #2178 from aleksvujic/patch-1 +dd8f44e6 2025-10-07 Aleks Vujić Fixed typo in log message when creating new graph file +119d2fa1 2025-10-06 Tomek Cyran Adding support for imagePullSecrets, envFrom, and deployment strategy in Helm chart +fdcb034d 2025-10-06 kevinnkansah chore: distinguish settings +22a7b482 2025-10-06 kevinnkansah fix: renamed PostGreSQL options env variable and allowed LRU cache to be an optional env variable +d8a9617c 2025-10-06 kevinnkansah fix: fix: asyncpg bouncer connection pool error +108cdbe1 2025-10-05 kevinnkansah feat: add options for PostGres connection +6190fa89 2025-10-06 yangdx Fix linting +91387628 
2025-10-06 yangdx Add test script for aquery_data endpoint validation +4fe41f76 2025-10-05 yangdx Merge branch 'doc-name-in-full-docs' +d473f635 2025-10-05 yangdx Update webui assets +a31192dd 2025-10-05 yangdx Update i18n file for pipeline UI text across locales +aac787ba 2025-10-05 yangdx Clarify chunk tracking log message in _build_llm_context +1b274706 2025-10-05 Daniel.y Merge pull request #2171 from danielaskdd/doc-name-in-full-docs +457d5195 2025-10-05 yangdx Add doc_name field to full docs storage +d550f1c5 2025-10-05 yangdx Fix linting +1574fec7 2025-10-05 yangdx Update webui assets +0aef6a16 2025-10-05 yangdx Add theme-aware edge highlighting colors for graph control +dad90d25 2025-10-05 Daniel.y Merge pull request #2170 from danielaskdd/tooltips-optimize +0c1cb7b7 2025-10-05 yangdx Improve document tooltip display with track ID and better formatting +b5f83767 2025-10-05 yangdx Update webui assets +dde728a3 2025-10-05 yangdx Bump core version to 1.5.0 and API to 0236 +0d694962 2025-10-05 yangdx Merge branch 'feat/retry-failed-documents-upstream' +7b1f8e0f 2025-10-05 yangdx Update scan tooltip to clarify it also reprocesses failed documents +bf6ca9dd 2025-10-05 yangdx Add retry failed button translations and standardize button text +cf2a024e 2025-10-04 Jon feat: Add endpoint and UI to retry failed documents +b9c37bd9 2025-10-03 yangdx Fix linting +112349ed 2025-10-02 yangdx Modernize type hints and remove Python 3.8 compatibility code +cec784f6 2025-10-02 yangdx Update webui assets +181525ff 2025-10-02 yangdx Merge branch 'main' into zl7261/main +19a41584 2025-10-02 yangdx Fix linting +a250d881 2025-10-02 yangdx Update webui assets +f6b71536 2025-10-02 yangdx Merge branch 'main' into fix/dark-mode-graph-text-colors +b1a4e7d7 2025-10-02 yangdx Fix linting +d4abe704 2025-10-02 yangdx Hide dev options in production builds +1f07d4b1 2025-10-02 yangdx Remove .env_example from .gitignore +d2196a4e 2025-10-02 yangdx Merge remote-tracking branch 'origin/main' 
+1bd84f00 2025-10-01 Roman Marchuk Merge branch 'main' into fix/dark-mode-graph-text-colors +7297ca1d 2025-10-01 Roman Marchuk Fix dark mode graph labels for system theme and improve colors +6bf6f43d 2025-10-02 yangdx Remove bold formatting from instruction headers in prompts +bb6138e7 2025-10-01 yangdx fix(prompt): Clarify reference section restrictions in prompt templates +37e8898c 2025-10-01 yangdx Simplify reference formatting in LLM context generation +f83cde14 2025-10-01 yangdx fix(prompt): Improve markdown formatting requirements and reference style +83d99e14 2025-10-01 yangdx fix(OllamaAPI): Add validation to ensure last message is from user role +ffcd75a4 2025-09-29 zl7261 decalre targetNode after check sourceNode +6a8de2ed 2025-09-29 zl7261 web_ui: check node source and target \ No newline at end of file diff --git a/logs/2025-12-04-00-51-beastmode-chatmode-log.md b/logs/2025-12-04-00-51-beastmode-chatmode-log.md new file mode 100644 index 00000000..ffb44298 --- /dev/null +++ b/logs/2025-12-04-00-51-beastmode-chatmode-log.md @@ -0,0 +1,17 @@ +Actions: +- Pulled upstream (HKUDS/LightRAG) and diffed HEAD vs upstream/main +- Inspected and documented major changes (multi-tenant support, security hardening, RLS, RBAC, config defaults) +- Created concise docs under docs/diff_hku: index.md, summary.md, technical_diffs.md, security_audit.md, migration_guide.md, tests_needed.md + +Decisions: +- Focused on security, DB migrations, and runtime wiring as top priorities +- Kept documents concise but dense for engineering and DevOps audiences + +Next steps: +- Add DB migrations and instrument DB session setter for RLS +- Implement CI tests for RLS + tenant isolation and permission matrix +- Run e2e tests under staging Postgres before production rollout + +Lessons / insights: +- Multi-tenant changes are substantive — require DB migrations + end-to-end tests to avoid silent data leakage +- Default secrets and env defaults are currently unsafe for production; rotate and 
require via env validation diff --git a/logs/2025-12-04-00-54-beastmode-chatmode-log.md b/logs/2025-12-04-00-54-beastmode-chatmode-log.md new file mode 100644 index 00000000..d1fd4a19 --- /dev/null +++ b/logs/2025-12-04-00-54-beastmode-chatmode-log.md @@ -0,0 +1,14 @@ +Actions: +- Enumerated commits present only in upstream/main and created a grouped audit of features/fixes not merged locally. +- Wrote docs/diff_hku/unmerged_upstream.md (summary) and docs/diff_hku/unmerged_upstream_commits.txt (raw commit list, 746 commits). + +Decisions/assumptions: +- Focused the summary on substantive features, storage, LLM/embedding changes, chunking, docs, tooling, and tests. Dependency bumps were not enumerated individually in the summary but are included in the raw commit list. + +Next steps: +- Optionally cherry-pick or rebase high-priority upstream fixes into the local branch (recommended first: chunking, embedding, doc parsing, Postgres/vchordrq, and RLS compatibility fixes). +- Add CI jobs in this repo to run the newly added upstream tests and to validate candidate merges. + +File outputs: +- docs/diff_hku/unmerged_upstream.md +- docs/diff_hku/unmerged_upstream_commits.txt
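The triage described in the log above (separating substantive commits from merges, dependency bumps, and version bumps before deciding what to cherry-pick) could be sketched roughly as follows. This is a minimal illustration, not the tooling actually used for the audit. It assumes each entry in the raw commit list has the shape `<8-hex hash> <YYYY-MM-DD> <author and subject>`, as the entries above appear to, and the names `parse_commit`, `classify`, and `summarize`, along with the keyword rules, are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the raw commit list:
#   "<8-hex hash> <YYYY-MM-DD> <author + subject>"
# Author names may contain spaces, so author and subject are kept together.
COMMIT_RE = re.compile(r"^\+?([0-9a-f]{8}) (\d{4}-\d{2}-\d{2}) (.+)$")


def parse_commit(line):
    """Return a dict for a commit line, or None if the line does not match."""
    m = COMMIT_RE.match(line.strip())
    if m is None:
        return None
    return {"hash": m.group(1), "date": m.group(2), "rest": m.group(3)}


def classify(rest):
    """Hypothetical triage buckets; the keyword rules are illustrative only."""
    if any(k in rest for k in ("Merge pull request", "Merge branch", "Merge remote-tracking")):
        return "merge"
    if rest.startswith("dependabot[bot]"):
        return "dependency"
    if "Bump API version" in rest or "Bump core version" in rest:
        return "version-bump"
    if "Fix linting" in rest or "Update webui assets" in rest:
        return "housekeeping"
    return "substantive"


def summarize(lines):
    """Count commits per bucket, skipping lines that are not commit entries."""
    counts = Counter()
    for line in lines:
        commit = parse_commit(line)
        if commit is not None:
            counts[classify(commit["rest"])] += 1
    return counts


# Sample entries copied from the raw commit list.
sample = [
    "+ee7c683f 2025-10-29 yangdx Fix swagger docs page problem in dev mode",
    "+da2e9efd 2025-10-29 yangdx Bump API version to 0247",
    "+f81dd4e7 2025-10-27 dependabot[bot] Update redis requirement from <7.0.0,>=5.0.0 to >=5.0.0,<8.0.0",
    "+ec797276 2025-10-29 Daniel.y Merge pull request #2279 from danielaskdd/fix-edge-merge-stage",
]
print(dict(summarize(sample)))
```

Keeping author and subject in one field is a deliberate simplification: several authors in the list have multi-word names, so splitting on the first space would misattribute part of the name to the subject.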