LightRAG

Author	SHA1	Message	Date
clssck	abb44eccb1	feat(lightrag): improve entity extraction prompts and rerank chunking Enhance entity extraction with better structured prompts: - Reorganize prompt format for improved clarity and consistency - Add XML-style formatting tags for better LLM parsing - Include language parameter in keywords extraction cache key - Fix language parameter usage in keywords_extraction prompt Improve rerank module with chunking fixes: - Fix top_n behavior to limit documents instead of chunks - Add Cohere reranker support with proper chunking - Improve error handling for rerank API responses Update operate.py: - Better entity extraction parsing and validation - Improved cache key generation for multilingual support	2025-12-12 16:45:14 +01:00
clssck	59e89772de	refactor: consolidate to PostgreSQL-only backend and modernize stack Remove legacy storage implementations and deprecated examples: - Delete FAISS, JSON, Memgraph, Milvus, MongoDB, Nano Vector DB, Neo4j, NetworkX, Qdrant, Redis storage backends - Remove Kubernetes deployment manifests and installation scripts - Delete unofficial examples for deprecated backends and offline deployment docs Streamline core infrastructure: - Consolidate storage layer to PostgreSQL-only implementation - Add full-text search caching with FTS cache module - Implement metrics collection and monitoring pipeline - Add explain and metrics API routes Modernize frontend and tooling: - Switch web UI to Bun with bun.lock, remove npm and pnpm lockfiles - Update Dockerfile for PostgreSQL-only deployment - Add Makefile for common development tasks - Update environment and configuration examples Enhance evaluation and testing capabilities: - Add prompt optimization with DSPy and auto-tuning - Implement ground truth regeneration and variant testing - Add prompt debugging and response comparison utilities - Expand test coverage with new integration scenarios Simplify dependencies and configuration: - Remove offline-specific requirement files - Update pyproject.toml with streamlined dependencies - Add Python version pinning with .python-version - Create project guidelines in CLAUDE.md and AGENTS.md	2025-12-12 16:28:49 +01:00
clssck	da9070ecf7	refactor: remove legacy storage implementations and k8s deployment Remove deprecated storage backends and Kubernetes deployment configuration: - Delete unused storage implementations: FAISS, JSON, Memgraph, Milvus, MongoDB, Nano Vector DB, Neo4j, NetworkX, Qdrant, Redis - Remove Kubernetes deployment manifests and installation scripts - Delete legacy examples for deprecated backends - Consolidate to PostgreSQL-only storage backend Streamline dependencies and add new capabilities: - Remove deprecated code documentation and migration guides - Add full-text search caching layer with FTS cache module - Implement metrics collection and monitoring pipeline - Add explain and metrics API routes - Simplify configuration with PostgreSQL-focused setup Update documentation and configuration: - Rewrite README to focus on supported features - Update environment and configuration examples - Remove Kubernetes-specific documentation - Add new utility scripts for PDF uploads and pipeline monitoring	2025-12-09 14:02:00 +01:00
clssck	dd1413f3eb	test(lightrag,examples): add prompt accuracy and quality tests Add comprehensive test suites for prompt evaluation: - test_prompt_accuracy.py: 365 lines testing prompt extraction accuracy - test_prompt_quality_deep.py: 672 lines for deep quality analysis - Refactor prompt.py to consolidate optimized variants (removed prompt_optimized.py) - Apply ruff formatting and type hints across 30 files - Update pyrightconfig.json for static type checking - Modernize reproduce scripts and examples with improved type annotations - Sync uv.lock dependencies	2025-12-05 16:39:52 +01:00
clssck	69358d830d	test(lightrag,examples,api): comprehensive ruff formatting and type hints Format entire codebase with ruff and add type hints across all modules: - Apply ruff formatting to all Python files (121 files, 17K insertions) - Add type hints to function signatures throughout lightrag core and API - Update test suite with improved type annotations and docstrings - Add pyrightconfig.json for static type checking configuration - Create prompt_optimized.py and test_extraction_prompt_ab.py test files - Update ruff.toml and .gitignore for improved linting configuration - Standardize code style across examples, reproduce scripts, and utilities	2025-12-05 15:17:06 +01:00
clssck	1bdd906753	chore(lightrag): remove legacy prompts and clean up prompt.py Remove unused LLM-generated citation prompts that were kept for backward compatibility but never referenced in codebase. Consolidate duplicate instructions in entity summarization prompt and fix minor typos. - Remove rag_response_with_llm_citations prompt (dead code) - Remove naive_rag_response_with_llm_citations prompt (dead code) - Remove unused cite_ready_* backward compatibility aliases - Consolidate duplicate context/objectivity instructions in summarize prompt - Fix typo in example (extra parenthesis) - Clarify delimiter documentation comment	2025-12-01 21:02:44 +01:00
clssck	663ada943a	chore: add citation system and enhance RAG UI components Add citation tracking and display system across backend and frontend components. Backend changes include citation.py for document attribution, enhanced query routes with citation metadata, improved prompt templates, and PostgreSQL schema updates. Frontend includes CitationMarker component, HoverCard UI, QuerySettings refinements, and ChatMessage enhancements for displaying document sources. Update dependencies and docker-compose test configuration for improved development workflow.	2025-12-01 17:50:00 +01:00
clssck	d2c9e6e2ec	test(lightrag): add orphan connection feature with quality validation tests Implement automatic orphan entity connection system that identifies entities with no relationships and creates meaningful connections via vector similarity + LLM validation. This improves knowledge graph connectivity and retrieval quality. Changes: - Add orphan connection configuration parameters (thresholds, cross-connect settings) - Implement aconnect_orphan_entities() method with 4-step validation pipeline - Add SQL templates for efficient orphan and candidate entity queries - Create POST /graph/orphans/connect API endpoint with configurable parameters - Add orphan connection validation prompt for LLM-based relationship verification - Include relationship density requirement in extraction prompts to prevent orphans - Update docker-compose.test.yml with optimized extraction parameters - Add quality validation test suite (run_quality_tests.py) for retrieval evaluation - Add unit test framework (test_orphan_connection_quality.py) with test cases - Enable auto-run of orphan connection after document processing	2025-11-28 18:23:30 +01:00
Daniel.y	d392db7b4a	Fix typo in 'equipment' in prompt.py	2025-10-22 11:13:22 +08:00
yangdx	6bf6f43d96	Remove bold formatting from instruction headers in prompts	2025-10-02 00:58:03 +08:00
yangdx	bb6138e748	fix(prompt): Clarify reference section restrictions in prompt templates	2025-10-01 22:35:26 +08:00
yangdx	37e8898cf6	Simplify reference formatting in LLM context generation - Remove extra newlines in reference lists - Change code block type from text to generic	2025-10-01 22:20:58 +08:00
yangdx	f83cde14df	fix(prompt): Improve markdown formatting requirements and reference style	2025-10-01 21:41:12 +08:00
yangdx	0fd0186414	Improve prompt clarity by standardizing terminology and formatting • Replace "Source Data" with "Context" • Add bold formatting for key sections • Clarify reference_id usage • Improve JSON/text block formatting • Standardize data source naming	2025-09-28 13:31:55 +08:00
yangdx	cbdc4c4bdf	Refactor prompts and context building for better maintainability - Extract context templates to PROMPTS - Unify token calculation logic - Simplify user_prompt formatting - Reduce code duplication - Improve prompt structure consistency	2025-09-26 12:39:06 +08:00
yangdx	fba2356c81	Move user_prompt to system prompt - Refactor query prompt handling to separate user prompts in system context - Simplify user_query to only contain query - Apply changes to both kg_query and naive_query	2025-09-26 10:02:01 +08:00
yangdx	058ce83dba	Clarify citation format and fix typo	2025-09-25 20:08:55 +08:00
yangdx	41a6da6786	Remove inline citation instructions from prompt templates - Remove footnote syntax guidelines - Delete inline citation examples - Keep references section format - Simplify citation documentation - Update example section titles	2025-09-25 03:46:30 +08:00
yangdx	14bbafa146	Improve inline citation format and add examples to prompts - Clarify single caret rule for citations - Add citation format examples - Rename to "References Section Format" - Improve multi-citation instructions	2025-09-25 03:26:50 +08:00
yangdx	6177878812	Add inline citation format with footnote syntax to prompts - Add footnote syntax `[^1]` for citations - Support multiple citations `[^1,2,3]` - Update reference section examples - Enforce caret symbol requirement - Match reference_id in brackets	2025-09-25 02:51:12 +08:00
yangdx	f610bd5d21	Update citation format to use bullet points and add examples - Change citation format to `* [n]` - Add reference section examples - Apply to both prompt templates - Improve formatting consistency	2025-09-24 21:59:21 +08:00
yangdx	e9503ee6ae	Merge branch 'patch-1' into citation-optimization	2025-09-24 18:23:29 +08:00
yangdx	ac26f3a2f2	Refactor citation format from file paths to numbered document titles • Change citation format to [n] style • Reduce max citations from 6 to 5 • Add reference tracking instructions • Simplify citation merge logic • Remove inline citation requirements	2025-09-24 14:30:53 +08:00
SASon	b3cc0127d9	Fix typo in output language instruction	2025-09-24 13:22:35 +09:00
SASon	746d4c576d	Fix typo in output language instruction from Oputput to Output	2025-09-24 13:17:37 +09:00
yangdx	5fa92cbf99	Improve citation quality and reduce reference limits in prompts - Reduce max citations from 8 to 6 - Require direct fact referencing - Clarify relevance prioritization	2025-09-22 10:53:03 +08:00
yangdx	8826d2f892	Optimize prompt instruction for citation format	2025-09-22 01:04:57 +08:00
yangdx	2f06f851c3	Enhance citation format with merged references and clearer guidelines - Increase max references from 5 to 8 - Merge citations by file_path - Remove inline citations from body - Add reference section examples - Update citation prefixes (KG→EN, RE)	2025-09-21 22:48:48 +08:00
yangdx	f88c2fbdff	Refactor citation format instructions for clarity and consistency	2025-09-21 15:51:31 +08:00
yangdx	8f0fb3c9eb	Include user query in prompt returns	2025-09-21 15:24:20 +08:00
yangdx	6eb37e270a	Refactor query handling and improve RAG response prompts - Move user_prompt to query concatenation - Remove DEFAULT_USER_PROMPT constant - Enhance prompt clarity and structure - Standardize citation formatting - Improve step-by-step instructions	2025-09-21 15:16:24 +08:00
yangdx	f69c5dfd9a	Add language control and format clarity to extraction prompts	2025-09-14 18:26:41 +08:00
yangdx	6e37460964	Improve entity extraction prompt clarity and make sure LLM output content only	2025-09-14 17:50:56 +08:00
yangdx	4de1473875	Improve entity extraction prompts and error message formatting • Fix typo in error log message • Clarify format requirements in prompts • Make extraction instructions clearer • Improve user prompt consistency	2025-09-14 13:45:59 +08:00
yangdx	fd48afdb00	Use "relation" instead of "relationship" in extraction prompt, and support both formats for safety	2025-09-14 11:43:35 +08:00
yangdx	d993464a92	Restructure entity extraction prompt with clearer formatting and examples * Improved instruction clarity * Added better formatting structure * Enhanced delimiter usage rules * Clarified relationship handling * Better third-person guidelines	2025-09-14 02:30:32 +08:00
yangdx	2686fc526e	Change entity type from CreativeWork to Content and update delimiter • Replace CreativeWork with Content type • Improve LLM output error messages • Update prompt for binary relationships • Fix delimiter corruption examples	2025-09-14 00:55:15 +08:00
yangdx	4a5ab5121d	Change delimiter from <|S|> to <|#|> and clarify formatting rules	2025-09-13 22:58:56 +08:00
yangdx	bf423a4ce1	Clarify output structure in prompt instructions by adding field count specifications	2025-09-13 09:51:33 +08:00
yangdx	369f799b16	Refine entity extraction prompts for clarity and consistency • Clarify tuple delimiter usage • Soften proper noun translation rules • Standardize language requirements • Improve output format consistency	2025-09-13 08:14:46 +08:00
yangdx	0221213b9b	Improve entity summarization with JSONL format and fix tuple delimiters • Convert descriptions to JSONL format • Add token-based truncation helper • Enhance entity name consistency rules • Improve summarization prompt clarity • Fix tuple delimiter corruption patterns	2025-09-12 12:32:08 +08:00
yangdx	1892ed23cc	Change tuple delimiter from <|SEP|> to <|S|> across codebase • Update prompt instruction clarity • Correct utility function examples • Update regex pattern comments	2025-09-12 08:57:46 +08:00
yangdx	b96f1484ec	Shorten tuple delimiter to <|S|> and refine relationship extraction text • Remove redundant "within input text" • Clarify relationship extraction scope	2025-09-12 08:36:43 +08:00
yangdx	40688def20	Refactor tuple delimiter corruption fix into reusable utility function - Extract regex fixes to utils module - Add case-insensitive delimiter handling	2025-09-12 04:10:14 +08:00
yangdx	7f83a58497	Refactor extraction delimiters from ## to newlines and change tuple delimiter to <|SEP|> • Add robust delimiter fixing logic • Update prompts for single-line format	2025-09-11 13:44:44 +08:00
yangdx	02e7462645	feat: enhance LLM output format tolerance for bracket processing - Expand bracket tolerance to support additional characters: < > " ' - Implement symmetric handling for both leading and trailing characters - Replace simple string matching with robust regex-based pattern detection - Maintain full backward compatibility with existing bracket formats	2025-09-10 18:10:06 +08:00
yangdx	50fddeebbf	fix: Remove conversation history from prompt template - Delete history section from prompt - Simplify user query response format - Remove {history} placeholder variable	2025-09-10 12:07:34 +08:00
yangdx	2dd143c935	Refactor conversation history handling to use LLM native message format • Remove get_conversation_turns utility • Pass history_messages to LLM directly • Clean up prompt template formatting	2025-09-10 11:56:58 +08:00
yangdx	06db511f3b	Remove angle brackets from entity and relationship output formats	2025-09-09 09:21:23 +08:00
yangdx	d218f15a62	Refactor entity extraction with system prompts and output limits - Add system/user prompt separation - Set max tokens for endless output fix - Improve extraction error logging - Update cache type from extract to summary	2025-09-08 15:20:45 +08:00

128 commits