Commit graph

58 commits

Author SHA1 Message Date
clssck
663ada943a chore: add citation system and enhance RAG UI components
Add citation tracking and display system across backend and frontend components.
Backend changes include citation.py for document attribution, enhanced query routes
with citation metadata, improved prompt templates, and PostgreSQL schema updates.
Frontend includes CitationMarker component, HoverCard UI, QuerySettings refinements,
and ChatMessage enhancements for displaying document sources. Update dependencies
and docker-compose test configuration for improved development workflow.
2025-12-01 17:50:00 +01:00
yangdx
ab32456a79 Refactor entity merging with unified attribute merge function
• Update GRAPH_FIELD_SEP comment clarity
• Deprecate merge_strategy parameter
• Unify entity/relation merge logic
• Add join_unique_comma strategy
2025-10-27 00:04:17 +08:00
yangdx
904b1f46f9 Add entity name length truncation with configurable limit 2025-10-22 14:02:30 +08:00
yangdx
88a45523e2 Increase default max file paths from 30 to 100 and improve documentation
- Bump DEFAULT_MAX_FILE_PATHS to 100
- Add clarifying comment about display
2025-10-21 17:33:00 +08:00
yangdx
3ad616be4f Change default source IDs limit method from KEEP to FIFO 2025-10-21 16:12:11 +08:00
yangdx
1248b3ab04 Increase default limits for source IDs and file paths in metadata
• Entity source IDs: 3 → 300
• Relation source IDs: 3 → 300
• File paths: 2 → 30
2025-10-21 05:30:09 +08:00
yangdx
e0fd31a60d Fix logging message formatting 2025-10-20 22:09:09 +08:00
yangdx
a9fec26798 Add file path limit configuration for entities and relations
• Add MAX_FILE_PATHS env variable
• Implement file path count limiting
• Support KEEP/FIFO strategies
• Add truncation placeholder
• Remove old build_file_path function
2025-10-20 20:12:53 +08:00
yangdx
dc62c78f98 Add entity/relation chunk tracking with configurable source ID limits
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage
2025-10-20 15:24:15 +08:00
DivinesLight
c06522b927 Get max source Id config from .env and lightRAG init 2025-10-15 18:24:38 +05:00
DivinesLight
54f0a7d1ca Quick fix to limit source_id ballooning while inserting nodes 2025-10-14 14:47:04 +05:00
yangdx
699ca3ba00 Remove deprecated history_turns and ids parameters from query API endpoint
• Update QueryParam documentation
• Mark history_turns as deprecated
• Clean up splash screen display
• Clarify conversation_history usage
2025-09-25 04:58:57 +08:00
yangdx
9dd1790b5c Add "Creature" entity type and reorganize type mappings
- Add Creature to default entity types
- Map animals/beings to creature type
2025-09-23 21:58:33 +08:00
yangdx
5311083f43 Rename "Process" entity type to "Method" across all components 2025-09-14 02:30:05 +08:00
yangdx
7060cf17f0 Add Process and Data entity types to LLM extraction system
• Add Process and Data to default types
• Update env.example configuration
• Add translations for new entities
• Support 5 languages (en/zh/fr/ar/tw)
2025-09-14 01:14:47 +08:00
yangdx
2686fc526e Change entity type from CreativeWork to Content and update delimiter
• Replace CreativeWork with Content type
• Improve LLM output error messages
• Update prompt for binary relationships
• Fix delimiter corruption examples
2025-09-14 00:55:15 +08:00
yangdx
41cdeaeaad Add Concept and NaturalObject to default entity types 2025-09-13 15:37:11 +08:00
yangdx
f3b5352019 Refine default entity types 2025-09-13 11:17:06 +08:00
yangdx
8d53ef7ff0 Increase default Gunicorn worker timeout from 210 to 300 seconds 2025-09-08 20:03:21 +08:00
yangdx
78abb397bf Reorder entity types and add Document type to extraction 2025-09-03 12:44:40 +08:00
yangdx
9d81cd724a Fix typo: change "Equiment" to "Equipment" in entity types 2025-09-02 03:19:31 +08:00
yangdx
4e751e0653 refac: Enhance extraction with improved prompts and parser
-   **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`
-   **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)
2025-08-31 22:24:11 +08:00
yangdx
925e631a9a refac: Add robust time out handling for LLM request 2025-08-29 13:50:35 +08:00
yangdx
8a0d06e557 Restore default entity types 2025-08-27 12:51:18 +08:00
yangdx
ff0a18e08c Unify SUMMARY_LANGUANGE and ENTITY_TYPES implementation method 2025-08-27 12:23:22 +08:00
Thibo Rosemplatt
c3aabfc251 Merge branch 'main' into entityTypesServerSupport 2025-08-26 21:48:20 +02:00
yangdx
6bcfe696ee feat: add output length recommendation and description type to LLM summary
- Add SUMMARY_LENGTH_RECOMMENDED parameter (600 tokens)
- Optimize prompt temple for LLM summary
2025-08-26 14:41:12 +08:00
yangdx
84416d104d Increase default LLM summary merge threshold from 4 to 8 for reducing summary trigger frequency 2025-08-26 03:57:35 +08:00
yangdx
de2daf6565 refac: Rename summary_max_tokens to summary_context_size, comprehensive parameter validation for summary configuration
- Update algorithm logic in operate.py for better token management
- Fix health endpoint to use correct parameter names
2025-08-26 01:35:50 +08:00
Thibo Rosemplatt
d054ec5d00 Added entity_types as a user defined variable (via .env) 2025-08-23 20:16:11 +02:00
yangdx
47485b130d refac(ui): Show rerank binding info on status card
- Remove separate ENABLE_RERANK flag in favor of rerank_binding="null"
- Change default rerank binding from "cohere" to "null" (disabled)
- Update UI to display both rerank binding and model information
2025-08-23 02:04:14 +08:00
yangdx
bf43e1b8c1 fix: Resolve default rerank config problem when env var missing
- Read config from selected_rerank_func when env var missing
- Make api_key optional for rerank function
- Add response format validation with proper error handling
- Update Cohere rerank default to official API endpoint
2025-08-23 01:07:59 +08:00
yangdx
16a1ef1178 Update summary_max_tokens default from 10k to 30k tokens 2025-08-21 23:16:07 +08:00
yangdx
4c556d8aae Set default TIMEOUT value to 150, and gunicorn timeout to TIMEOUT+30 2025-08-20 22:04:32 +08:00
yangdx
d5e8f1e860 Update default query parameters for better performance
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00
yangdx
dcec511f72 feat: increase file path length limit to 32768 and add schema migration for Milvus DB
- Bump path limit to 32768 chars
- Add migration detection logic
- Implement dual-client migration
- Auto-migrate old collections
2025-08-18 04:37:12 +08:00
yangdx
5a40ff654e Change KG chunk selection default to VECTOR
- Set KG_CHUNK_PICK_METHOD default to VECTOR
- Update env.example with new config option
2025-08-13 23:10:42 +08:00
yangdx
f1dafa0d01 feat: KG related chunks selection by vector similarity
- Add env switch to toggle weighted polling vs vector-similarity strategy
- Implement similarity-based sorting with fallback to weighted
- Introduce batch vector read API for vector storage
- Implement vector store and retrive funtion for Nanovector DB
- Preserve default behavior (weighted polling selection method)
2025-08-13 18:16:42 +08:00
yangdx
9d5603d35e Set the default LLM temperature to 1.0 and centralize constant management 2025-07-31 17:15:10 +08:00
yangdx
c6bd9f0329 Disable conversation history by default
- Set default history_turns to 0
- Mark history_turns as deprecated
- Remove history_turns from example
- Update documentation comments
2025-07-31 12:28:42 +08:00
yangdx
f2ffff063b feat: refactor ollama server configuration management
- Add ollama_server_infos attribute to LightRAG class with default initialization
- Move default values to constants.py for centralized configuration
- Refactor OllamaServerInfos class with property accessors and CLI support
- Update OllamaAPI to get configuration through rag object instead of direct import
- Add command line arguments for simulated model name and tag
- Fix type imports to avoid circular dependencies
2025-07-28 01:38:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
d0d57a45b6 feat: add environment variables to /health endpoint and centralize defaults
- Add 9 environment variables to /health endpoint configuration section
- Centralize default constants in lightrag/constants.py for consistency
- Update config.py to use centralized defaults for better maintainability
2025-07-28 00:30:56 +08:00
yangdx
a9565d7379 feat: Skip rerank filtering when min_rerank_score is 0.0 2025-07-27 16:50:12 +08:00
yangdx
ebaff228aa feat: Add rerank score filtering with configurable threshold
- Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0)
- Add MIN_RERANK_SCORE environment variable support
- Filter chunks with rerank scores below threshold in process_chunks_unified
- Add info-level logging for filtering operations
- Handle empty results gracefully after filtering
- Maintain backward compatibility with non-reranked chunks
2025-07-27 16:37:44 +08:00
yangdx
055629d30d Reduce default max total tokens to 30k 2025-07-27 10:33:06 +08:00
yangdx
c8c3545454 refactor: extract file path length limit to shared constant
• Add DEFAULT_MAX_FILE_PATH_LENGTH constant
• Replace hardcoded 4090 in Milvus impl
2025-07-26 10:45:03 +08:00
yangdx
2c940f0728 reduce RELATED_CHUNK_NUMBER from 10 to 5 2025-07-24 02:49:05 +08:00
yangdx
8103b200db Set DEFAULT_HISTORY_TURNS to 0 2025-07-16 02:20:27 +08:00
yangdx
6e084bfae1 Increase default related chunk number from 5 to 10 2025-07-16 00:22:34 +08:00