yangdx
ab32456a79
Refactor entity merging with unified attribute merge function
...
• Update GRAPH_FIELD_SEP comment clarity
• Deprecate merge_strategy parameter
• Unify entity/relation merge logic
• Add join_unique_comma strategy
2025-10-27 00:04:17 +08:00
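Note on `join_unique_comma`: the commit names this strategy but does not describe its behaviour. A minimal sketch, assuming it splits comma-separated attribute values, drops blanks and duplicates, and rejoins the result. The function signature and the first-seen ordering are assumptions for illustration, not the project's confirmed implementation.

```python
from typing import Iterable, List, Set


def join_unique_comma(values: Iterable[str]) -> str:
    """Merge comma-separated strings into one comma-separated string.

    Hypothetical helper: items are stripped of whitespace, empty items are
    dropped, and duplicates are removed while keeping first-seen order.
    """
    seen: Set[str] = set()
    merged: List[str] = []
    for value in values:
        for item in value.split(","):
            item = item.strip()
            if item and item not in seen:
                seen.add(item)
                merged.append(item)
    return ",".join(merged)


if __name__ == "__main__":
    print(join_unique_comma(["a,b", "b, c", ""]))
```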
yangdx
904b1f46f9
Add entity name length truncation with configurable limit
2025-10-22 14:02:30 +08:00
yangdx
88a45523e2
Increase default max file paths from 30 to 100 and improve documentation
...
- Bump DEFAULT_MAX_FILE_PATHS to 100
- Add clarifying comment about display
2025-10-21 17:33:00 +08:00
yangdx
3ad616be4f
Change default source IDs limit method from KEEP to FIFO
2025-10-21 16:12:11 +08:00
yangdx
1248b3ab04
Increase default limits for source IDs and file paths in metadata
...
• Entity source IDs: 3 → 300
• Relation source IDs: 3 → 300
• File paths: 2 → 30
2025-10-21 05:30:09 +08:00
yangdx
e0fd31a60d
Fix logging message formatting
2025-10-20 22:09:09 +08:00
yangdx
a9fec26798
Add file path limit configuration for entities and relations
...
• Add MAX_FILE_PATHS env variable
• Implement file path count limiting
• Support KEEP/FIFO strategies
• Add truncation placeholder
• Remove old build_file_path function
2025-10-20 20:12:53 +08:00
yangdx
dc62c78f98
Add entity/relation chunk tracking with configurable source ID limits
...
- Add entity_chunks & relation_chunks storage
- Implement KEEP/FIFO limit strategies
- Update env.example with new settings
- Add migration for chunk tracking data
- Support all KV storage
2025-10-20 15:24:15 +08:00
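Note on the KEEP/FIFO limit strategies: a minimal sketch of how the two strategies might differ, assuming KEEP retains the oldest IDs and ignores newer ones once the limit is reached, while FIFO evicts the oldest IDs so the newest survive. The function name `limit_source_ids` and its parameters are hypothetical and are not taken from the commit.

```python
from typing import List


def limit_source_ids(source_ids: List[str], max_ids: int, method: str) -> List[str]:
    """Cap a list of source IDs (ordered oldest to newest) at max_ids entries.

    Assumed semantics, for illustration only:
      KEEP - keep the oldest entries, discard anything past the limit
      FIFO - discard the oldest entries, keep the most recent ones
    """
    if max_ids < 1:
        raise ValueError("max_ids must be at least 1")
    if len(source_ids) <= max_ids:
        return list(source_ids)
    if method == "KEEP":
        return list(source_ids[:max_ids])
    if method == "FIFO":
        return list(source_ids[-max_ids:])
    raise ValueError(f"unknown limit method: {method!r}")


if __name__ == "__main__":
    ids = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]
    print(limit_source_ids(ids, 2, "KEEP"))
    print(limit_source_ids(ids, 2, "FIFO"))
```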
DivinesLight
c06522b927
Get max source ID config from .env and LightRAG init
2025-10-15 18:24:38 +05:00
DivinesLight
54f0a7d1ca
Quick fix to limit source_id ballooning while inserting nodes
2025-10-14 14:47:04 +05:00
yangdx
699ca3ba00
Remove deprecated history_turns and ids parameters from query API endpoint
...
• Update QueryParam documentation
• Mark history_turns as deprecated
• Clean up splash screen display
• Clarify conversation_history usage
2025-09-25 04:58:57 +08:00
yangdx
9dd1790b5c
Add "Creature" entity type and reorganize type mappings
...
- Add Creature to default entity types
- Map animals/beings to creature type
2025-09-23 21:58:33 +08:00
yangdx
5311083f43
Rename "Process" entity type to "Method" across all components
2025-09-14 02:30:05 +08:00
yangdx
7060cf17f0
Add Process and Data entity types to LLM extraction system
...
• Add Process and Data to default types
• Update env.example configuration
• Add translations for new entities
• Support 5 languages (en/zh/fr/ar/tw)
2025-09-14 01:14:47 +08:00
yangdx
2686fc526e
Change entity type from CreativeWork to Content and update delimiter
...
• Replace CreativeWork with Content type
• Improve LLM output error messages
• Update prompt for binary relationships
• Fix delimiter corruption examples
2025-09-14 00:55:15 +08:00
yangdx
41cdeaeaad
Add Concept and NaturalObject to default entity types
2025-09-13 15:37:11 +08:00
yangdx
f3b5352019
Refine default entity types
2025-09-13 11:17:06 +08:00
yangdx
8d53ef7ff0
Increase default Gunicorn worker timeout from 210 to 300 seconds
2025-09-08 20:03:21 +08:00
yangdx
78abb397bf
Reorder entity types and add Document type to extraction
2025-09-03 12:44:40 +08:00
yangdx
9d81cd724a
Fix typo: change "Equiment" to "Equipment" in entity types
2025-09-02 03:19:31 +08:00
yangdx
4e751e0653
refac: Enhance extraction with improved prompts and parser
...
- **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`
- **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)
2025-08-31 22:24:11 +08:00
yangdx
925e631a9a
refac: Add robust timeout handling for LLM requests
2025-08-29 13:50:35 +08:00
yangdx
8a0d06e557
Restore default entity types
2025-08-27 12:51:18 +08:00
yangdx
ff0a18e08c
Unify SUMMARY_LANGUAGE and ENTITY_TYPES implementation method
2025-08-27 12:23:22 +08:00
Thibo Rosemplatt
c3aabfc251
Merge branch 'main' into entityTypesServerSupport
2025-08-26 21:48:20 +02:00
yangdx
6bcfe696ee
feat: add output length recommendation and description type to LLM summary
...
- Add SUMMARY_LENGTH_RECOMMENDED parameter (600 tokens)
- Optimize prompt template for LLM summary
2025-08-26 14:41:12 +08:00
yangdx
84416d104d
Increase default LLM summary merge threshold from 4 to 8 to reduce summary trigger frequency
2025-08-26 03:57:35 +08:00
yangdx
de2daf6565
refac: Rename summary_max_tokens to summary_context_size and add comprehensive parameter validation for summary configuration
...
- Update algorithm logic in operate.py for better token management
- Fix health endpoint to use correct parameter names
2025-08-26 01:35:50 +08:00
Thibo Rosemplatt
d054ec5d00
Added entity_types as a user-defined variable (via .env)
2025-08-23 20:16:11 +02:00
yangdx
47485b130d
refac(ui): Show rerank binding info on status card
...
- Remove separate ENABLE_RERANK flag in favor of rerank_binding="null"
- Change default rerank binding from "cohere" to "null" (disabled)
- Update UI to display both rerank binding and model information
2025-08-23 02:04:14 +08:00
yangdx
bf43e1b8c1
fix: Resolve default rerank config problem when env var missing
...
- Read config from selected_rerank_func when env var missing
- Make api_key optional for rerank function
- Add response format validation with proper error handling
- Update Cohere rerank default to official API endpoint
2025-08-23 01:07:59 +08:00
yangdx
16a1ef1178
Update summary_max_tokens default from 10k to 30k tokens
2025-08-21 23:16:07 +08:00
yangdx
4c556d8aae
Set default TIMEOUT value to 150, and gunicorn timeout to TIMEOUT+30
2025-08-20 22:04:32 +08:00
yangdx
d5e8f1e860
Update default query parameters for better performance
...
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00
yangdx
dcec511f72
feat: increase file path length limit to 32768 and add schema migration for Milvus DB
...
- Bump path limit to 32768 chars
- Add migration detection logic
- Implement dual-client migration
- Auto-migrate old collections
2025-08-18 04:37:12 +08:00
yangdx
5a40ff654e
Change KG chunk selection default to VECTOR
...
- Set KG_CHUNK_PICK_METHOD default to VECTOR
- Update env.example with new config option
2025-08-13 23:10:42 +08:00
yangdx
f1dafa0d01
feat: KG related chunks selection by vector similarity
...
- Add env switch to toggle weighted polling vs vector-similarity strategy
- Implement similarity-based sorting with fallback to weighted
- Introduce batch vector read API for vector storage
- Implement vector store and retrieve functions for Nanovector DB
- Preserve default behavior (weighted polling selection method)
2025-08-13 18:16:42 +08:00
yangdx
9d5603d35e
Set the default LLM temperature to 1.0 and centralize constant management
2025-07-31 17:15:10 +08:00
yangdx
c6bd9f0329
Disable conversation history by default
...
- Set default history_turns to 0
- Mark history_turns as deprecated
- Remove history_turns from example
- Update documentation comments
2025-07-31 12:28:42 +08:00
yangdx
f2ffff063b
feat: refactor ollama server configuration management
...
- Add ollama_server_infos attribute to LightRAG class with default initialization
- Move default values to constants.py for centralized configuration
- Refactor OllamaServerInfos class with property accessors and CLI support
- Update OllamaAPI to get configuration through rag object instead of direct import
- Add command line arguments for simulated model name and tag
- Fix type imports to avoid circular dependencies
2025-07-28 01:38:35 +08:00
yangdx
598eecd06d
Refactor: Rename llm_model_max_token_size to summary_max_tokens
...
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
d0d57a45b6
feat: add environment variables to /health endpoint and centralize defaults
...
- Add 9 environment variables to /health endpoint configuration section
- Centralize default constants in lightrag/constants.py for consistency
- Update config.py to use centralized defaults for better maintainability
2025-07-28 00:30:56 +08:00
yangdx
a9565d7379
feat: Skip rerank filtering when min_rerank_score is 0.0
2025-07-27 16:50:12 +08:00
yangdx
ebaff228aa
feat: Add rerank score filtering with configurable threshold
...
- Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0)
- Add MIN_RERANK_SCORE environment variable support
- Filter chunks with rerank scores below threshold in process_chunks_unified
- Add info-level logging for filtering operations
- Handle empty results gracefully after filtering
- Maintain backward compatibility with non-reranked chunks
2025-07-27 16:37:44 +08:00
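Note on rerank score filtering: the bullets above can be sketched roughly as follows. This assumes each chunk is a dict and that reranked chunks carry their score under a `rerank_score` key. The key name and the helper `filter_by_rerank_score` are hypothetical, and only the `MIN_RERANK_SCORE` variable and its 0.0 default come from the commit.

```python
import os
from typing import Any, Dict, List

# Default mirrors the DEFAULT_MIN_RERANK_SCORE value stated in the commit (0.0).
DEFAULT_MIN_RERANK_SCORE = 0.0


def filter_by_rerank_score(
    chunks: List[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Drop chunks whose rerank score is below min_score.

    Chunks without a score (not reranked) are kept, so callers that never
    rerank see no change. An empty result is returned as an empty list.
    """
    return [
        chunk
        for chunk in chunks
        if chunk.get("rerank_score") is None or chunk["rerank_score"] >= min_score
    ]


if __name__ == "__main__":
    threshold = float(os.getenv("MIN_RERANK_SCORE", DEFAULT_MIN_RERANK_SCORE))
    sample = [
        {"content": "a", "rerank_score": 0.9},
        {"content": "b", "rerank_score": 0.2},
        {"content": "c"},  # never reranked, always kept
    ]
    print(filter_by_rerank_score(sample, threshold))
```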
yangdx
055629d30d
Reduce default max total tokens to 30k
2025-07-27 10:33:06 +08:00
yangdx
c8c3545454
refactor: extract file path length limit to shared constant
...
• Add DEFAULT_MAX_FILE_PATH_LENGTH constant
• Replace hardcoded 4090 in Milvus impl
2025-07-26 10:45:03 +08:00
yangdx
2c940f0728
Reduce RELATED_CHUNK_NUMBER from 10 to 5
2025-07-24 02:49:05 +08:00
yangdx
8103b200db
Set DEFAULT_HISTORY_TURNS to 0
2025-07-16 02:20:27 +08:00
yangdx
6e084bfae1
Increase default related chunk number from 5 to 10
2025-07-16 00:22:34 +08:00
yangdx
5f7cb437e8
Centralize query parameters into LightRAG class
...
This commit refactors query parameter management by consolidating settings like `top_k`, token limits, and thresholds into the `LightRAG` class, and consistently sourcing parameters from a single location.
2025-07-15 23:56:49 +08:00