Commit graph

317 commits

yangdx
d5e8f1e860 Update default query parameters for better performance
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00
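The new defaults listed above can be gathered into one place; a minimal sketch (the parameter names and values come from the commit message, but the dataclass container itself is an illustration, not LightRAG's actual API):

```python
from dataclasses import dataclass


@dataclass
class QueryDefaults:
    """Illustrative container for the defaults named in the commit."""
    chunk_top_k: int = 20            # raised from 10
    max_entity_tokens: int = 6000    # reduced
    max_relation_tokens: int = 8000  # reduced
    max_total_tokens: int = 30000    # overall token budget

defaults = QueryDefaults()
# Entity and relation budgets must fit inside the total budget.
assert defaults.max_entity_tokens + defaults.max_relation_tokens < defaults.max_total_tokens
```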
yangdx
dc7a6e1c5b Update README 2025-08-16 06:15:27 +08:00
yangdx
0b5c708660 Update storage implementation documentation
- Add detailed storage type descriptions
- Remove Chroma from vector storage options
- Include recommended PostgreSQL version
- Add Memgraph to graph storage options
- Update performance comparison notes
2025-08-05 18:03:51 +08:00
yangdx
32af45ff46 refactor: improve JSON parsing reliability with json-repair library
Replace regex-based JSON extraction with json-repair for better handling of malformed LLM responses. Remove deprecated JSON parsing utilities and clean up keyword_extraction parameter across LLM providers.

- Remove locate_json_string_body_from_string() and convert_response_to_json()
- Use json-repair.loads() in extract_keywords_only() for robust parsing
- Clean up LLM interfaces and remove unused parameters
- Add json-repair dependency
2025-08-01 19:36:20 +08:00
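The motivation for the change above is that LLMs often return JSON wrapped in markdown fences or with minor syntax damage. A stdlib-only sketch of the fallback idea follows; note that the real json-repair library handles far more failure modes than this illustration does:

```python
import json


def parse_llm_json(text: str) -> dict:
    """Try strict JSON first, then apply one trivial repair.
    Sketch only; the json-repair library is much more robust."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Strip markdown code fences that LLMs often wrap around JSON.
    cleaned = text.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.strip("`")
        if cleaned.startswith("json"):
            cleaned = cleaned[4:]
    return json.loads(cleaned.strip())

print(parse_llm_json('```json\n{"high_level_keywords": ["graph"]}\n```'))
```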
yangdx
3c530b21b6 Update README 2025-07-31 13:00:09 +08:00
yangdx
c6bd9f0329 Disable conversation history by default
- Set default history_turns to 0
- Mark history_turns as deprecated
- Remove history_turns from example
- Update documentation comments
2025-07-31 12:28:42 +08:00
yangdx
aba46213a7 Update README 2025-07-30 13:13:59 +08:00
yangdx
9923821d75 refactor: Remove deprecated max_token_size from embedding configuration
This parameter is no longer used. Its removal simplifies the API and clarifies that token length management is handled by upstream text chunking logic rather than the embedding wrapper.
2025-07-29 10:49:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
Ákos Lukács
f115661e16 Fix "A Simple Program" example in README.md
The example should use ainsert and aquery. Fixes #1723
2025-07-22 14:37:15 +02:00
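The fix above replaces synchronous calls with awaited async ones. The pitfall generalizes: calling an async method without `await` only creates a coroutine object that never runs. A self-contained sketch (the `AsyncStore` class is a stand-in for illustration, not LightRAG's API):

```python
import asyncio


class AsyncStore:
    """Stand-in for an async API in the style of ainsert/aquery."""
    def __init__(self):
        self.docs = []

    async def ainsert(self, text: str) -> None:
        self.docs.append(text)

    async def aquery(self, q: str) -> list:
        return [d for d in self.docs if q in d]


async def main():
    store = AsyncStore()
    # Async methods must be awaited inside a coroutine; a bare
    # store.ainsert(...) would create a coroutine that never executes.
    await store.ainsert("graphs link entities")
    return await store.aquery("graphs")

print(asyncio.run(main()))  # ['graphs link entities']
```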
yangdx
80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx
1c53c5c764 Update README.md 2025-07-16 11:10:56 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
yangdx
e8e1f6ab56 feat: centralize environment variable defaults in constants.py 2025-07-15 16:11:50 +08:00
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
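The fragment-count trigger described above can be sketched as follows. Only the FORCE_LLM_SUMMARY_ON_MERGE default of 4 comes from the commit message; the function shape and the fake summarizer are illustrative, not the actual LightRAG implementation:

```python
FORCE_LLM_SUMMARY_ON_MERGE = 4  # default per the commit message


def merge_descriptions(fragments, summarize):
    """Call the LLM summarizer only when enough fragments pile up;
    otherwise keep the fragments verbatim, so nothing gets truncated
    mid-sentence."""
    if len(fragments) >= FORCE_LLM_SUMMARY_ON_MERGE:
        return summarize(fragments)   # LLM consolidates into one description
    return " ".join(fragments)        # few fragments: concatenate as-is


def fake_llm(frags):
    return f"summary of {len(frags)} fragments"

print(merge_descriptions(["a", "b"], fake_llm))            # 'a b'
print(merge_descriptions(["a", "b", "c", "d"], fake_llm))  # 'summary of 4 fragments'
```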
zrguo
7c882313bb remove chunk_rerank_top_k 2025-07-15 11:52:34 +08:00
zrguo
4e425b1b59 Revert "update from main"
This reverts commit 1d0376d6a9.
2025-07-14 16:29:00 +08:00
zrguo
1d0376d6a9 update from main 2025-07-14 16:27:49 +08:00
zrguo
c9cbd2d3e0 Merge branch 'main' into rerank 2025-07-14 16:24:29 +08:00
zrguo
ef2115d437 Update token limit 2025-07-14 15:53:48 +08:00
yangdx
b03bb48e24 feat: Refine summary logic and add dedicated Ollama num_ctx config
- Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation.
- Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible.
- Updated `README` files, `env.example`, and default values to reflect these changes.
2025-07-14 01:55:04 +08:00
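The decoupling described above amounts to reading `num_ctx` from its own environment variable instead of reusing `MAX_TOKENS`. A minimal sketch (the variable name `OLLAMA_NUM_CTX` is from the commit; the fallback value here is an assumption, not the project's documented default):

```python
import os


def get_ollama_num_ctx(fallback: int = 32768) -> int:
    """Read the Ollama context window size from OLLAMA_NUM_CTX,
    falling back to a caller-supplied value. The fallback shown is
    illustrative only."""
    return int(os.environ.get("OLLAMA_NUM_CTX", fallback))

os.environ["OLLAMA_NUM_CTX"] = "8192"
print(get_ollama_num_ctx())  # 8192
```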
yangdx
9aa2ed0837 Merge branch 'main' into rerank 2025-07-09 15:33:39 +08:00
yangdx
e457374224 Fix linting 2025-07-09 15:33:05 +08:00
yangdx
bfa0844ecb Update README 2025-07-09 15:17:05 +08:00
yangdx
5d4484882a Merge branch 'main' into rerank 2025-07-09 03:59:04 +08:00
zrguo
04a57445da update chunks truncation method 2025-07-08 13:31:05 +08:00
yangdx
2670f8dc98 Merge branch 'main' into add-Memgraph-graph-db 2025-07-08 00:31:46 +08:00
yangdx
f417118e27 Center banner text dynamically 2025-07-07 17:28:59 +08:00
yangdx
13b2c93eec Update README.md 2025-07-07 16:44:30 +08:00
yangdx
a567601da2 Merge branch 'main' into add-Memgraph-graph-db 2025-07-05 13:14:39 +08:00
yangdx
bdfd2d53c7 Fix linting 2025-07-05 11:43:45 +08:00
yangdx
2e2b9f3b48 Refactor setup.py to utilize pyproject.toml for project installation. 2025-07-05 11:19:00 +08:00
DavIvek
80d4d5b0d5 Add Memgraph into README.md 2025-06-26 16:26:51 +02:00
zrguo
5b6ac84cd0 Update README 2025-06-26 18:08:21 +08:00
zrguo
082a338df3 Update README.md 2025-06-26 17:52:52 +08:00
zrguo
afba3df01a Update README 2025-06-26 16:20:39 +08:00
zrguo
fc7a0329df Update README 2025-06-26 16:17:00 +08:00
zrguo
145e3a238b Update README.md 2025-06-26 16:12:20 +08:00
zrguo
1d788c3e97 Update RAGAnything related 2025-06-26 16:08:14 +08:00
zrguo
c947b20bb1 Update README.md 2025-06-22 16:43:18 +08:00
zrguo
4937de8809 Update 2025-06-22 15:12:09 +08:00
zrguo
d1aeb291d6 Update README.md 2025-06-19 17:01:21 +08:00
zrguo
bc70e6066c Merge pull request #1671 from Chaoyingz/main
Fix incorrect spacing
2025-06-19 14:17:52 +08:00
chaohuang-ai
a408465602 Update README.md 2025-06-17 10:39:30 +08:00
chaohuang-ai
caf0411889 Update README.md 2025-06-17 10:38:34 +08:00
zrguo
03dd99912d RAG-Anything Integration 2025-06-17 01:16:02 +08:00
Chaoying
b8a2598404 Fix incorrect spacing 2025-06-11 10:59:40 +08:00
zrguo
75d13cc387 fix lint 2025-06-09 09:11:50 +08:00
neno-is-ooo
199869f45c docs: Add clear initialization requirements and troubleshooting section
- Add prominent warning about required initialization steps
- Document common errors (AttributeError: __aenter__ and KeyError: 'history_messages')
- Add troubleshooting section with specific solutions
- Add inline comments in code example highlighting initialization requirements

This addresses user confusion when LightRAG fails with cryptic errors due to
missing initialization calls. The documentation now clearly states that both
await rag.initialize_storages() and await initialize_pipeline_status() must
be called after creating a LightRAG instance.
2025-06-08 12:43:17 +02:00
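The initialization contract documented above (both `await rag.initialize_storages()` and `await initialize_pipeline_status()` must run before use) can be illustrated with a guard that replaces the cryptic `AttributeError`/`KeyError` with a clear message. The `PipelineDemo` class below is a stand-in for illustration, not LightRAG itself:

```python
import asyncio


class PipelineDemo:
    """Stand-in showing why an explicit guard beats a cryptic failure:
    using the instance before initialization raises a clear error."""
    def __init__(self):
        self._storages_ready = False

    async def initialize_storages(self):
        self._storages_ready = True

    async def aquery(self, q: str) -> str:
        if not self._storages_ready:
            raise RuntimeError(
                "call await rag.initialize_storages() before querying"
            )
        return f"answer to {q!r}"


async def main():
    rag = PipelineDemo()
    await rag.initialize_storages()  # required before any insert/query
    return await rag.aquery("init")

print(asyncio.run(main()))
```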
chaohuang-ai
20c05a7e77 Update README.md 2025-06-07 00:58:56 +08:00