Commit graph

54 commits

Author SHA1 Message Date
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where entity and relationship descriptions were forcibly
truncated mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach lets LLMs generate complete descriptions while
still summarizing when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
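The fragment-count trigger described in commit ccc2a20071 can be illustrated with a minimal sketch. Only the `FORCE_LLM_SUMMARY_ON_MERGE` name and its default of 4 come from the commit message. The function names, the `summarize` callback, the newline separator, and the strict "greater than" comparison are assumptions made for illustration and may differ from the project's actual code.

```python
import os
from typing import Callable, List, Optional

# Default taken from the commit message (lowered from 6 to 4).
DEFAULT_FORCE_LLM_SUMMARY_ON_MERGE = 4


def summary_threshold() -> int:
    """Read the threshold from the environment, falling back to the default."""
    return int(
        os.getenv("FORCE_LLM_SUMMARY_ON_MERGE", str(DEFAULT_FORCE_LLM_SUMMARY_ON_MERGE))
    )


def merge_descriptions(
    fragments: List[str],
    summarize: Callable[[List[str]], str],  # hypothetical LLM summarizer callback
    threshold: Optional[int] = None,
    separator: str = "\n",  # hypothetical separator, not taken from the source
) -> str:
    """Join fragments as-is, or summarize them once there are too many.

    No token-based truncation is applied: the text is either kept whole
    or handed to the summarizer in full.
    """
    if threshold is None:
        threshold = summary_threshold()
    # Assumption: "exceeds" is read as a strict comparison.
    if len(fragments) > threshold:
        return summarize(fragments)
    return separator.join(fragments)
```

With a threshold of 4, merging two fragments simply joins them, while merging five fragments calls the summarizer once with all five.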
yangdx
b03bb48e24 feat: Refine summary logic and add dedicated Ollama num_ctx config
- Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation.
- Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible.
- Updated `README` files, `env.example`, and default values to reflect these changes.
2025-07-14 01:55:04 +08:00
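As a rough illustration of the `OLLAMA_NUM_CTX` setting introduced in commit b03bb48e24, the sketch below reads the variable and turns it into an options dictionary carrying `num_ctx`. The helper name `ollama_options` and the fallback value of 8192 are hypothetical placeholders, not values taken from the project.

```python
import os


def ollama_options(default_num_ctx: int = 8192) -> dict:
    """Build an options dict with the context window size for an Ollama model.

    The fallback of 8192 is a placeholder for this sketch, not the project's default.
    """
    raw = os.getenv("OLLAMA_NUM_CTX")
    num_ctx = int(raw) if raw else default_num_ctx
    if num_ctx <= 0:
        raise ValueError("OLLAMA_NUM_CTX must be a positive integer")
    # The context window is configured on its own, separately from MAX_TOKENS.
    return {"num_ctx": num_ctx}
```

Keeping the context window in its own variable means that changing the summary input limit does not silently resize the model's context.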
yangdx
9aa2ed0837 Merge branch 'main' into rerank 2025-07-09 15:33:39 +08:00
yangdx
e457374224 Fix linting 2025-07-09 15:33:05 +08:00
yangdx
bfa0844ecb Update README 2025-07-09 15:17:05 +08:00
yangdx
5d4484882a Merge branch 'main' into rerank 2025-07-09 03:59:04 +08:00
zrguo
04a57445da update chunks truncation method 2025-07-08 13:31:05 +08:00
frankj
9a9674d590 Fix incorrect file path (404 Not Found)
Issue Description
A 404 error occurred when accessing the repository link pointing to README_zh.md. Upon inspection, the actual file path is README-zh.md, indicating an incorrect path reference in the original link.

Fix Details
Corrected the broken link from README_zh.md to the correct path README-zh.md.

Verification Method
After modification, the target file opens normally in the browser.

Hope this fix helps users access the Chinese documentation properly—thanks for the review!
2025-07-08 10:24:19 +08:00
yangdx
f417118e27 Center banner text dynamically 2025-07-07 17:28:59 +08:00
yangdx
13b2c93eec Update README.md 2025-07-07 16:44:30 +08:00
yangdx
bdfd2d53c7 Fix linting 2025-07-05 11:43:45 +08:00
yangdx
2e2b9f3b48 Refactor setup.py to utilize pyproject.toml for project installation. 2025-07-05 11:19:00 +08:00
zrguo
5b6ac84cd0 Update README 2025-06-26 18:08:21 +08:00
zrguo
afba3df01a Update README 2025-06-26 16:20:39 +08:00
zrguo
fc7a0329df Update README 2025-06-26 16:17:00 +08:00
zrguo
4937de8809 Update 2025-06-22 15:12:09 +08:00
zrguo
03dd99912d RAG-Anything Integration 2025-06-17 01:16:02 +08:00
zrguo
dc97b2b84f Update README.md 2025-06-05 17:51:04 +08:00
zrguo
8a726f6e08 MinerU integration 2025-06-05 17:02:48 +08:00
yangdx
e26a013fc3 Fix linting 2025-05-16 09:28:17 +08:00
yangdx
0a613208c1 Update README 2025-05-16 09:28:08 +08:00
yangdx
b9c25dfeb0 Update README 2025-05-14 14:42:52 +08:00
yangdx
db125c3764 Update README 2025-05-14 11:29:46 +08:00
yangdx
b836d02cac Optimize Ollama LLM driver 2025-05-14 01:13:03 +08:00
yangdx
5a3bf5ecc8 Fix linting 2025-05-11 10:25:59 +08:00
yangdx
4e1caf1e40 Fix linting 2025-05-09 10:43:37 +08:00
yangdx
11fa70f7d1 Update README.md 2025-05-09 10:43:19 +08:00
yangdx
4a03218450 Update README.md 2025-05-08 05:26:59 +08:00
yangdx
9aedf1b38a Update README for QueryParam description 2025-05-05 18:30:49 +08:00
yangdx
7ccc3ffdd7 Fix linting 2025-04-30 14:10:50 +08:00
yangdx
946e55115d Update README 2025-04-30 11:07:38 +08:00
yangdx
c48ead6d8e Update README 2025-04-23 09:43:24 +08:00
yangdx
6716e19d5c Fix linting 2025-04-21 01:22:23 +08:00
yangdx
bd18c9c8ad Update sample code in README.md 2025-04-21 01:22:04 +08:00
yangdx
908953924a Update README 2025-04-21 00:25:26 +08:00
yangdx
dd4f92dae2 Update README.md 2025-04-20 20:33:01 +08:00
yangdx
39540f3f8b Fix linting 2025-04-20 14:33:33 +08:00
yangdx
5f2cd871a8 Update sample code and README 2025-04-20 14:33:16 +08:00
yangdx
48e49fbe34 Merge branch 'drahnreb/add-custom-tokenizer' 2025-04-20 12:22:10 +08:00
yangdx
ea1760c0f6 Update README 2025-04-19 15:59:10 +08:00
drahnreb
20ba1eb9c2 add: to optionally replace default tiktoken Tokenizer with a custom one 2025-04-18 16:24:43 +02:00
yangdx
247be483eb Merge branch 'main' into clear-doc 2025-04-04 05:45:06 +08:00
yangdx
df07c2a8b1 Remove Gremlin storage implementation 2025-04-02 14:43:53 +08:00
yangdx
013be621d5 Remove TiDB storage implementation 2025-04-02 14:40:27 +08:00
choizhang
ad1d362865 docs: Add Token Statistics Function Description in README 2025-04-01 23:50:14 +08:00
yangdx
1e31b26cbe Remove Oracle storage implementation 2025-04-01 18:15:29 +08:00
yangdx
4dcf717e53 Update README.md 2025-03-25 18:53:17 +08:00
yangdx
77889d7846 Update README 2025-03-25 18:50:01 +08:00
yangdx
373484e253 Fix linting 2025-03-25 16:29:55 +08:00
yangdx
48ddfb047e Fix linting 2025-03-25 16:29:37 +08:00