343 commits

Author SHA1 Message Date
yangdx
3c85e4882c Update README 2025-11-20 10:50:02 +08:00
yangdx
cdd53ee875 Remove manual initialize_pipeline_status() calls across codebase
- Auto-init pipeline status in storages
- Remove redundant import statements
- Simplify initialization pattern
- Update docs and examples
2025-11-17 12:54:33 +08:00
yangdx
c12bc372dc Update README 2025-11-09 04:35:41 +08:00
yangdx
7bc6ccea19 Add uv package manager support to installation docs 2025-11-09 04:31:07 +08:00
yangdx
831e658ed8 Update readme 2025-11-06 16:26:07 +08:00
yangdx
5f49cee20f Merge branch 'main' into VOXWAVE-FOUNDRY/main 2025-11-06 15:37:35 +08:00
yangdx
b0d44d283b Add Langfuse observability integration documentation 2025-11-06 10:24:15 +08:00
yangdx
d803df9413 Fix linting 2025-11-05 17:19:58 +08:00
yangdx
451257aed5 Doc: Update news with recent features 2025-11-05 16:58:20 +08:00
yangdx
f610fdaf9b Merge branch 'main' into Anush008/main 2025-10-30 11:07:39 +08:00
yangdx
8af8bd80d2 docs: add frontend build steps to server installation guide 2025-10-29 21:54:47 +08:00
yangdx
14a015d4ad Restore query generation example and fix README path reference
• Fix path from example/ to examples/
• Add generate_query.py implementation
2025-10-29 19:11:40 +08:00
Anush008
8584980e3a refactor: Qdrant Multi-tenancy (Include staged)
Signed-off-by: Anush008 <anushshetty90@gmail.com>
2025-10-26 09:58:24 +05:30
Humphry
0b3d31507e Extended to use Gemini, switched to gemini-flash-latest 2025-10-20 13:17:16 +03:00
yangdx
a5c05f1b92 Add offline deployment support with cache management and layered deps
• Add tiktoken cache downloader CLI
• Add layered offline dependencies
• Add offline requirements files
• Add offline deployment guide
2025-10-11 10:28:14 +08:00
yangdx
1bf802eebf Add AGENTS.md documentation section for AI coding agent guidance 2025-10-10 12:21:35 +08:00
yangdx
2ce6a022ac Fix documentation for user_prompt parameter in QueryParam 2025-09-27 23:41:17 +08:00
yangdx
699ca3ba00 Remove deprecated history_turns and ids parameters from query API endpoint
• Update QueryParam documentation
• Mark history_turns as deprecated
• Clean up splash screen display
• Clarify conversation_history usage
2025-09-25 04:58:57 +08:00
yangdx
7b371309dd Update README 2025-09-15 12:31:39 +08:00
yangdx
4e751e0653 refactor: Enhance extraction with improved prompts and parser
- **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`.
- **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`).
2025-08-31 22:24:11 +08:00
yangdx
de2daf6565 refactor: Rename summary_max_tokens to summary_context_size and add comprehensive parameter validation for summary configuration
- Update algorithm logic in operate.py for better token management
- Fix health endpoint to use correct parameter names
2025-08-26 01:35:50 +08:00
yangdx
49ea9a79a7 Update rerank doc in README 2025-08-23 23:06:10 +08:00
yangdx
16a1ef1178 Update summary_max_tokens default from 10k to 30k tokens 2025-08-21 23:16:07 +08:00
yangdx
8c6b5f4a3a Update README 2025-08-21 18:14:27 +08:00
yangdx
62cdc7d7eb Update documentation with LLM selection guidelines and API improvements 2025-08-21 13:59:14 +08:00
yangdx
0e67ead8fa Rename MAX_TOKENS to SUMMARY_MAX_TOKENS for clarity 2025-08-21 10:15:20 +08:00
yangdx
d5e8f1e860 Update default query parameters for better performance
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00
yangdx
dc7a6e1c5b Update README 2025-08-16 06:15:27 +08:00
yangdx
0b5c708660 Update storage implementation documentation
- Add detailed storage type descriptions
- Remove Chroma from vector storage options
- Include recommended PostgreSQL version
- Add Memgraph to graph storage options
- Update performance comparison notes
2025-08-05 18:03:51 +08:00
yangdx
32af45ff46 refactor: improve JSON parsing reliability with json-repair library
Replace regex-based JSON extraction with json-repair for better handling of malformed LLM responses. Remove deprecated JSON parsing utilities and clean up keyword_extraction parameter across LLM providers.

- Remove locate_json_string_body_from_string() and convert_response_to_json()
- Use json_repair.loads() in extract_keywords_only() for robust parsing
- Clean up LLM interfaces and remove unused parameters
- Add json-repair dependency
2025-08-01 19:36:20 +08:00
yangdx
3c530b21b6 Update README 2025-07-31 13:00:09 +08:00
yangdx
c6bd9f0329 Disable conversation history by default
- Set default history_turns to 0
- Mark history_turns as deprecated
- Remove history_turns from example
- Update documentation comments
2025-07-31 12:28:42 +08:00
yangdx
aba46213a7 Update README 2025-07-30 13:13:59 +08:00
yangdx
9923821d75 refactor: Remove deprecated max_token_size from embedding configuration
This parameter is no longer used. Its removal simplifies the API and clarifies that token length management is handled by upstream text chunking logic rather than the embedding wrapper.
2025-07-29 10:49:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
Ákos Lukács
f115661e16 Fix "A Simple Program" example in README.md
The example should use ainsert and aquery. Fixes #1723
2025-07-22 14:37:15 +02:00
yangdx
80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx
1c53c5c764 Update README.md 2025-07-16 11:10:56 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
yangdx
e8e1f6ab56 feat: centralize environment variable defaults in constants.py 2025-07-15 16:11:50 +08:00
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
zrguo
7c882313bb remove chunk_rerank_top_k 2025-07-15 11:52:34 +08:00
zrguo
4e425b1b59 Revert "update from main"
This reverts commit 1d0376d6a9.
2025-07-14 16:29:00 +08:00
zrguo
1d0376d6a9 update from main 2025-07-14 16:27:49 +08:00
zrguo
c9cbd2d3e0 Merge branch 'main' into rerank 2025-07-14 16:24:29 +08:00
zrguo
ef2115d437 Update token limit 2025-07-14 15:53:48 +08:00
yangdx
b03bb48e24 feat: Refine summary logic and add dedicated Ollama num_ctx config
- Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation.
- Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible.
- Update `README` files, `env.example`, and default values to reflect these changes.
2025-07-14 01:55:04 +08:00
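Note on b03bb48e24: the fragment-count trigger described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `needs_llm_summary`, `merge_descriptions`, and the `summarize` callback are hypothetical names, and reading "exceeds" as strictly greater than is an assumption. The threshold of 4 in the usage lines is the `FORCE_LLM_SUMMARY_ON_MERGE` default named in the later commit ccc2a20071.

```python
from typing import Callable, List


def needs_llm_summary(fragments: List[str], threshold: int) -> bool:
    """Return True when there are more description fragments than the threshold.

    Assumption: "exceeds" is read as strictly greater than.
    """
    return len(fragments) > threshold


def merge_descriptions(
    fragments: List[str],
    summarize: Callable[[List[str]], str],
    threshold: int,
) -> str:
    """Join fragments as-is, or hand them to a summarizer once there are too many.

    `summarize` stands in for an LLM call; it is a hypothetical callback.
    """
    if needs_llm_summary(fragments, threshold):
        return summarize(fragments)
    # Few fragments: keep the full text, with no length-based truncation.
    return " ".join(fragments)


if __name__ == "__main__":
    # Stand-in summarizer that only reports how many fragments it received.
    def fake_summarizer(parts: List[str]) -> str:
        return f"summary of {len(parts)} fragments"

    print(merge_descriptions(["A is a city.", "A hosts B."], fake_summarizer, threshold=4))
    print(merge_descriptions(["x"] * 5, fake_summarizer, threshold=4))
```

The point of the design is that the decision depends on how many fragments were merged, not on character length, so a single long description is never cut or summarized on its own.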
yangdx
9aa2ed0837 Merge branch 'main' into rerank 2025-07-09 15:33:39 +08:00
yangdx
e457374224 Fix linting 2025-07-09 15:33:05 +08:00
yangdx
bfa0844ecb Update README 2025-07-09 15:17:05 +08:00