Commit graph

2833 commits

Author SHA1 Message Date
Daniel.y
444593bda8 Merge pull request #1878 from Ja1aia/main
fix timeout issue
2025-07-30 09:19:46 +08:00
yangdx
29e829113b Fix status key serialization issue in get_track_status 2025-07-30 04:45:48 +08:00
yangdx
30f71c8acf Remove _id field and improve index handling in MongoDB
- Remove MongoDB _id field from documents
- Improve index existence check and creation
2025-07-30 04:17:26 +08:00
yangdx
cfb7117dd6 Fix track_id missing for query in PostgreSQL 2025-07-30 03:44:20 +08:00
yangdx
5ec7eedf37 Bump api version to 0193 2025-07-30 03:11:44 +08:00
yangdx
faa59cac72 Update webui assets 2025-07-30 03:11:19 +08:00
yangdx
cbaede8455 Add ScanResponse type for scan endpoint in webui 2025-07-30 03:11:09 +08:00
yangdx
7207598fc4 Fix track_id bugs and add track_id to scanning response 2025-07-30 03:06:20 +08:00
yangdx
75de799353 Remove deprecated content field from doc status storage
- Remove content field from JSON storage
- Remove content field from MongoDB storage
- Remove content field from Redis storage
2025-07-30 01:00:06 +08:00
yangdx
3ef3b8e155 Update webui assets 2025-07-30 00:06:27 +08:00
yangdx
6f958d5aee feat: add metadata timestamps to document processing and update frontend compatibility
- Add metadata field to doc_status storage with Unix timestamps for processing start/end times
- Update frontend API types: error -> error_msg, add track_id and metadata support
- Add getTrackStatus API method for document tracking functionality
- Fix frontend DocumentManager to use error_msg field for proper error display
- Ensure full compatibility between backend metadata changes and frontend UI
2025-07-30 00:04:27 +08:00
yangdx
93afa7d8a7 feat: add processing time tracking to document status with metadata field
- Add metadata field to DocProcessingStatus with start_time and end_time tracking
- Record processing timestamps using Unix time format (seconds precision)
- Update all storage backends (JSON, MongoDB, Redis, PostgreSQL) for new field support
- Maintain backward compatibility with default values for existing data
- Add error_msg field for better error tracking during document processing
2025-07-29 23:42:33 +08:00
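A minimal sketch of the metadata timestamps described in commit 93afa7d8a7, under stated assumptions: the class name `DocStatusSketch`, the helpers `mark_started` and `mark_finished`, and the status strings are hypothetical and are not taken from the repository. Only the field names `metadata`, `start_time`, `end_time`, and `error_msg` come from the commit message.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocStatusSketch:
    """Hypothetical stand-in for a doc status record (not the real class)."""

    status: str = "pending"  # assumed status name
    error_msg: Optional[str] = None
    # Defaults to an empty dict so records written before the field existed
    # can still be loaded without a metadata value.
    metadata: Dict[str, Any] = field(default_factory=dict)


def mark_started(doc: DocStatusSketch) -> None:
    """Record the processing start as Unix time with seconds precision."""
    doc.status = "processing"  # assumed status name
    doc.metadata["start_time"] = int(time.time())


def mark_finished(doc: DocStatusSketch, error: Optional[str] = None) -> None:
    """Record the processing end and keep any error message."""
    doc.metadata["end_time"] = int(time.time())
    doc.error_msg = error
    doc.status = "failed" if error else "processed"  # assumed status names
```

Using `default_factory=dict` gives each record its own dictionary, which is one way to keep older data loadable with a default value.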
yangdx
7206c07468 Remove deprecated content field from doc status
- Drop content column from LIGHTRAG_DOC_STATUS
- Clean up doc status handling code
- Maintain backward compatibility
2025-07-29 23:19:36 +08:00
yangdx
1e1adcb64a Add index on track_id column in doc status table of PostgreSQL 2025-07-29 23:03:09 +08:00
yangdx
6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
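As an illustration of the prefixed track_id generation mentioned in commit 6014b9bf73, here is a rough sketch. The function name `make_track_id` and the timestamp-plus-random-suffix layout are assumptions made for this example; the commit message only states that generated IDs carry an `upload_` or `insert_` prefix.

```python
import uuid
from datetime import datetime, timezone

# Prefixes named in the commit message.
ALLOWED_PREFIXES = ("upload", "insert")


def make_track_id(prefix: str) -> str:
    """Build an ID such as 'upload_20250729_222421_1a2b3c4d' (assumed layout)."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unknown track_id prefix: {prefix!r}")
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]  # short random part to separate same-second IDs
    return f"{prefix}_{timestamp}_{suffix}"
```

A client could then poll the `/track_status/{track_id}` endpoint with the returned value to follow processing progress.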
yangdx
dafdf92715 Remove content fallback logic in get_docs_by_status from Redis 2025-07-29 19:13:07 +08:00
yangdx
40a4cacee0 Merge branch 'main' into remove-content-from-doc-status 2025-07-29 16:15:01 +08:00
yangdx
92bbb7a1b3 Remove content fallback and standardize doc status handling
- Remove content_summary fallback logic
- Standardize doc status processing
- Handle missing file_path consistently
2025-07-29 16:13:51 +08:00
yangdx
24c36d876c Remove content field from DocProcessingStatus, update MongoDB and PostgreSQL implementation 2025-07-29 14:52:45 +08:00
administrator
9c3e1505b5 fix timeout issue 2025-07-29 13:38:46 +07:00
yangdx
8274ed52d1 feat: separate document content from doc_status to improve performance
This optimization significantly improves doc_status query/update performance by avoiding large string operations during frequent status checks.
2025-07-29 14:20:07 +08:00
administrator
c26dfa33de Fix: corrected unterminated f-string in config.py 2025-07-29 11:21:23 +07:00
yangdx
9923821d75 refactor: Remove deprecated max_token_size from embedding configuration
This parameter is no longer used. Its removal simplifies the API and clarifies that token length management is handled by upstream text chunking logic rather than the embedding wrapper.
2025-07-29 10:49:35 +08:00
yangdx
f4c2dc327d Fix linting 2025-07-29 09:57:41 +08:00
yangdx
75d1b1e9f8 Update Ollama context length configuration
- Rename OLLAMA_NUM_CTX to OLLAMA_LLM_NUM_CTX
- Increase default context window size
- Add requirement for minimum context size
- Update documentation examples
2025-07-29 09:53:37 +08:00
yangdx
645f81f7c8 Fix a critical bug where Ollama options were not being applied correctly
`dict.update()` modifies the dictionary in place and returns `None`.
2025-07-29 09:52:25 +08:00
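The pitfall named in commit 645f81f7c8 can be reproduced in a few lines. This is a generic sketch, not the repository's code: the function names are hypothetical, and the option keys are used only as sample data.

```python
from typing import Any, Dict, Optional


def merge_options_buggy(
    defaults: Dict[str, Any], overrides: Dict[str, Any]
) -> Optional[Dict[str, Any]]:
    # BUG: dict.update() mutates `defaults` in place and returns None,
    # so the caller receives None instead of the merged options.
    return defaults.update(overrides)


def merge_options(
    defaults: Dict[str, Any], overrides: Dict[str, Any]
) -> Dict[str, Any]:
    # Copy first, update the copy, then return the copy itself.
    merged = dict(defaults)
    merged.update(overrides)
    return merged
```

Returning the copied dictionary also leaves the caller's `defaults` untouched, which avoids options leaking between calls.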
Michele Comitini
bd94714b15 Options need to be passed to the ollama client embed() method
Fix line length

Create binding_options.py

Remove test property

Add dynamic binding options to CLI and environment config

Automatically generate command-line arguments and environment variable
support for all LLM provider bindings using BindingOptions. Add sample
.env generation and extensible framework for new providers.

Add example option definitions and fix test arg check in OllamaOptions

Add options_dict method to BindingOptions for argument parsing

Add comprehensive Ollama binding configuration options

Apply ruff formatting to binding_options.py

Add Ollama separate options for embedding and LLM

Refactor Ollama binding options and fix class var handling

The changes improve how class variables are handled in binding options
and better organize the Ollama-specific options into LLM and embedding
subclasses.

Fix typo in arg test.

Rename cls parameter to klass to avoid keyword shadowing

Fix Ollama embedding binding name typo

Fix ollama embedder context param name

Split Ollama options into LLM and embedding configs with mixin base

Add Ollama option configuration to LLM and embeddings in lightrag_server

Update sample .env generation and environment handling

Conditionally add env vars and cmdline options only when ollama bindings
are used. Add example env file for Ollama binding options.
2025-07-28 12:05:40 +02:00
yangdx
ee53e43568 Update webui assets 2025-07-28 02:52:32 +08:00
yangdx
769f77ef8f Update webui assets 2025-07-28 02:26:07 +08:00
yangdx
98ac6fb3f0 Bump api version to 0192 2025-07-28 01:42:51 +08:00
yangdx
f2ffff063b feat: refactor ollama server configuration management
- Add ollama_server_infos attribute to LightRAG class with default initialization
- Move default values to constants.py for centralized configuration
- Refactor OllamaServerInfos class with property accessors and CLI support
- Update OllamaAPI to get configuration through rag object instead of direct import
- Add command line arguments for simulated model name and tag
- Fix type imports to avoid circular dependencies
2025-07-28 01:38:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
d0d57a45b6 feat: add environment variables to /health endpoint and centralize defaults
- Add 9 environment variables to /health endpoint configuration section
- Centralize default constants in lightrag/constants.py for consistency
- Update config.py to use centralized defaults for better maintainability
2025-07-28 00:30:56 +08:00
yangdx
9c4e98ec3b Unify entity extraction prompt between passes
- Disallow hallucinated info in descriptions
- Align reminder steps with main extraction
2025-07-27 23:06:55 +08:00
Daniel.y
4eef9f3778 Merge pull request #1845 from AkosLukacs/patch-2
Better prompt for entity description extraction to avoid hallucinations
2025-07-27 22:38:08 +08:00
yangdx
3951a44666 Revert file_path build method, built from related chunks 2025-07-27 21:56:20 +08:00
yangdx
d70c584d80 Bump api version to 0191 2025-07-27 21:24:53 +08:00
yangdx
f2d051eea5 Fix: Improve keyword extraction prompt for robust JSON output.
*   Emphasize strict JSON output in keyword extraction prompt
*   Clean up prompt examples in keyword extraction prompt
*   Log raw LLM response on JSON error
2025-07-27 21:10:47 +08:00
yangdx
3f5ade47cd Update README 2025-07-27 17:26:49 +08:00
yangdx
e09929b42e Refine rerank filtering log message for clarity 2025-07-27 16:57:38 +08:00
yangdx
f4bca7bfb2 Fix linting 2025-07-27 16:50:45 +08:00
yangdx
a9565d7379 feat: Skip rerank filtering when min_rerank_score is 0.0 2025-07-27 16:50:12 +08:00
yangdx
ebaff228aa feat: Add rerank score filtering with configurable threshold
- Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0)
- Add MIN_RERANK_SCORE environment variable support
- Filter chunks with rerank scores below threshold in process_chunks_unified
- Add info-level logging for filtering operations
- Handle empty results gracefully after filtering
- Maintain backward compatibility with non-reranked chunks
2025-07-27 16:37:44 +08:00
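The filtering behaviour described in commits ebaff228aa and a9565d7379 could look roughly like the sketch below. The function name `filter_by_rerank_score` and the `rerank_score` dictionary key are assumptions for illustration; only `DEFAULT_MIN_RERANK_SCORE` and its default of 0.0 come from the commit message. Keeping chunks that carry no score is one reading of "backward compatibility with non-reranked chunks".

```python
from typing import Any, Dict, List

DEFAULT_MIN_RERANK_SCORE = 0.0  # constant name and default from the commit message


def filter_by_rerank_score(
    chunks: List[Dict[str, Any]],
    min_rerank_score: float = DEFAULT_MIN_RERANK_SCORE,
) -> List[Dict[str, Any]]:
    """Drop chunks whose rerank score is below the threshold."""
    # A threshold of 0.0 disables filtering entirely.
    if min_rerank_score <= 0.0:
        return list(chunks)
    kept = []
    for chunk in chunks:
        score = chunk.get("rerank_score")  # assumed key name
        # Chunks without a score were never reranked, so they are kept.
        if score is None or score >= min_rerank_score:
            kept.append(chunk)
    return kept
```

An empty result is returned as an empty list, so callers can handle the case where every chunk is filtered out without special error handling.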
yangdx
99e3812c38 refactor: unify file_path handling across merge and rebuild functions
- Replace simple string concatenation with build_file_path() in:
  - _merge_edges_then_upsert
  - _rebuild_single_entity
  - _rebuild_single_relationship
- Ensures consistent deduplication, length limiting, and error handling
- Aligns with existing _merge_nodes_then_upsert implementation
2025-07-27 12:37:24 +08:00
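Commit 99e3812c38 replaces plain string concatenation with a helper that deduplicates and length-limits file paths. The sketch below shows one way such a helper might work. It is not the repository's `build_file_path()`: the name `build_file_path_sketch`, its signature, the `<SEP>` separator, and the 4090 limit are all assumptions chosen for the example.

```python
from typing import Iterable


def build_file_path_sketch(
    paths: Iterable[str],
    max_length: int = 4090,  # assumed limit
    sep: str = "<SEP>",      # assumed separator
) -> str:
    """Join unique paths in first-seen order without exceeding max_length."""
    result = []
    seen = set()
    total = 0
    for path in paths:
        # Skip empty values and paths that were already added.
        if not path or path in seen:
            continue
        extra = len(path) + (len(sep) if result else 0)
        # Stop before the joined string would grow past the limit.
        if total + extra > max_length:
            break
        result.append(path)
        seen.add(path)
        total += extra
    return sep.join(result)
```

Compared with simple concatenation, this keeps the stored field bounded and free of repeated entries regardless of how many chunks reference the same file.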
yangdx
cf1ca39b3f Refine entity continuation prompt to avoid duplicates.
- Clarify finding missing entities
- Instruct not to repeat extractions
2025-07-27 10:48:29 +08:00
yangdx
0dfbce0bb4 Update the README to clarify the explanation of concurrent processes. 2025-07-27 10:39:28 +08:00
yangdx
055629d30d Reduce default max total tokens to 30k 2025-07-27 10:33:06 +08:00
yangdx
a67f93acc9 Replace hardcoded max tokens with DEFAULT_MAX_TOTAL_TOKENS constant
- Use constant in process_chunks_unified
- Update WebUI default to match (32000)
2025-07-26 11:23:54 +08:00
yangdx
7b915b34f6 Refactor: move build_file_path function from operate.py to utils.py 2025-07-26 10:52:59 +08:00
yangdx
c8c3545454 refactor: extract file path length limit to shared constant
• Add DEFAULT_MAX_FILE_PATH_LENGTH constant
• Replace hardcoded 4090 in Milvus impl
2025-07-26 10:45:03 +08:00