ragflow/rag/nlp
alulala d9266ed65a
Fix: incorrect total chunks count in retrieval function after similarity filtering (#6741) (#6932)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6741

### Environment:
Nightly version, commit [6051abb](6051abb4a3)

### Bug Description:
The retrieval function in rag/nlp/search.py returns the original total
chunk count even after chunks are filtered by similarity_threshold. As a
result, the reported total is inconsistent with the chunks actually
returned.

### Changes Made:
- Added code to count how many search results meet or exceed the
  configured similarity threshold.
- Positioned the calculation after the doc_ids conditional logic so that
  special cases are handled correctly.
- Updated ranks["total"] to store this filtered count instead of the raw
  search result count.
- Used NumPy for the count to take advantage of its optimized C-level
  batch operations.
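
The counting step described above can be sketched roughly as follows. This is an illustration, not the actual patch: the helper name `count_above_threshold` and the sample scores are hypothetical, and only `similarity_threshold` and `ranks["total"]` come from the description.

```python
import numpy as np


def count_above_threshold(sim, similarity_threshold):
    """Count the scores that meet or exceed the threshold.

    Hypothetical helper for illustration; `sim` is assumed to be a
    sequence of per-chunk similarity scores.
    """
    sim_np = np.array(sim, dtype=float)
    # The comparison yields a boolean array; summing it counts the True entries.
    return int(np.sum(sim_np >= similarity_threshold))


# Assumed sample data: four candidate chunks, two of which pass a 0.5 threshold.
ranks = {"total": 0, "chunks": []}
sim = [0.91, 0.42, 0.20, 0.75]
ranks["total"] = count_above_threshold(sim, similarity_threshold=0.5)
```

Converting the result to a plain `int` keeps `ranks["total"]` serializable, since NumPy integer types are not accepted by the standard `json` module.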
2025-04-11 12:31:36 +08:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Feat: text file support position retaining. (#6231) | 2025-03-18 16:55:11 +08:00 |
| `query.py` | Refa: token similarity calculations. (#6614) | 2025-03-28 09:33:08 +08:00 |
| `rag_tokenizer.py` | Fix infinite recursion in RagTokenizer when processing repetitive characters (#6109) | 2025-04-01 13:59:52 +08:00 |
| `search.py` | Fix: incorrect total chunks count in retrieval function after similarity filtering (#6741) (#6932) | 2025-04-11 12:31:36 +08:00 |
| `surname.py` | | |
| `synonym.py` | Fix too many clause while searching. (#5119) | 2025-02-19 13:18:39 +08:00 |
| `term_weight.py` | Fix errors detected by Ruff (#3918) | 2024-12-08 14:21:12 +08:00 |