ragflow/rag
alulala d9266ed65a
Fix: incorrect total chunks count in retrieval function after similarity filtering (#6741) (#6932)
### Related Issue:
https://github.com/infiniflow/ragflow/issues/6741

### Environment:
Nightly version, commit [6051abb](6051abb4a3)

### Bug Description:
The retrieval function in rag/nlp/search.py returns the original total
chunk count even after chunks are filtered by similarity_threshold.
This creates an inconsistency between the chunks actually returned and
the reported total.

### Changes Made:
- Added code to count how many search results meet or exceed the
  configured similarity threshold.
- Positioned the calculation after the doc_ids conditional logic so that
  special cases are handled correctly.
- Updated ranks["total"] to store this filtered count instead of the raw
  search result count.
- Used NumPy for the count, so the comparison runs as a vectorized batch
  operation in optimized C code rather than in a Python loop.
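
The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in rag/nlp/search.py: the function names `count_above_threshold` and `build_ranks`, their parameters, and the shape of the returned dictionary are assumptions made for the example. Only `similarity_threshold` and `ranks["total"]` come from the description.

```python
import numpy as np


def count_above_threshold(similarities, similarity_threshold):
    """Count scores that meet or exceed the threshold (hypothetical helper)."""
    sim = np.asarray(similarities, dtype=float)
    # The comparison yields a boolean array in one vectorized pass;
    # count_nonzero then tallies the True entries.
    return int(np.count_nonzero(sim >= similarity_threshold))


def build_ranks(chunks, similarities, similarity_threshold):
    """Return the kept chunks and a total that matches them (hypothetical)."""
    sim = np.asarray(similarities, dtype=float)
    mask = sim >= similarity_threshold
    kept = [chunk for chunk, keep in zip(chunks, mask) if keep]
    # Report the filtered count, not len(chunks), so that "total"
    # agrees with the chunks that are actually returned.
    return {"total": int(np.count_nonzero(mask)), "chunks": kept}
```

With scores `[0.9, 0.5, 0.1, 0.2]` and a threshold of `0.2`, three chunks pass the filter, so the reported total would be 3 rather than the raw count of 4.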
2025-04-11 12:31:36 +08:00

| Name | Last commit | Date |
| --- | --- | --- |
| .. | | |
| app | Fix: docx image exceptions. (#6839) | 2025-04-07 12:33:34 +08:00 |
| llm | Fix: local variable referenced before assignment (#6909) | 2025-04-09 20:29:12 +08:00 |
| nlp | Fix: incorrect total chunks count in retrieval function after similarity filtering (#6741) (#6932) | 2025-04-11 12:31:36 +08:00 |
| res | Format file format from Windows/dos to Unix (#1949) | 2024-08-15 09:17:36 +08:00 |
| svr | Optimize graphrag again (#6513) | 2025-03-26 15:34:42 +08:00 |
| utils | Fix set_graph on non-existing edge (#6777) | 2025-04-03 11:09:04 +08:00 |
| __init__.py | Update comments (#4569) | 2025-01-21 20:52:28 +08:00 |
| benchmark.py | Refactor embedding batch_size (#3825) | 2024-12-03 16:22:39 +08:00 |
| prompts.py | Added similarity scores in reference chunks (#6918) | 2025-04-10 19:17:45 +08:00 |
| raptor.py | Refactor graphrag to remove redis lock (#5828) | 2025-03-10 15:15:06 +08:00 |
| settings.py | Fix: optimize setting config initialization to resolve Minio initialization error (#6282) | 2025-03-20 10:45:40 +08:00 |