ragflow/agent/tools
hsparks.codes d104f59e29 feat: Implement hierarchical retrieval architecture (#11610)
This PR implements the three-tier hierarchical retrieval architecture
specified in issue #11610: knowledge base routing, document filtering, and chunk refinement.

## Tier 1: Knowledge Base Routing
- Auto-route queries to relevant knowledge bases
- Per-KB retrieval parameters (KBRetrievalParams dataclass)
- Rule-based routing with keyword overlap scoring
- LLM-based routing with fallback to rule-based
- Configurable routing methods: auto, rule_based, llm_based, all
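
The routing tier above can be illustrated with a minimal sketch. It is not the code from this PR: `KBProfile`, `rule_based_route`, `route`, and the `llm_router` callable are hypothetical names, and the scoring rule (fraction of query tokens found in a KB description) is only one plausible reading of "keyword overlap scoring". It shows the fallback from an LLM router to the rule-based path.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KBProfile:
    """Hypothetical routing profile for one knowledge base (not a RAGFlow type)."""
    kb_id: str
    description: str


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; a real system would likely use a tokenizer.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rule_based_route(query: str, kbs: list[KBProfile], top_k: int = 2) -> list[str]:
    """Rank KBs by the fraction of query tokens found in their description."""
    all_ids = [kb.kb_id for kb in kbs]
    q = _tokens(query)
    if not q:
        return all_ids  # nothing to score on, so search everything
    scored = [(len(q & _tokens(kb.description)) / len(q), kb.kb_id) for kb in kbs]
    hits = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    # If no KB overlaps with the query at all, fall back to every KB.
    return [kb_id for _, kb_id in hits[:top_k]] or all_ids


def route(
    query: str,
    kbs: list[KBProfile],
    method: str = "auto",
    llm_router: Optional[Callable[[str, list[KBProfile]], list[str]]] = None,
    top_k: int = 2,
) -> list[str]:
    """Dispatch on the routing method; LLM failures fall back to rule-based."""
    known = {kb.kb_id for kb in kbs}
    if method == "all":
        return [kb.kb_id for kb in kbs]
    if method in ("auto", "llm_based") and llm_router is not None:
        try:
            chosen = [k for k in llm_router(query, kbs) if k in known]
            if chosen:
                return chosen
        except Exception:
            pass  # assumed policy: any router error triggers the fallback
    return rule_based_route(query, kbs, top_k=top_k)
```

Passing the LLM router as a plain callable keeps the sketch independent of any model client. Discarding IDs the router returns that are not in `known` guards against hallucinated knowledge base names.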

## Tier 2: Document Filtering
- Document-level metadata filtering within selected KBs
- Configurable metadata fields for filtering
- LLM-generated filter conditions
- Metadata similarity matching (fuzzy matching)
- Enhanced metadata generation for documents
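
As a rough illustration of the fuzzy metadata matching listed above, the sketch below filters documents by comparing metadata values with `difflib.SequenceMatcher` from the standard library. The document shape (`id` plus a `meta` dict), the function names, and the 0.8 threshold are assumptions made for this example; the PR may use a different similarity measure.

```python
from difflib import SequenceMatcher
from typing import Any


def fuzzy_match(value: str, target: str, threshold: float = 0.8) -> bool:
    """Case-insensitive similarity test based on SequenceMatcher.ratio()."""
    ratio = SequenceMatcher(None, value.lower(), target.lower()).ratio()
    return ratio >= threshold


def filter_documents(
    docs: list[dict[str, Any]],
    conditions: dict[str, str],
    threshold: float = 0.8,
) -> list[str]:
    """Return ids of docs whose metadata fuzzily satisfies every condition.

    Each doc is assumed to look like {"id": ..., "meta": {field: value}}.
    A doc that lacks a requested field is filtered out.
    """
    kept = []
    for doc in docs:
        meta = doc.get("meta", {})
        if all(
            field in meta and fuzzy_match(str(meta[field]), str(wanted), threshold)
            for field, wanted in conditions.items()
        ):
            kept.append(doc["id"])
    return kept
```

In this sketch the conditions would come from either user configuration or an LLM-generated filter; an empty condition set keeps every document.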

## Tier 3: Chunk Refinement
- Parent-child chunking with summary mapping
- Custom prompts for keyword extraction
- LLM-based question generation for chunks
- Integration with existing retrieval pipeline
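
One common way to realise parent-child chunking is to index small child chunks for matching and return their larger parent chunks as context. The sketch below assumes that reading and uses fixed-size character windows; the `Chunk` type, the sizes, and the function names are hypothetical, and the summary mapping mentioned above is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    chunk_id: str
    text: str
    parent_id: Optional[str] = None  # None marks a parent chunk


def build_parent_child(
    doc_id: str, text: str, parent_size: int = 200, child_size: int = 50
) -> tuple[list[Chunk], list[Chunk]]:
    """Split text into parent windows, then split each parent into children."""
    parents: list[Chunk] = []
    children: list[Chunk] = []
    for p_idx, p_start in enumerate(range(0, len(text), parent_size)):
        p_text = text[p_start:p_start + parent_size]
        parent = Chunk(f"{doc_id}-p{p_idx}", p_text)
        parents.append(parent)
        for c_idx, c_start in enumerate(range(0, len(p_text), child_size)):
            children.append(
                Chunk(
                    f"{parent.chunk_id}-c{c_idx}",
                    p_text[c_start:c_start + child_size],
                    parent.chunk_id,
                )
            )
    return parents, children


def expand_to_parents(
    hit_child_ids: list[str], children: list[Chunk], parents: list[Chunk]
) -> list[Chunk]:
    """Map matched child ids to de-duplicated parents, preserving hit order."""
    child_to_parent = {c.chunk_id: c.parent_id for c in children}
    by_id = {p.chunk_id: p for p in parents}
    seen: set[str] = set()
    out: list[Chunk] = []
    for cid in hit_child_ids:
        pid = child_to_parent.get(cid)
        if pid is not None and pid not in seen:
            seen.add(pid)
            out.append(by_id[pid])
    return out
```

De-duplicating parents matters because several children of the same parent often match one query, and returning the parent once keeps the context window from filling with repeats.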

## Metadata Management (Batch CRUD)
- MetadataService with batch operations:
  - batch_get_metadata
  - batch_update_metadata
  - batch_delete_metadata_fields
  - batch_set_metadata_field
  - get_metadata_schema
  - search_by_metadata
  - get_metadata_statistics
  - copy_metadata
- REST API endpoints in metadata_app.py
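
To make the batch semantics concrete, here is an in-memory stand-in for a few of the operations named above. Only the method names come from this commit message; the signatures, return values, and storage are assumptions, so this should not be read as the `MetadataService` API.

```python
from collections import defaultdict
from typing import Any


class InMemoryMetadataStore:
    """Hypothetical stand-in; the real MetadataService signatures may differ."""

    def __init__(self) -> None:
        self._meta: dict[str, dict[str, Any]] = defaultdict(dict)

    def batch_get_metadata(self, doc_ids: list[str]) -> dict[str, dict[str, Any]]:
        # Unknown ids map to an empty dict instead of raising.
        return {d: dict(self._meta.get(d, {})) for d in doc_ids}

    def batch_update_metadata(self, doc_ids: list[str], updates: dict[str, Any]) -> int:
        """Merge the same updates into every listed document."""
        for d in doc_ids:
            self._meta[d].update(updates)
        return len(doc_ids)

    def batch_delete_metadata_fields(self, doc_ids: list[str], fields: list[str]) -> int:
        """Remove fields where present; return how many values were removed."""
        removed = 0
        for d in doc_ids:
            meta = self._meta.get(d, {})
            for f in fields:
                if f in meta:
                    del meta[f]
                    removed += 1
        return removed

    def search_by_metadata(self, filters: dict[str, Any]) -> list[str]:
        """Exact-match search; returns sorted ids matching every filter."""
        return sorted(
            d for d, m in self._meta.items()
            if all(m.get(k) == v for k, v in filters.items())
        )
```

Returning counts from the mutating calls is a design choice of this sketch; it gives a caller such as a REST handler something to report without a second read.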

## Integration
- HierarchicalConfig dataclass for configuration
- Integrated into Dealer class (search.py)
- Wired into agent retrieval tool
- Non-breaking: disabled by default
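
The "disabled by default" behaviour can be sketched as a config object whose default leaves the existing flat search path untouched. `HierarchicalConfigSketch`, its fields, and the `retrieve`, `flat_search`, and `router` names are all hypothetical; the actual `HierarchicalConfig` fields are not shown in this commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class HierarchicalConfigSketch:
    """Hypothetical config; field names are assumptions for illustration."""
    enabled: bool = False          # off by default keeps existing behaviour
    routing_method: str = "auto"   # one of: auto, rule_based, llm_based, all
    metadata_fields: list[str] = field(default_factory=list)


def retrieve(
    query: str,
    kb_ids: list[str],
    flat_search: Callable[[str, list[str]], list[str]],
    config: Optional[HierarchicalConfigSketch] = None,
    router: Optional[Callable[[str, list[str]], list[str]]] = None,
) -> list[str]:
    """Use the flat search unchanged unless hierarchical mode is enabled."""
    config = config or HierarchicalConfigSketch()
    if not config.enabled or router is None:
        return flat_search(query, kb_ids)
    # Tier 1 narrows the KB set; an empty routing result falls back to all KBs.
    selected = router(query, kb_ids) or kb_ids
    return flat_search(query, selected)
```

Callers that never pass a config get exactly the previous behaviour, which is what makes the change non-breaking in this sketch.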

## Tests
- 48 unit tests covering all components
- Tests for config, routing, filtering, and metadata operations
2025-12-09 07:32:00 +01:00
__init__.py Feat: Add thought info to every component. (#9134) 2025-07-31 15:13:45 +08:00
akshare.py
arxiv.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
base.py Refa: cleanup synchronous functions in agent_with_tools (#11736) 2025-12-04 14:15:05 +08:00
code_exec.py Fix typos (#11607) 2025-12-01 09:49:46 +08:00
crawler.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
deepl.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
duckduckgo.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
email.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
exesql.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
github.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
google.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
googlescholar.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
jin10.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
pubmed.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
qweather.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
retrieval.py feat: Implement hierarchical retrieval architecture (#11610) 2025-12-09 07:32:00 +01:00
searxng.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
tavily.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
tushare.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
wencai.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
wikipedia.py Feat: add mechanism to check cancellation in Agent (#10766) 2025-11-11 17:36:48 +08:00
yahoofinance.py Fix errors (#11804) 2025-12-08 12:21:18 +08:00