Commit graph

7 commits

Author SHA1 Message Date
hsparks-codes
075d0e2230
Merge 0bf86c7a56 into fd7e55b23d 2025-12-10 16:21:23 +11:00
buua436
af1344033d
Delete:remove unused tests (#11749)
### What problem does this PR solve?

change:
remove unused tests
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-04 18:49:32 +08:00
hsparks.codes
c81ca967e6 test: Add integration tests and explain testing strategy
Response to @KevinHuSh's review question about mocks.

Added:
- Integration tests (10 tests) with real CheckpointService and database
- Documentation explaining unit tests vs integration tests
- Real-world resume scenario test
- Comments in unit tests explaining mock usage

Integration tests cover:
- Actual database operations
- Complete checkpoint lifecycle
- Resume from crash scenario
- Retry logic with real state
- Progress calculation with persistence

Unit tests (mocked) remain for:
- Fast CI/CD feedback (0.04s)
- Interface validation
- No database dependencies

Both test types are valuable and complement each other.
2025-12-03 10:38:13 +01:00
hsparks-codes
23d1a9f05b
Merge branch 'main' into feature/checkpoint-resume 2025-12-03 04:14:36 -05:00
hsparks-codes
237a66913b
Feat: RAG evaluation (#11674)
### What problem does this PR solve?

Feature: This PR implements a comprehensive RAG evaluation framework to
address issue #11656.

**Problem**: Developers using RAGFlow lack systematic ways to measure
RAG accuracy and quality. They cannot objectively answer:
1. Are RAG results truly accurate?
2. How should configurations be adjusted to improve quality?
3. How to maintain and improve RAG performance over time?

**Solution**: This PR adds a complete evaluation system with:
- **Dataset & test case management** - Create ground truth datasets with
questions and expected answers
- **Automated evaluation** - Run RAG pipeline on test cases and compute
metrics
- **Comprehensive metrics** - Precision, recall, F1 score, MRR, hit rate
for retrieval quality
- **Smart recommendations** - Analyze results and suggest specific
configuration improvements (e.g., "increase top_k", "enable reranking")
- **20+ REST API endpoints** - Full CRUD operations for datasets, test
cases, and evaluation runs

**Impact**: Enables developers to objectively measure RAG quality,
identify issues, and systematically improve their RAG systems through
data-driven configuration tuning.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-12-03 17:00:58 +08:00
hsparks.codes
b293dc691d fix: Remove unused imports and variables in checkpoint tests
- Remove unused MagicMock import
- Remove unused datetime import
- Remove unused checkpoint variables in integration tests
- All 22 tests still passing
- Ruff linting now passes
2025-12-03 09:33:51 +01:00
hsparks.codes
4c6eecaa46 feat: Add API endpoints and comprehensive tests (Phase 3 & 4)
Phase 3 - API Endpoints:
- Create task_app.py with 5 REST API endpoints
  - POST /api/v1/task/{task_id}/pause - Pause running task
  - POST /api/v1/task/{task_id}/resume - Resume paused task
  - POST /api/v1/task/{task_id}/cancel - Cancel task
  - GET /api/v1/task/{task_id}/checkpoint-status - Get detailed status
  - POST /api/v1/task/{task_id}/retry-failed - Retry failed documents
- Full error handling and validation
- Proper authentication with @login_required
- Comprehensive logging

Phase 4 - Testing:
- Create test_checkpoint_service.py with 22 unit tests
- Test coverage:
   Checkpoint creation (2 tests)
   Document state management (4 tests)
   Pause/resume/cancel operations (5 tests)
   Retry logic (3 tests)
   Progress tracking (2 tests)
   Integration scenarios (3 tests)
   Edge cases (3 tests)
- All 22 tests passing 

Documentation:
- Usage examples and API documentation
- Performance impact analysis
2025-12-03 09:19:26 +01:00