ragflow/personal_analyze/01-API-LAYER
Claude a6ee18476d
docs: Add detailed backend module analysis documentation
Add comprehensive documentation covering 6 modules:
- 01-API-LAYER: Authentication, routing, SSE streaming
- 02-SERVICE-LAYER: Dialog, Task, LLM service analysis
- 03-RAG-ENGINE: Hybrid search, embedding, reranking
- 04-AGENT-SYSTEM: Canvas engine, components, tools
- 05-DOCUMENT-PROCESSING: Task executor, PDF parsing
- 06-ALGORITHMS: BM25, fusion, RAPTOR

Total 28 documentation files with code analysis, diagrams, and formulas.
2025-11-26 11:10:54 +00:00
..
authentication_flow.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00
canvas_app_analysis.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00
conversation_app_analysis.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00
document_app_analysis.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00
README.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00
request_lifecycle.md docs: Add detailed backend module analysis documentation 2025-11-26 11:10:54 +00:00

01-API-LAYER - API Gateway & Request Handling

Tổng Quan

API Layer là tầng xử lý HTTP requests của RAGFlow, được xây dựng trên Quart (async Flask-compatible framework) với kiến trúc Blueprint-based modular.

Kiến Trúc Tổng Quan

┌─────────────────────────────────────────────────────────────────────────┐
│                            CLIENT REQUEST                                │
│                    (Web App / SDK / External API)                        │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           NGINX (Reverse Proxy)                          │
│                         Port 80/443 → Port 9380                          │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         QUART ASGI SERVER                                │
│                                                                          │
│  ┌─────────────────────────────────────────────────────────────────────┐│
│  │                        MIDDLEWARE STACK                              ││
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐               ││
│  │  │  CORS   │→ │ Session │→ │  Auth   │→ │  JSON   │               ││
│  │  │ Handler │  │ Manager │  │  Check  │  │ Encoder │               ││
│  │  └─────────┘  └─────────┘  └─────────┘  └─────────┘               ││
│  └─────────────────────────────────────────────────────────────────────┘│
│                                 │                                        │
│  ┌─────────────────────────────┴─────────────────────────────────────┐  │
│  │                      BLUEPRINT ROUTER                              │  │
│  │                                                                    │  │
│  │  /api/v1/kb/*        → kb_app.py                                  │  │
│  │  /api/v1/document/*  → document_app.py                            │  │
│  │  /api/v1/dialog/*    → dialog_app.py (legacy)                     │  │
│  │  /v1/conversation/*  → conversation_app.py                        │  │
│  │  /v1/canvas/*        → canvas_app.py                              │  │
│  │  /api/v1/file/*      → file_app.py                                │  │
│  │  /v1/user/*          → user_app.py                                │  │
│  │  /v1/llm/*           → llm_app.py                                 │  │
│  │  /api/v1/sdk/*       → sdk/*.py                                   │  │
│  │                                                                    │  │
│  └────────────────────────────────────────────────────────────────────┘  │
│                                 │                                        │
└─────────────────────────────────┼────────────────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          SERVICE LAYER                                   │
│              (DialogService, DocumentService, KBService, ...)            │
└─────────────────────────────────────────────────────────────────────────┘

Cấu Trúc Thư Mục

/api/
├── ragflow_server.py          # Entry point - Server initialization
├── apps/
│   ├── __init__.py            # Blueprint registration & middleware
│   ├── document_app.py        # Document upload/management (708 lines)
│   ├── conversation_app.py    # Chat/conversation API (419 lines)
│   ├── canvas_app.py          # Agent workflow API (609 lines)
│   ├── kb_app.py              # Knowledge base management (699 lines)
│   ├── file_app.py            # File operations (454 lines)
│   ├── dialog_app.py          # Legacy dialog API
│   ├── chunk_app.py           # Chunk management
│   ├── search_app.py          # Search operations
│   ├── llm_app.py             # LLM configuration
│   ├── user_app.py            # User management
│   ├── tenant_app.py          # Multi-tenancy
│   ├── system_app.py          # System configuration
│   ├── connector_app.py       # Data source connectors
│   ├── mcp_server_app.py      # MCP integration
│   ├── auth/                  # Authentication modules
│   │   ├── oauth.py           # OAuth base client
│   │   ├── github.py          # GitHub OAuth
│   │   └── oidc.py            # OpenID Connect
│   └── sdk/                   # SDK API endpoints
│       ├── dataset.py
│       ├── doc.py
│       ├── chat.py
│       └── ...
├── db/
│   ├── db_models.py           # Database models
│   └── services/              # Business logic services
└── utils/
    ├── api_utils.py           # API utilities
    └── validation.py          # Request validation

Files Trong Module Này

File Mô Tả
document_app_analysis.md Phân tích Document Upload/Management API
conversation_app_analysis.md Phân tích Chat/Conversation API với SSE
canvas_app_analysis.md Phân tích Agent Workflow API
authentication_flow.md Chi tiết JWT/OAuth authentication
request_lifecycle.md Lifecycle của HTTP request

Key Concepts

1. Application Factory Pattern

# /api/apps/__init__.py
app = Quart(__name__)
app = cors(app, allow_origin="*")
app.url_map.strict_slashes = False
app.json_encoder = CustomJSONEncoder
app.errorhandler(Exception)(server_error_response)

# Session configuration
app.config["SESSION_TYPE"] = "redis"
app.config["SESSION_REDIS"] = settings.decrypt_database_config(name="redis")
app.config["MAX_CONTENT_LENGTH"] = 1024 * 1024 * 1024  # 1GB max upload

2. Dynamic Blueprint Registration

def register_page(page_path):
    spec = spec_from_file_location(module_name, page_path)
    page = module_from_spec(spec)
    page.app = app
    page.manager = Blueprint(page_name, module_name)
    sys.modules[module_name] = page
    spec.loader.exec_module(page)

    url_prefix = f"/api/{API_VERSION}" if "sdk" in path else f"/{API_VERSION}/{page_name}"
    app.register_blueprint(page.manager, url_prefix=url_prefix)

3. Standard Response Format

def get_json_result(code: RetCode = RetCode.SUCCESS, message="success", data=None):
    return jsonify({"code": code, "message": message, "data": data})

# Response codes
class RetCode(IntEnum):
    SUCCESS = 0
    EXCEPTION_ERROR = 100
    ARGUMENT_ERROR = 101
    AUTHENTICATION_ERROR = 109
    UNAUTHORIZED = 401
    SERVER_ERROR = 500

Request Flow

1. HTTP Request arrives at Nginx
       ↓
2. Nginx forwards to Quart (port 9380)
       ↓
3. CORS middleware applies headers
       ↓
4. Session middleware loads user session from Redis
       ↓
5. _load_user() validates JWT/API token
       ↓
6. Blueprint router matches URL pattern
       ↓
7. @login_required decorator checks authentication
       ↓
8. @validate_request decorator validates parameters
       ↓
9. Route handler executes business logic
       ↓
10. Service layer processes request
       ↓
11. Response formatted via get_json_result()
       ↓
12. Response sent to client

API Endpoint Summary

Core APIs

Blueprint Base URL Key Endpoints
kb_app /api/v1/kb create, list, update, delete, detail
document_app /api/v1/document upload, run, list, rm, change_parser
conversation_app /v1/conversation completion (SSE), set, get, list
canvas_app /v1/canvas set, completion (SSE), debug, templates
file_app /api/v1/file upload, list, get, rm, mv

SDK APIs

Endpoint Method Purpose
/api/v1/dataset POST/GET Dataset CRUD
/api/v1/document POST/GET Document operations
/api/v1/chat POST Chat completions
/api/v1/session POST/GET Session management