LinhKhanh
|
146eadc89c
|
Analyze dialog_service.py code
|
2025-12-01 05:06:53 +07:00 |
|
Claude
|
2f61760051
|
docs: Add document and knowledgebase service analysis documentation
- Add document_service_analysis.md: comprehensive analysis of document
  lifecycle management including insert, remove, parse, progress tracking
- Add knowledgebase_service_analysis.md: dataset management and access
  control analysis with permission model, parser configuration
|
2025-11-27 09:54:39 +00:00 |
|
Claude
|
1dcc9a870b
|
docs: Add detailed PDF parser processing steps documentation
Created comprehensive documentation for RAGFlowPdfParser processing pipeline:
- 10 major processing steps with code references
- Complete data flow diagrams
- Algorithm explanations (K-Means column detection, text merging)
- Box data structure evolution through pipeline
- Position tag format specification
- Line-by-line code analysis for key methods:
  - __init__ (model loading)
  - __images__ (OCR processing)
  - _layouts_rec (layout detection)
  - _table_transformer_job (table structure)
  - _assign_column (column detection)
  - _text_merge (horizontal merge)
  - _naive_vertical_merge (vertical merge)
  - _filter_forpages (cleanup)
  - _extract_table_figure (extraction)
  - __filterout_scraps (final output)
|
2025-11-27 06:29:12 +00:00 |
|
Claude
|
6d4dbbfe2c
|
docs: Add comprehensive DeepDoc deep guide documentation
Created in-depth documentation for understanding the deepdoc module:
- README.md: Complete deep guide with:
  - Big picture explanation (what problem deepdoc solves)
  - Data flow diagrams (Input → Processing → Output)
  - Detailed code analysis with line numbers
  - Technical explanations (ONNX, CTC, NMS, etc.)
  - Design reasoning (why certain technologies chosen)
  - Difficult terms glossary
  - Extension examples
- ocr_deep_dive.md: Deep dive into OCR subsystem
  - DBNet text detection architecture
  - CRNN text recognition
  - CTC decoding algorithm
  - Rotation handling
  - Performance optimization
- layout_table_deep_dive.md: Deep dive into layout/table recognition
  - YOLOv10 layout detection
  - Table structure recognition
  - Grid construction algorithm
  - Spanning cell handling
  - HTML/descriptive output generation
|
2025-11-27 03:46:14 +00:00 |
|
Claude
|
566bce428b
|
docs: Add comprehensive algorithm documentation (50+ algorithms)
- Updated README.md with complete algorithm map across 12 categories
- Added clustering_algorithms.md (K-Means, GMM, UMAP, Silhouette, Node2Vec)
- Added graph_algorithms.md (PageRank, Leiden, Entity Extraction/Resolution)
- Added nlp_algorithms.md (Trie tokenization, TF-IDF, NER, POS, Synonym)
- Added vision_algorithms.md (OCR, Layout Recognition, TSR, NMS, IoU, XGBoost)
- Added similarity_metrics.md (Cosine, Edit Distance, Token, Hybrid)
|
2025-11-27 03:34:49 +00:00 |
|
Claude
|
a6ee18476d
|
docs: Add detailed backend module analysis documentation
Add comprehensive documentation covering 6 modules:
- 01-API-LAYER: Authentication, routing, SSE streaming
- 02-SERVICE-LAYER: Dialog, Task, LLM service analysis
- 03-RAG-ENGINE: Hybrid search, embedding, reranking
- 04-AGENT-SYSTEM: Canvas engine, components, tools
- 05-DOCUMENT-PROCESSING: Task executor, PDF parsing
- 06-ALGORITHMS: BM25, fusion, RAPTOR
Total 28 documentation files with code analysis, diagrams, and formulas.
|
2025-11-26 11:10:54 +00:00 |
|
Claude
|
c7cecf9a1f
|
docs: Add comprehensive RAGFlow analysis documentation
- Add directory structure analysis (01_directory_structure.md)
- Add system architecture with diagrams (02_system_architecture.md)
- Add sequence diagrams for main flows (03_sequence_diagrams.md)
- Add detailed modules analysis (04_modules_analysis.md)
- Add tech stack documentation (05_tech_stack.md)
- Add source code analysis (06_source_code_analysis.md)
- Add README summary for personal_analyze folder
This documentation provides:
- Complete codebase structure overview
- System architecture diagrams (ASCII art)
- Sequence diagrams for authentication, RAG, chat, agent flows
- Detailed analysis of API, RAG, DeepDoc, Agent, GraphRAG modules
- Full tech stack with 150+ dependencies analyzed
- Source code patterns and best practices analysis
|
2025-11-26 10:20:05 +00:00 |
|