diff --git a/docs/docs/get-started/what-is-openrag.mdx b/docs/docs/get-started/what-is-openrag.mdx
index bae90d26..129d2df9 100644
--- a/docs/docs/get-started/what-is-openrag.mdx
+++ b/docs/docs/get-started/what-is-openrag.mdx
@@ -83,4 +83,43 @@ The **OpenRAG Backend** is the central orchestration service that coordinates al
 
 **Third Party Services** like **Google Drive** connect to the **OpenRAG Backend** through OAuth authentication, allowing synchronication of cloud storage with the OpenSearch knowledge base.
 
-The **OpenRAG Frontend** provides the user interface for interacting with the system.
\ No newline at end of file
+The **OpenRAG Frontend** provides the user interface for interacting with the system.
+
+## Performance expectations
+
+On a local VM with 7 vCPUs and 8 GiB RAM, OpenRAG ingested approximately 5.03 GB across 1,083 files in about 42 minutes.
+This equates to approximately 0.43 documents per second, or about 2.3 seconds per document.
+
+You can generally expect equal or better performance on developer laptops and significantly faster ingestion on servers.
+Throughput scales with CPU cores, memory, storage speed, and configuration choices such as embedding model, chunk size and overlap, and concurrency.
+
+This test returned 12 errors (approximately 1.1%).
+All errors were file-specific, and they didn't stop the pipeline.
+
+Ingestion dataset:
+
+* Total files: 1,083 items mounted
+* Total size on disk: 5,026,474,862 bytes (approximately 5.03 GB)
+
+Hardware specifications:
+
+* Machine: Apple M4 Pro
+* Podman VM:
+  * Name: `podman-machine-default`
+  * Type: `applehv`
+  * vCPUs: 7
+  * Memory: 8 GiB
+  * Disk size: 100 GiB
+
+Test results:
+
+```text
+2025-09-24T22:40:45.542190Z /app/src/main.py:231 Ingesting default documents when ready disable_langflow_ingest=False
+2025-09-24T22:40:45.546385Z /app/src/main.py:270 Using Langflow ingestion pipeline for default documents file_count=1082
+...
+2025-09-24T23:19:44.866365Z /app/src/main.py:351 Langflow ingestion completed success_count=1070 error_count=12 total_files=1082
+```
+
+Elapsed time: ~42 minutes 15 seconds (2,535 seconds)
+
+Throughput: ~0.43 documents/second (~2.3 seconds/document)
\ No newline at end of file
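
The summary figures can be rechecked from the raw counts reported in this change. The following is a minimal sketch, not part of OpenRAG: the function name `summarize_ingestion` and its parameters are hypothetical, and the inputs are the file count, error count, byte total, and elapsed seconds stated above.

```python
# Hypothetical helper for sanity-checking the reported ingestion figures.
# Inputs are copied from the test run described in this change.

def summarize_ingestion(total_files: int, error_count: int,
                        total_bytes: int, elapsed_seconds: float) -> dict:
    """Derive throughput and error-rate figures from raw run counts."""
    return {
        # Documents processed per second of wall-clock time.
        "docs_per_second": total_files / elapsed_seconds,
        # Inverse view: average wall-clock seconds spent per document.
        "seconds_per_doc": elapsed_seconds / total_files,
        # Share of files that failed, as a percentage.
        "error_rate_percent": 100 * error_count / total_files,
        # Data rate in decimal megabytes (10**6 bytes) per second.
        "megabytes_per_second": total_bytes / elapsed_seconds / 1_000_000,
    }


if __name__ == "__main__":
    summary = summarize_ingestion(
        total_files=1082,           # total_files in the completion log line
        error_count=12,             # error_count in the completion log line
        total_bytes=5_026_474_862,  # total size on disk
        elapsed_seconds=2535,       # stated elapsed time
    )
    for name, value in summary.items():
        print(f"{name}: {value:.2f}")
```

Reporting both documents per second and seconds per document makes it harder to confuse one figure for the other when comparing runs.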