Docling Ingestion
OpenRAG uses Docling for its document ingestion pipeline.
More specifically, OpenRAG uses Docling Serve, which starts a docling serve process on your local machine and runs Docling ingestion through an API service.
Docling ingests documents from your local machine or OAuth connectors, splits them into chunks, and stores them as separate, structured documents in the OpenSearch documents index.
OpenRAG chose Docling because it supports a wide variety of file formats, performs well, and has an advanced understanding of tables and images.
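To make the Docling Serve flow concrete, here is a minimal sketch of the kind of request a client might send to a locally running docling serve instance. The endpoint path, port, and request shape are assumptions based on docling-serve's published API and may differ between versions; this only builds the JSON body rather than performing the HTTP call.

```python
import json

# Assumed values: docling-serve's default local address and its
# source-conversion endpoint. Verify against your installed version.
DOCLING_SERVE_URL = "http://localhost:5001/v1alpha/convert/source"

def build_convert_payload(document_url: str) -> dict:
    """Build a JSON body asking docling serve to fetch and convert one document."""
    return {"http_sources": [{"url": document_url}]}

payload = build_convert_payload("https://example.com/report.pdf")
print(json.dumps(payload))
```

A client would POST this body to the endpoint and receive the converted, structured document in the response, which OpenRAG then chunks and indexes.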
Docling ingestion settings
These settings configure the Docling ingestion parameters.
OpenRAG will warn you if docling serve is not running.
To start or stop docling serve or any other native services, in the TUI main menu, click Start Native Services or Stop Native Services.
Embedding model determines which AI model is used to create vector embeddings. The default is text-embedding-3-small.
Chunk size determines how large each text chunk is in number of characters. Larger chunks yield more context per chunk, but may include irrelevant information. Smaller chunks yield more precise semantic search, but may lack context.
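The chunk-size trade-off described above can be illustrated with a simple character-based splitter. This is a simplified sketch: the `chunk_size` and `overlap` parameters are illustrative, and OpenRAG's actual chunking via Docling is structure-aware rather than a plain character split.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks with a small overlap.

    A larger chunk_size gives each chunk more surrounding context;
    a smaller one makes each chunk more topically focused. The overlap
    keeps sentences that straddle a boundary from being cut off entirely.
    """
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

# A 2,500-character document with the defaults yields three chunks:
# two full 1,000-character chunks and one shorter tail.
chunks = chunk_text("x" * 2500)
print(len(chunks))  # → 3
```

Each resulting chunk would then be embedded and stored as a separate document, which is why the setting directly shapes retrieval precision.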