Merge branch 'main' into multi-embedding-support
commit
a611dd81a2
6 changed files with 252 additions and 13 deletions
|
|
@ -16,6 +16,8 @@ NUDGES_FLOW_ID=ebc01d31-1976-46ce-a385-b0240327226c
|
||||||
# Set a strong admin password for OpenSearch; a bcrypt hash is generated at
|
# Set a strong admin password for OpenSearch; a bcrypt hash is generated at
|
||||||
# container startup from this value. Do not commit real secrets.
|
# container startup from this value. Do not commit real secrets.
|
||||||
# must match the hashed password in secureconfig, must change for secure deployment!!!
|
# must match the hashed password in secureconfig, must change for secure deployment!!!
|
||||||
|
# NOTE: if you set this by hand, it must be a complex password:
|
||||||
|
# The password must contain at least 8 characters, including at least one uppercase letter, one lowercase letter, one digit, and one special character.
|
||||||
OPENSEARCH_PASSWORD=
|
OPENSEARCH_PASSWORD=
|
||||||
|
|
||||||
# make here https://console.cloud.google.com/apis/credentials
|
# make here https://console.cloud.google.com/apis/credentials
|
||||||
|
|
|
||||||
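The password rule quoted in the `.env` comment above can be checked before the value is written to `.env`. The following is a minimal sketch, not part of OpenRAG: the function name `is_complex_password` is hypothetical, and it assumes "special character" means any character that is not an ASCII letter or digit, which may be broader than what OpenSearch actually accepts.

```python
import re


def is_complex_password(password: str) -> bool:
    """Return True if the password meets the rule quoted in the .env comment."""
    checks = [
        len(password) >= 8,                          # at least 8 characters
        re.search(r"[A-Z]", password) is not None,   # one uppercase letter
        re.search(r"[a-z]", password) is not None,   # one lowercase letter
        re.search(r"[0-9]", password) is not None,   # one digit
        # Assumption: anything that is not an ASCII letter or digit counts
        # as a "special character" (this includes whitespace).
        re.search(r"[^A-Za-z0-9]", password) is not None,
    ]
    return all(checks)
```

A value that fails this check is likely to be rejected when set by hand, so it may be worth running the check before starting the containers.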
201
LICENSE
Normal file
|
|
@ -0,0 +1,201 @@
|
||||||
|
Apache License
|
||||||
|
Version 2.0, January 2004
|
||||||
|
http://www.apache.org/licenses/
|
||||||
|
|
||||||
|
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
||||||
|
|
||||||
|
1. Definitions.
|
||||||
|
|
||||||
|
"License" shall mean the terms and conditions for use, reproduction,
|
||||||
|
and distribution as defined by Sections 1 through 9 of this document.
|
||||||
|
|
||||||
|
"Licensor" shall mean the copyright owner or entity authorized by
|
||||||
|
the copyright owner that is granting the License.
|
||||||
|
|
||||||
|
"Legal Entity" shall mean the union of the acting entity and all
|
||||||
|
other entities that control, are controlled by, or are under common
|
||||||
|
control with that entity. For the purposes of this definition,
|
||||||
|
"control" means (i) the power, direct or indirect, to cause the
|
||||||
|
direction or management of such entity, whether by contract or
|
||||||
|
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
||||||
|
outstanding shares, or (iii) beneficial ownership of such entity.
|
||||||
|
|
||||||
|
"You" (or "Your") shall mean an individual or Legal Entity
|
||||||
|
exercising permissions granted by this License.
|
||||||
|
|
||||||
|
"Source" form shall mean the preferred form for making modifications,
|
||||||
|
including but not limited to software source code, documentation
|
||||||
|
source, and configuration files.
|
||||||
|
|
||||||
|
"Object" form shall mean any form resulting from mechanical
|
||||||
|
transformation or translation of a Source form, including but
|
||||||
|
not limited to compiled object code, generated documentation,
|
||||||
|
and conversions to other media types.
|
||||||
|
|
||||||
|
"Work" shall mean the work of authorship, whether in Source or
|
||||||
|
Object form, made available under the License, as indicated by a
|
||||||
|
copyright notice that is included in or attached to the work
|
||||||
|
(an example is provided in the Appendix below).
|
||||||
|
|
||||||
|
"Derivative Works" shall mean any work, whether in Source or Object
|
||||||
|
form, that is based on (or derived from) the Work and for which the
|
||||||
|
editorial revisions, annotations, elaborations, or other modifications
|
||||||
|
represent, as a whole, an original work of authorship. For the purposes
|
||||||
|
of this License, Derivative Works shall not include works that remain
|
||||||
|
separable from, or merely link (or bind by name) to the interfaces of,
|
||||||
|
the Work and Derivative Works thereof.
|
||||||
|
|
||||||
|
"Contribution" shall mean any work of authorship, including
|
||||||
|
the original version of the Work and any modifications or additions
|
||||||
|
to that Work or Derivative Works thereof, that is intentionally
|
||||||
|
submitted to Licensor for inclusion in the Work by the copyright owner
|
||||||
|
or by an individual or Legal Entity authorized to submit on behalf of
|
||||||
|
the copyright owner. For the purposes of this definition, "submitted"
|
||||||
|
means any form of electronic, verbal, or written communication sent
|
||||||
|
to the Licensor or its representatives, including but not limited to
|
||||||
|
communication on electronic mailing lists, source code control systems,
|
||||||
|
and issue tracking systems that are managed by, or on behalf of, the
|
||||||
|
Licensor for the purpose of discussing and improving the Work, but
|
||||||
|
excluding communication that is conspicuously marked or otherwise
|
||||||
|
designated in writing by the copyright owner as "Not a Contribution."
|
||||||
|
|
||||||
|
"Contributor" shall mean Licensor and any individual or Legal Entity
|
||||||
|
on behalf of whom a Contribution has been received by Licensor and
|
||||||
|
subsequently incorporated within the Work.
|
||||||
|
|
||||||
|
2. Grant of Copyright License. Subject to the terms and conditions of
|
||||||
|
this License, each Contributor hereby grants to You a perpetual,
|
||||||
|
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||||
|
copyright license to reproduce, prepare Derivative Works of,
|
||||||
|
publicly display, publicly perform, sublicense, and distribute the
|
||||||
|
Work and such Derivative Works in Source or Object form.
|
||||||
|
|
||||||
|
3. Grant of Patent License. Subject to the terms and conditions of
|
||||||
|
this License, each Contributor hereby grants to You a perpetual,
|
||||||
|
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||||
|
(except as stated in this section) patent license to make, have made,
|
||||||
|
use, offer to sell, sell, import, and otherwise transfer the Work,
|
||||||
|
where such license applies only to those patent claims licensable
|
||||||
|
by such Contributor that are necessarily infringed by their
|
||||||
|
Contribution(s) alone or by combination of their Contribution(s)
|
||||||
|
with the Work to which such Contribution(s) was submitted. If You
|
||||||
|
institute patent litigation against any entity (including a
|
||||||
|
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
||||||
|
or a Contribution incorporated within the Work constitutes direct
|
||||||
|
or contributory patent infringement, then any patent licenses
|
||||||
|
granted to You under this License for that Work shall terminate
|
||||||
|
as of the date such litigation is filed.
|
||||||
|
|
||||||
|
4. Redistribution. You may reproduce and distribute copies of the
|
||||||
|
Work or Derivative Works thereof in any medium, with or without
|
||||||
|
modifications, and in Source or Object form, provided that You
|
||||||
|
meet the following conditions:
|
||||||
|
|
||||||
|
(a) You must give any other recipients of the Work or
|
||||||
|
Derivative Works a copy of this License; and
|
||||||
|
|
||||||
|
(b) You must cause any modified files to carry prominent notices
|
||||||
|
stating that You changed the files; and
|
||||||
|
|
||||||
|
(c) You must retain, in the Source form of any Derivative Works
|
||||||
|
that You distribute, all copyright, patent, trademark, and
|
||||||
|
attribution notices from the Source form of the Work,
|
||||||
|
excluding those notices that do not pertain to any part of
|
||||||
|
the Derivative Works; and
|
||||||
|
|
||||||
|
(d) If the Work includes a "NOTICE" text file as part of its
|
||||||
|
distribution, then any Derivative Works that You distribute must
|
||||||
|
include a readable copy of the attribution notices contained
|
||||||
|
within such NOTICE file, excluding those notices that do not
|
||||||
|
pertain to any part of the Derivative Works, in at least one
|
||||||
|
of the following places: within a NOTICE text file distributed
|
||||||
|
as part of the Derivative Works; within the Source form or
|
||||||
|
documentation, if provided along with the Derivative Works; or,
|
||||||
|
within a display generated by the Derivative Works, if and
|
||||||
|
wherever such third-party notices normally appear. The contents
|
||||||
|
of the NOTICE file are for informational purposes only and
|
||||||
|
do not modify the License. You may add Your own attribution
|
||||||
|
notices within Derivative Works that You distribute, alongside
|
||||||
|
or as an addendum to the NOTICE text from the Work, provided
|
||||||
|
that such additional attribution notices cannot be construed
|
||||||
|
as modifying the License.
|
||||||
|
|
||||||
|
You may add Your own copyright statement to Your modifications and
|
||||||
|
may provide additional or different license terms and conditions
|
||||||
|
for use, reproduction, or distribution of Your modifications, or
|
||||||
|
for any such Derivative Works as a whole, provided Your use,
|
||||||
|
reproduction, and distribution of the Work otherwise complies with
|
||||||
|
the conditions stated in this License.
|
||||||
|
|
||||||
|
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||||
|
any Contribution intentionally submitted for inclusion in the Work
|
||||||
|
by You to the Licensor shall be under the terms and conditions of
|
||||||
|
this License, without any additional terms or conditions.
|
||||||
|
Notwithstanding the above, nothing herein shall supersede or modify
|
||||||
|
the terms of any separate license agreement you may have executed
|
||||||
|
with Licensor regarding such Contributions.
|
||||||
|
|
||||||
|
6. Trademarks. This License does not grant permission to use the trade
|
||||||
|
names, trademarks, service marks, or product names of the Licensor,
|
||||||
|
except as required for reasonable and customary use in describing the
|
||||||
|
origin of the Work and reproducing the content of the NOTICE file.
|
||||||
|
|
||||||
|
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||||
|
agreed to in writing, Licensor provides the Work (and each
|
||||||
|
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||||
|
implied, including, without limitation, any warranties or conditions
|
||||||
|
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||||
|
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||||
|
appropriateness of using or redistributing the Work and assume any
|
||||||
|
risks associated with Your exercise of permissions under this License.
|
||||||
|
|
||||||
|
8. Limitation of Liability. In no event and under no legal theory,
|
||||||
|
whether in tort (including negligence), contract, or otherwise,
|
||||||
|
unless required by applicable law (such as deliberate and grossly
|
||||||
|
negligent acts) or agreed to in writing, shall any Contributor be
|
||||||
|
liable to You for damages, including any direct, indirect, special,
|
||||||
|
incidental, or consequential damages of any character arising as a
|
||||||
|
result of this License or out of the use or inability to use the
|
||||||
|
Work (including but not limited to damages for loss of goodwill,
|
||||||
|
work stoppage, computer failure or malfunction, or any and all
|
||||||
|
other commercial damages or losses), even if such Contributor
|
||||||
|
has been advised of the possibility of such damages.
|
||||||
|
|
||||||
|
9. Accepting Warranty or Additional Liability. While redistributing
|
||||||
|
the Work or Derivative Works thereof, You may choose to offer,
|
||||||
|
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||||
|
or other liability obligations and/or rights consistent with this
|
||||||
|
License. However, in accepting such obligations, You may act only
|
||||||
|
on Your own behalf and on Your sole responsibility, not on behalf
|
||||||
|
of any other Contributor, and only if You agree to indemnify,
|
||||||
|
defend, and hold each Contributor harmless for any liability
|
||||||
|
incurred by, or claims asserted against, such Contributor by reason
|
||||||
|
of your accepting any such warranty or additional liability.
|
||||||
|
|
||||||
|
END OF TERMS AND CONDITIONS
|
||||||
|
|
||||||
|
APPENDIX: How to apply the Apache License to your work.
|
||||||
|
|
||||||
|
To apply the Apache License to your work, attach the following
|
||||||
|
boilerplate notice, with the fields enclosed by brackets "[]"
|
||||||
|
replaced with your own identifying information. (Don't include
|
||||||
|
the brackets!) The text should be enclosed in the appropriate
|
||||||
|
comment syntax for the file format. We also recommend that a
|
||||||
|
file or class name and description of purpose be included on the
|
||||||
|
same "printed page" as the copyright notice for easier
|
||||||
|
identification within third-party archives.
|
||||||
|
|
||||||
|
Copyright 2025 IBM
|
||||||
|
|
||||||
|
Licensed under the Apache License, Version 2.0 (the "License");
|
||||||
|
you may not use this file except in compliance with the License.
|
||||||
|
You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing, software
|
||||||
|
distributed under the License is distributed on an "AS IS" BASIS,
|
||||||
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
|
See the License for the specific language governing permissions and
|
||||||
|
limitations under the License.
|
||||||
|
|
@ -9,7 +9,7 @@ import TabItem from '@theme/TabItem';
|
||||||
import PartialModifyFlows from '@site/docs/_partial-modify-flows.mdx';
|
import PartialModifyFlows from '@site/docs/_partial-modify-flows.mdx';
|
||||||
|
|
||||||
OpenRAG uses [Docling](https://docling-project.github.io/docling/) for its document ingestion pipeline.
|
OpenRAG uses [Docling](https://docling-project.github.io/docling/) for its document ingestion pipeline.
|
||||||
More specifically, OpenRAG uses [Docling Serve](https://github.com/docling-project/docling-serve), which starts a `docling-serve` process on your local machine and runs Docling ingestion through an API service.
|
More specifically, OpenRAG uses [Docling Serve](https://github.com/docling-project/docling-serve), which starts a `docling serve` process on your local machine and runs Docling ingestion through an API service.
|
||||||
|
|
||||||
Docling ingests documents from your local machine or OAuth connectors, splits them into chunks, and stores them as separate, structured documents in the OpenSearch `documents` index.
|
Docling ingests documents from your local machine or OAuth connectors, splits them into chunks, and stores them as separate, structured documents in the OpenSearch `documents` index.
|
||||||
|
|
||||||
|
|
@ -19,8 +19,8 @@ OpenRAG chose Docling for its support for a wide variety of file formats, high p
|
||||||
|
|
||||||
These settings configure the Docling ingestion parameters.
|
These settings configure the Docling ingestion parameters.
|
||||||
|
|
||||||
OpenRAG will warn you if `docling-serve` is not running.
|
OpenRAG will warn you if `docling serve` is not running.
|
||||||
To start or stop `docling-serve` or any other native services, in the TUI main menu, click **Start Native Services** or **Stop Native Services**.
|
To start or stop `docling serve` or any other native services, in the TUI main menu, click **Start Native Services** or **Stop Native Services**.
|
||||||
|
|
||||||
**Embedding model** determines which AI model is used to create vector embeddings. The default is `text-embedding-3-small`.
|
**Embedding model** determines which AI model is used to create vector embeddings. The default is `text-embedding-3-small`.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -12,6 +12,8 @@ They deploy the same applications and containers, but to different environments.
|
||||||
|
|
||||||
- [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without GPU support. Use this Docker compose file for environments where GPU drivers aren't available.
|
- [`docker-compose-cpu.yml`](https://github.com/langflow-ai/openrag/blob/main/docker-compose-cpu.yml) is a CPU-only version of OpenRAG for systems without GPU support. Use this Docker compose file for environments where GPU drivers aren't available.
|
||||||
|
|
||||||
|
Both Docker deployments require `docling serve` to be running on port `5001` on the host machine. Running it on the host enables [Mac MLX](https://opensource.apple.com/projects/mlx/) support for document processing. Installing OpenRAG with the TUI starts `docling serve` automatically, but for a Docker deployment you must start the `docling serve` process manually.
|
||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
- [Python Version 3.10 to 3.13](https://www.python.org/downloads/release/python-3100/)
|
- [Python Version 3.10 to 3.13](https://www.python.org/downloads/release/python-3100/)
|
||||||
|
|
@ -31,7 +33,12 @@ To install OpenRAG with Docker Compose, do the following:
|
||||||
cd openrag
|
cd openrag
|
||||||
```
|
```
|
||||||
|
|
||||||
2. Copy the example `.env` file included in the repository root.
|
2. Install dependencies.
|
||||||
|
```bash
|
||||||
|
uv sync
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Copy the example `.env` file included in the repository root.
|
||||||
The example file includes all environment variables with comments to guide you in finding and setting their values.
|
The example file includes all environment variables with comments to guide you in finding and setting their values.
|
||||||
```bash
|
```bash
|
||||||
cp .env.example .env
|
cp .env.example .env
|
||||||
|
|
@ -42,7 +49,7 @@ To install OpenRAG with Docker Compose, do the following:
|
||||||
touch .env
|
touch .env
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Set environment variables. The Docker Compose files will be populated with values from your `.env`.
|
4. Set environment variables. The Docker Compose files will be populated with values from your `.env`.
|
||||||
The following values are **required** to be set:
|
The following values are **required** to be set:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
|
|
@ -55,14 +62,35 @@ The following values are **required** to be set:
|
||||||
|
|
||||||
For more information on configuring OpenRAG with environment variables, see [Environment variables](/reference/configuration).
|
For more information on configuring OpenRAG with environment variables, see [Environment variables](/reference/configuration).
|
||||||
|
|
||||||
4. Deploy OpenRAG with Docker Compose based on your deployment type.
|
5. Start `docling serve` on the host machine.
|
||||||
|
Both Docker deployments require `docling serve` to be running on port `5001` on the host machine. Running it on the host enables [Mac MLX](https://opensource.apple.com/projects/mlx/) support for document processing.
|
||||||
For GPU-enabled systems, run the following command:
|
|
||||||
```bash
|
```bash
|
||||||
|
uv run python scripts/docling_ctl.py start --port 5001
|
||||||
|
```
|
||||||
|
|
||||||
|
6. Confirm `docling serve` is running.
|
||||||
|
```bash
|
||||||
|
uv run python scripts/docling_ctl.py status
|
||||||
|
```
|
||||||
|
|
||||||
|
Successful result:
|
||||||
|
```text
|
||||||
|
Status: running
|
||||||
|
Endpoint: http://127.0.0.1:5001
|
||||||
|
Docs: http://127.0.0.1:5001/docs
|
||||||
|
PID: 27746
|
||||||
|
```
|
||||||
|
|
||||||
|
7. Deploy OpenRAG with Docker Compose based on your deployment type.
|
||||||
|
|
||||||
|
For GPU-enabled systems, run the following commands:
|
||||||
|
```bash
|
||||||
|
docker compose build
|
||||||
docker compose up -d
|
docker compose up -d
|
||||||
```
|
```
|
||||||
|
|
||||||
For CPU-only systems, run the following command:
|
For environments without GPU support, run:
|
||||||
```bash
|
```bash
|
||||||
docker compose -f docker-compose-cpu.yml up -d
|
docker compose -f docker-compose-cpu.yml up -d
|
||||||
```
|
```
|
||||||
|
|
@ -76,7 +104,7 @@ The following values are **required** to be set:
|
||||||
| OpenSearch | http://localhost:9200 | Vector database for document storage. |
|
| OpenSearch | http://localhost:9200 | Vector database for document storage. |
|
||||||
| OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |
|
| OpenSearch Dashboards | http://localhost:5601 | Database administration interface. |
|
||||||
|
|
||||||
5. Verify installation by confirming all services are running.
|
8. Verify installation by confirming all services are running.
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
docker compose ps
|
docker compose ps
|
||||||
|
|
@ -88,7 +116,13 @@ The following values are **required** to be set:
|
||||||
- **Backend API**: http://localhost:8000
|
- **Backend API**: http://localhost:8000
|
||||||
- **Langflow**: http://localhost:7860
|
- **Langflow**: http://localhost:7860
|
||||||
|
|
||||||
6. Continue with [Application Onboarding](#application-onboarding).
|
9. Continue with [Application Onboarding](#application-onboarding).
|
||||||
|
|
||||||
|
To stop `docling serve` when you're done with your OpenRAG deployment, run:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
uv run python scripts/docling_ctl.py stop
|
||||||
|
```
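The port requirement described in the steps above can also be checked without the helper script. This is a minimal sketch, not part of OpenRAG: the function name `is_listening` is hypothetical, and a successful TCP connection only shows that some process is listening on the port, not that the process is `docling serve`.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 5001, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (for example, connection refused
        # or timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Port 5001 reachable:", is_listening())
```

If the check returns `False` before `docker compose up -d`, the `docling serve` process probably still needs to be started on the host.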
|
||||||
|
|
||||||
<PartialOnboarding />
|
<PartialOnboarding />
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -51,9 +51,9 @@ If images are missing, the TUI runs `docker compose pull`, then runs `docker com
|
||||||
### Start native services
|
### Start native services
|
||||||
|
|
||||||
A "native" service in OpenRAG refers to a service run natively on your machine, and not within a container.
|
A "native" service in OpenRAG refers to a service run natively on your machine, and not within a container.
|
||||||
The `docling-serve` process is a native service in OpenRAG, because it's a document processing service that is run on your local machine, and controlled separately from the containers.
|
The `docling serve` process is a native service in OpenRAG, because it's a document processing service that is run on your local machine, and controlled separately from the containers.
|
||||||
|
|
||||||
To start or stop `docling-serve` or any other native services, in the TUI main menu, click **Start Native Services** or **Stop Native Services**.
|
To start or stop `docling serve` or any other native services, in the TUI main menu, click **Start Native Services** or **Stop Native Services**.
|
||||||
|
|
||||||
To view the status, port, or PID of a native service, in the TUI main menu, click [Status](#status).
|
To view the status, port, or PID of a native service, in the TUI main menu, click [Status](#status).
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -134,9 +134,11 @@ Configure general system components, session management, and logging.
|
||||||
|----------|---------|-------------|
|
|----------|---------|-------------|
|
||||||
| `LANGFLOW_KEY_RETRIES` | `15` | Number of retries for Langflow key generation. |
|
| `LANGFLOW_KEY_RETRIES` | `15` | Number of retries for Langflow key generation. |
|
||||||
| `LANGFLOW_KEY_RETRY_DELAY` | `2.0` | Delay between retries in seconds. |
|
| `LANGFLOW_KEY_RETRY_DELAY` | `2.0` | Delay between retries in seconds. |
|
||||||
|
| `LANGFLOW_VERSION` | `latest` | Langflow Docker image version. |
|
||||||
| `LOG_FORMAT` | - | Log format (set to "json" for JSON output). |
|
| `LOG_FORMAT` | - | Log format (set to "json" for JSON output). |
|
||||||
| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR). |
|
| `LOG_LEVEL` | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR). |
|
||||||
| `MAX_WORKERS` | - | Maximum number of workers for document processing. |
|
| `MAX_WORKERS` | - | Maximum number of workers for document processing. |
|
||||||
|
| `OPENRAG_VERSION` | `latest` | OpenRAG Docker image version. |
|
||||||
| `SERVICE_NAME` | `openrag` | Service name for logging. |
|
| `SERVICE_NAME` | `openrag` | Service name for logging. |
|
||||||
| `SESSION_SECRET` | auto-generated | Session management. |
|
| `SESSION_SECRET` | auto-generated | Session management. |
|
||||||
|
|
||||||
|
|
|
||||||
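The defaults in the table above can be read as fallbacks that apply when a variable is unset. The following is a minimal sketch of that behavior, not OpenRAG's actual configuration code: the function name `load_settings` is hypothetical, and only variables with a documented default are included.

```python
import os


def load_settings(env=None) -> dict:
    """Resolve a few documented variables, falling back to the table's defaults."""
    # Accept any mapping so the function can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    return {
        "LANGFLOW_KEY_RETRIES": int(env.get("LANGFLOW_KEY_RETRIES", "15")),
        "LANGFLOW_KEY_RETRY_DELAY": float(env.get("LANGFLOW_KEY_RETRY_DELAY", "2.0")),
        "LANGFLOW_VERSION": env.get("LANGFLOW_VERSION", "latest"),
        "LOG_LEVEL": env.get("LOG_LEVEL", "INFO"),
        "OPENRAG_VERSION": env.get("OPENRAG_VERSION", "latest"),
        "SERVICE_NAME": env.get("SERVICE_NAME", "openrag"),
    }
```

Calling `load_settings({})` returns the documented defaults, while any key present in the mapping overrides its default.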