
PromethAI-Memory

AI Applications and RAGs - Cognitive Architecture, Testability, Production Ready Apps

promethAI logo

Open-source framework for building and testing RAGs and Cognitive Architectures, designed for accuracy, transparency, and control.

This repo is built to test and evolve RAG architecture, inspired by human cognitive processes, using Python. It aims to be production-ready and testable, while giving great visibility into how RAG applications are built.

This project is a part of the PromethAI ecosystem.

It runs in iterations, with each iteration building on the previous one.

Keep Ithaka always in your mind. Arriving there is what you're destined for. But don't hurry the journey at all. Better if it lasts for years.

Installation

To get started with PromethAI Memory, start with the latest iteration and follow the instructions in its README.md file.

Current Focus

The RAG test manager can be used via the API or the CLI.

Project Structure

Level 1 - OpenAI functions + Pydantic + DLTHub

Scope: Give PDFs to the model and get the output in a structured format.

Blog post: Link

We introduce the following concepts:

  • Structured output with Pydantic (see the sketch after this list)
  • CMD script to process custom PDFs
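
As a rough illustration of the first concept, here is a minimal sketch of structured output with Pydantic. The model name and fields are hypothetical, not the schema this repo uses.

    from pydantic import BaseModel, Field

    class DocumentSummary(BaseModel):
        """Hypothetical schema for the structured output extracted from a PDF."""
        title: str = Field(description="Title of the document")
        topics: list[str] = Field(description="Main topics covered")
        summary: str = Field(description="Short summary of the content")

    # An OpenAI function call can be constrained to return JSON matching this
    # schema, which is then validated into a typed object (Pydantic v2 API;
    # on Pydantic v1 use DocumentSummary.parse_raw instead):
    raw = '{"title": "Sample PDF", "topics": ["memory", "RAG"], "summary": "..."}'
    doc = DocumentSummary.model_validate_json(raw)
    print(doc.title, doc.topics)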

Level 2 - Memory Manager + Metadata management

Scope: Give PDFs to the model and consolidate them with previous user activity and more.

Blog post: Link

We introduce the following concepts:

  • Long Term Memory -> store and format the data
  • Episodic Buffer -> isolate the working memory
  • Attention Modulators -> improve semantic search (a toy sketch of these three concepts follows this list)
  • Docker
  • API
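
Purely as a toy illustration of how these three concepts relate (the class and function names below are made up and are not this repo's API):

    # Illustrative only: hypothetical names, not the actual classes in this repo.

    class LongTermMemory:
        """Long Term Memory: store and format the data for later retrieval."""

        def __init__(self):
            self.store: list[str] = []

        def add(self, document: str) -> None:
            self.store.append(document.strip())

    class EpisodicBuffer:
        """Episodic Buffer: isolate a small working memory for the current task."""

        def __init__(self, capacity: int = 5):
            self.capacity = capacity
            self.items: list[str] = []

        def push(self, item: str) -> None:
            self.items = (self.items + [item])[-self.capacity:]

    def attention_modulator(query: str, candidates: list[str]) -> list[str]:
        """Attention Modulator: re-rank semantic search results (keyword-overlap toy)."""
        terms = set(query.lower().split())
        return sorted(candidates, key=lambda c: -len(terms & set(c.lower().split())))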

Level 3 - Dynamic Graph Memory Manager + DB + RAG Test Manager

Scope: Store the data in N related stores and test the retrieval with the RAG Test Manager.

Blog post: Link

  • Dynamic Memory Manager -> store the data in N hierarchical stores
  • Auto-generation of tests (a toy illustration follows this list)
  • Multiple file formats supported
  • Postgres DB to store metadata
  • Docker
  • API
  • Superset to visualize the results
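
As a hint of what auto-generation of tests means here, a toy sketch that expands a parameter grid into test variants; the parameter values are made up, and the repo's actual generation logic may differ:

    import itertools

    # Hypothetical parameter grid; the real values come from the test manager params.
    param_grid = {
        "chunk_size": [256, 512, 1024],
        "search_type": ["hybrid", "bm25"],
    }

    # Every combination becomes one test variant to run against the test set.
    test_variants = [
        dict(zip(param_grid.keys(), values))
        for values in itertools.product(*param_grid.values())
    ]
    print(len(test_variants), "variants:", test_variants)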

Level 4 - Dynamic Graph Memory Manager + DB + RAG Test Manager

Scope: Use Neo4j to map the user queries into a knowledge graph based on cognitive architecture.

Blog post: Soon!

  • Dynamic Memory Manager -> store the data in N hierarchical stores
  • Dynamic Graph -> map the user queries into a knowledge graph (see the sketch after this list)
  • Postgres DB to store metadata - soon
  • Docker
  • API - soon
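
A minimal sketch of this kind of mapping with the official neo4j Python driver; the node labels, relationship types, and credentials below are placeholders, not the schema this repo builds:

    from neo4j import GraphDatabase

    # Placeholder connection details; point these at your own Neo4j instance.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

    def link_query_to_concept(tx, user_id: str, query: str, concept: str):
        # Hypothetical labels and relationships, for illustration only.
        tx.run(
            "MERGE (u:User {id: $user_id}) "
            "MERGE (q:Query {text: $query}) "
            "MERGE (c:Concept {name: $concept}) "
            "MERGE (u)-[:ASKED]->(q)-[:MENTIONS]->(c)",
            user_id=user_id, query=query, concept=concept,
        )

    # neo4j driver 5.x API; older 4.x versions use session.write_transaction instead.
    with driver.session() as session:
        session.execute_write(link_query_to_concept, "97980cfea0067", "What is in the PDF?", "document")
    driver.close()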

Run Level 4

Make sure you have Docker, Poetry, Python 3.11, and Postgres installed.

Copy the .env.example to .env and fill in the variables.

poetry shell

docker compose up

Run

python main.py

Run Level 3

Make sure you have Docker, Poetry, Python 3.11, and Postgres installed.

Copy the .env.example to .env and fill in the variables.

There are two ways to run Level 3:

Docker:

Copy the .env.template to .env and fill in the variables. Set the environment variable in the .env file to "docker".

Launch the docker image:

docker compose up promethai_mem

Send the request to the API:

curl -X POST -H "Content-Type: application/json" -d '{
  "payload": {
    "user_id": "97980cfea0067",
    "data": [".data/3ZCCCW.pdf"],
    "test_set": "sample",
    "params": ["chunk_size"],
    "metadata": "sample",
    "retriever_type": "single_document_context"
  }
}' http://0.0.0.0:8000/rag-test/rag_test_run
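
The same request can also be sent from Python with the requests library, for example:

    import requests

    payload = {
        "payload": {
            "user_id": "97980cfea0067",
            "data": [".data/3ZCCCW.pdf"],
            "test_set": "sample",
            "params": ["chunk_size"],
            "metadata": "sample",
            "retriever_type": "single_document_context",
        }
    }

    response = requests.post("http://0.0.0.0:8000/rag-test/rag_test_run", json=payload)
    print(response.status_code, response.text)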
 

Params:

  • data -> list of URLs or paths to files located in the .data folder (pdf, docx, txt, html)
  • test_set -> sample, manual (a list of questions and answers; see the sketch after this list)
  • metadata -> sample, manual (json) or version (in progress)
  • params -> chunk_size, chunk_overlap, search_type (hybrid, bm25), embeddings
  • retriever_type -> llm_context, single_document_context, multi_document_context, cognitive_architecture (coming soon)
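
For the manual test_set option, the idea is a list of question/answer pairs. A purely hypothetical illustration follows; the authoritative format is example_data/test_set.json:

    # Hypothetical shape of a manual test set (question/answer pairs);
    # see example_data/test_set.json for the format this project actually expects.
    manual_test_set = [
        {"question": "What topic does the document cover?", "answer": "Cognitive architectures for RAG."},
        {"question": "Which database stores the test metadata?", "answer": "Postgres."},
    ]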

Inspect the results in the DB:

docker exec -it postgres psql -U bla

\c bubu

select * from test_outputs;

Or set up Superset to visualize the results. The base SQL query is in the example_data folder.

Poetry environment:

Copy the .env.template to .env and fill in the variables. Set the environment variable in the .env file to "local".

Use the poetry environment:

poetry shell

Make sure the environment variable in the .env file is set to "local".

Launch the Postgres DB:

docker compose up postgres

Launch Superset:

docker compose up superset

Open Superset in your browser at http://localhost:8088

Add the Postgres datasource to Superset with the following connection string:

postgres://bla:bla@postgres:5432/bubu

Make sure to run the following script to initialize the DB tables:

python scripts/create_database.py

After that, you can run the RAG test manager from your command line.

    python rag_test_manager.py \
    --file ".data" \
    --test_set "example_data/test_set.json" \
    --user_id "97980cfea0067" \
    --params "chunk_size" "search_type" \
    --metadata "example_data/metadata.json" \
    --retriever_type "single_document_context"

Examples of the metadata structure and the test set are in the "example_data" folder.