# RAG Stack

The core elements of a RAG stack are the building blocks we can use to achieve more personalized and deterministic outputs.
!!! tip "This is a work in progress and any feedback is welcome"
## What is RAG?

!!! note "What is RAG?"
    RAG stands for Retrieval-Augmented Generation. It is a technique that combines the power of large language models (LLMs) like GPT-4 with the efficiency of information retrieval systems. The goal of RAG is to generate text that is both fluent and factually accurate by retrieving relevant information from a knowledge base.
To try building a simple RAG and understand its limitations, check out this short guide with examples: RAGs: Retrieval-Augmented Generation Explained.
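The retrieve-then-generate loop described above can be sketched in a few lines. Everything here is illustrative: the corpus, the keyword-overlap scorer, and the prompt template are stand-ins, not cognee's implementation, and the prompt would normally be sent to an LLM.

```python
import re

def _tokens(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank passages by naive keyword overlap with the query (toy retriever)."""
    terms = _tokens(query)
    ranked = sorted(corpus, key=lambda p: len(terms & _tokens(p)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the LLM by pasting retrieved passages into the prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Neo4j is a graph database.",
    "BM25 ranks documents by keyword relevance.",
]
prompt = build_prompt("What is Neo4j?", retrieve("What is Neo4j?", corpus))
```

A production retriever would use embeddings or BM25 instead of raw overlap, but the shape of the loop is the same.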
## The Building Blocks of a RAG Stack

### 1. Data Sources
You can get your data from a variety of sources, including:
- APIs like Twitter, Reddit, and Google
- Web scraping tools like Scrapy and Beautiful Soup
- Documents like PDFs, Word, and Excel files
- Relational databases like DuckDB, PostgreSQL, and MySQL
- Data warehouses like Snowflake and Databricks
- Customer data platforms like Segment
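Whatever the source, a loader's job is to turn raw inputs into uniform records with provenance metadata. A minimal sketch for plain-text files (the record shape here is an assumption, not a cognee schema):

```python
from pathlib import Path

def load_text_documents(root: str) -> list[dict]:
    """Read every .txt file under `root` into a record carrying its source path."""
    docs = []
    for path in sorted(Path(root).rglob("*.txt")):
        docs.append({"source": str(path), "text": path.read_text(encoding="utf-8")})
    return docs
```

Loaders for PDFs, APIs, or databases follow the same pattern with a different extraction step.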
### 2. Data Loaders
#### Image Search

Uses an image as the query for a similarity search, analyzing its content to find similar images based on visual features.
#### Keyword Search

Employs the BM25F algorithm to rank results by keyword matches. Relevance is calculated from term frequency, inverse document frequency, and field-length normalization.
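The core scoring function can be sketched with plain Okapi BM25; BM25F extends it by computing a weighted term frequency across document fields before the same saturation step. The parameter defaults below are conventional, not values from any particular engine:

```python
import math
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+", text.lower())

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document against the query.

    idf rewards rare terms, tf saturates via k1, and b controls how much
    long documents are penalized relative to the average length.
    """
    toks = [tokenize(d) for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in toks) / n
    scores = [0.0] * n
    for term in tokenize(query):
        df = sum(term in t for t in toks)  # document frequency
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for i, t in enumerate(toks):
            tf = t.count(term)
            scores[i] += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(t) / avgdl))
    return scores
```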
#### Hybrid Search

Merges the BM25 algorithm with vector similarity search to improve the relevance and accuracy of results, leveraging both textual and vector-based features for ranking.
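Because BM25 scores and cosine similarities live on different scales, hybrid systems usually fuse rankings rather than raw scores. Reciprocal rank fusion (RRF) is one common choice, sketched here; weighted score sums after normalization are another:

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids into one ranking.

    Each list contributes 1 / (k + rank) per document, so agreement near
    the top of multiple lists dominates; k=60 is a conventional constant.
    """
    fused: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)
```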
#### Generative Search

Feeds the search results into a Large Language Model (LLM) as prompt context, which can then generate summaries, extrapolations, or new content based on the aggregated results.
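The generation step reduces to assembling retrieved snippets into a prompt; the template below is illustrative, and the returned string would be sent to an LLM API rather than printed:

```python
def generative_prompt(query: str, results: list[str]) -> str:
    """Assemble retrieved snippets into a numbered summarization prompt."""
    numbered = "\n".join(f"[{i}] {r}" for i, r in enumerate(results, start=1))
    return (
        "Summarize the search results below and answer the question, "
        "citing results by number.\n"
        f"Results:\n{numbered}\n"
        f"Question: {query}\n"
    )
```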
#### Reranking

Applies a reranker module to adjust the initial ranking of search results, optimizing relevance using additional criteria or more complex models.
#### Aggregation

Compiles and summarizes data from a set of search results, providing insights or overviews based on the collective information found.
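A minimal aggregation groups hits by some attribute and emits an overview. The `source` field and the output shape are assumptions for illustration:

```python
from collections import Counter

def aggregate_by_source(results: list[dict]) -> dict:
    """Summarize a result set: hit counts per source plus a one-line overview."""
    counts = Counter(r["source"] for r in results)
    overview = ", ".join(f"{src}: {n}" for src, n in counts.most_common())
    return {"counts": dict(counts), "overview": overview}
```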
#### Filters

Applies constraints or conditions to the search process to narrow down the results. Filters can be based on specific attributes, metadata, or other criteria relevant to the search domain.
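An attribute filter over result metadata can be as simple as an equality match on every given criterion; the metadata keys here are hypothetical:

```python
def apply_filters(results: list[dict], **criteria) -> list[dict]:
    """Keep only results whose metadata matches every given criterion exactly."""
    return [
        r for r in results
        if all(r.get(key) == value for key, value in criteria.items())
    ]
```

Real engines also support range, prefix, and geo filters, but they compose the same way: constraints intersect to shrink the candidate set before or during ranking.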
#### Graph Search

Traverses a graph data structure to find specific nodes or paths, and can be used to find relationships between different entities in a knowledge graph.