Fix diagram

This commit is contained in: parent c0ad53cde7 · commit b73a35fec6

3 changed files with 4 additions and 83 deletions

README.md (87 lines changed)

@@ -193,93 +193,14 @@ if __name__ == '__main__':

When you run this script, you will see step-by-step messages in the console that help you trace the execution flow and understand what the script is doing at each stage.

A version of this example is here: `examples/python/simple_example.py`

### Create your own memory store

### Understand our architecture

The cognee framework consists of tasks that can be grouped into pipelines.

Each task can be an independent part of business logic that can be tied to other tasks to form a pipeline.

These tasks persist data into your memory store, enabling you to search for relevant context from past conversations, documents, or any other data you have stored.
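
In other words, a task is just a single callable that receives data and returns data for the next task in the pipeline. The sketch below is a hypothetical, minimal task: `normalize_documents` is not part of cognee, it only illustrates the shape such a function can take.

```python
# Hypothetical example task: any business logic wrapped in one function.
# Not part of cognee; it only shows the shape a task can have.
async def normalize_documents(documents: list[str]) -> list[str]:
    # Normalize whitespace and casing before handing the data to the next task
    return [" ".join(doc.split()).lower() for doc in documents]
```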

<div align="center">
  <img src="assets/cognee_diagram.png" alt="cognee concept diagram" width="50%" />
</div>

### Example: Classify your documents

Here is an example of how it looks for a default cognify pipeline:

1. To prepare the data for the pipeline run, first we need to add it to our metastore and normalize it. Start with:

```python
import cognee

text = """Natural language processing (NLP) is an interdisciplinary
subfield of computer science and information retrieval"""

await cognee.add(text)  # Add a new piece of information
```

2. In the next step we create a task. The task can be any business logic we need, but the important part is that it should be encapsulated in one function.

Here we show an example of creating a naive LLM classifier that takes a Pydantic model and then stores the data in both the graph and vector stores after analyzing each chunk.

We provide just a snippet for reference, but feel free to check out the implementation in our repo.

```python
import asyncio
from typing import Type
from uuid import NAMESPACE_OID, uuid5

from pydantic import BaseModel

# DocumentChunk, extract_categories and get_vector_engine are cognee internals;
# see the repo for their definitions.

async def chunk_naive_llm_classifier(
    data_chunks: list[DocumentChunk],
    classification_model: Type[BaseModel]
):
    # Extract classifications asynchronously
    chunk_classifications = await asyncio.gather(
        *(extract_categories(chunk.text, classification_model) for chunk in data_chunks)
    )

    # Collect classification data points using a set to avoid duplicates
    classification_data_points = {
        uuid5(NAMESPACE_OID, cls.label.type)
        for cls in chunk_classifications
    } | {
        uuid5(NAMESPACE_OID, subclass.value)
        for cls in chunk_classifications
        for subclass in cls.label.subclass
    }

    vector_engine = get_vector_engine()
    collection_name = "classification"

    # Define the payload schema
    class Keyword(BaseModel):
        uuid: str
        text: str
        chunk_id: str
        document_id: str

    # Ensure the collection exists and retrieve existing data points
    if not await vector_engine.has_collection(collection_name):
        await vector_engine.create_collection(collection_name, payload_schema=Keyword)
        existing_points_map = {}
    else:
        existing_points_map = {}

    ...  # the rest (storing classifications in the graph and vector stores) is in the repo

    return data_chunks
```
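
Judging from how the snippet reads `cls.label.type` and iterates `cls.label.subclass`, a compatible `classification_model` might look roughly like the hypothetical sketch below. The names are illustrative only, not cognee's actual model:

```python
from enum import Enum
from pydantic import BaseModel

# Hypothetical label values; the real model in the cognee repo defines its own taxonomy.
class ContentSubclass(str, Enum):
    ARTICLE = "article"
    TUTORIAL = "tutorial"

class ContentLabel(BaseModel):
    type: str                        # coarse category, read as cls.label.type
    subclass: list[ContentSubclass]  # finer labels, read via subclass.value

class ContentClassification(BaseModel):
    label: ContentLabel
```

A model shaped like this could then be passed to the task as `classification_model=ContentClassification`.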

We have many tasks that can be used in your pipelines, and you can also create your own tasks to fit your business logic.

3. Once we have our tasks, it is time to group them into a pipeline.

This simplified snippet demonstrates how tasks can be added to a pipeline and how they can pass information forward from one to another.

```python
tasks = [
    Task(
        chunk_naive_llm_classifier,
        classification_model=cognee_config.classification_model,
    ),
    # ... other tasks that make up the pipeline
]

pipeline = run_tasks(tasks, documents)
```
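
If `run_tasks` yields results as an async generator (an assumption here; check `cognee.api.v1.cognify` in the repo for the actual behavior), consuming the pipeline can look like this:

```python
# Iterate the pipeline to execute the tasks and observe intermediate results
async for result in pipeline:
    print(result)
```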

To see the working code, check the default pipeline in `cognee.api.v1.cognify` in our repo.

## Vector retrieval, Graphs and LLMs

Binary file not shown. Before: 77 KiB

BIN assets/cognee_diagram.png (new file)
Binary file not shown. After: 428 KiB