# Pipelines

Orchestrating tasks into coordinated workflows for data processing
## What pipelines are
Pipelines coordinate ordered Tasks into a reproducible workflow. Default Cognee operations like Add and Cognify run on top of the same execution layer. You typically do not call low-level functions directly; you trigger pipelines through these operations.
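To make the "same execution layer" idea concrete, here is a minimal, self-contained sketch. The names and signatures are hypothetical stand-ins (cognee's real `add` and `cognify` take different arguments); the point is only that both operations delegate to one shared pipeline runner:

```python
import asyncio

# Hypothetical sketch: both high-level operations trigger the same
# execution layer instead of calling low-level functions directly.
async def run_pipeline(tasks: list, payload):
    # The shared execution layer: run the payload through ordered tasks.
    for task in tasks:
        payload = task(payload)
    return payload

async def add(text: str):
    # "Add" is modeled as a pipeline of ingestion-style tasks.
    return await run_pipeline([str.strip, str.lower], text)

async def cognify(text: str):
    # "Cognify" is modeled as another pipeline reusing the same layer.
    return await run_pipeline([str.split], text)

stored = asyncio.run(add("  Hello World  "))
print(stored)  # "hello world"
print(asyncio.run(cognify(stored)))  # ["hello", "world"]
```

In the real library the task bodies do ingestion and graph construction, but the control flow mirrors this shape: operations are thin triggers over pipelines.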
## Prerequisites
- Dataset: a container (name or UUID) where your data is stored and processed. Every document added to cognee belongs to a dataset.
- User: the identity for ownership and access control. A default user is created and used if none is provided.
- More details on datasets and users are available below
## How pipelines run
Somewhat unsurprisingly, the function used to run pipelines is called `run_pipeline`.
Cognee uses a layered execution model: a single call to `run_pipeline` orchestrates multi-dataset processing, running a per-file pipeline for each dataset that pushes its data through the ordered sequence of tasks.
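The layered model can be sketched as nested loops of async code. This is a simplified illustration, not cognee's actual implementation: the function names mirror the description above, and the toy tasks stand in for real extraction and graph-building steps:

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

# A task is any async function that transforms a payload.
Task = Callable[[Any], Awaitable[Any]]

async def run_tasks(data: Any, tasks: list[Task]) -> Any:
    # Inner layer: run one file's payload through the ordered tasks.
    for task in tasks:
        data = await task(data)
    return data

async def run_pipeline(
    datasets: dict[str, list[Any]], tasks: list[Task]
) -> AsyncIterator[dict]:
    # Outer layer: orchestrate multi-dataset processing and yield a
    # status after each dataset completes.
    for name, files in datasets.items():
        results = [await run_tasks(f, tasks) for f in files]
        yield {"dataset": name, "status": "COMPLETED", "results": results}

async def uppercase(text: str) -> str:
    return text.upper()

async def count_words(text: str) -> int:
    return len(text.split())

async def main() -> list[dict]:
    statuses = []
    async for status in run_pipeline(
        {"docs": ["hello world", "one two three"]}, [uppercase, count_words]
    ):
        statuses.append(status)
    return statuses

statuses = asyncio.run(main())
print(statuses[0]["results"])  # [2, 3]
```

The real runner adds access checks, persistence, and error handling at each layer, but the shape — one orchestrating call yielding statuses from per-file task sequences — is the same.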
- Statuses are yielded as the pipeline runs and persisted to the databases where appropriate
- User access to datasets and files is verified at each layer
- Pipeline run information includes dataset IDs, completion status, and any errors raised
- Background execution uses queues to manage status updates and avoid database conflicts
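The last point — queued status updates — can be illustrated with a small self-contained sketch (again hypothetical names, not cognee's internals): many concurrent pipelines produce status updates, but a single consumer drains the queue, so writes are serialized and cannot conflict:

```python
import asyncio

async def pipeline_worker(name: str, queue: asyncio.Queue) -> None:
    # Each background pipeline pushes status updates into a shared queue
    # instead of writing to the database directly.
    for status in ("STARTED", "COMPLETED"):
        await queue.put((name, status))

async def status_writer(queue: asyncio.Queue, store: list) -> None:
    # A single consumer drains the queue; because only this coroutine
    # touches the store, concurrent pipelines cannot produce write conflicts.
    while True:
        item = await queue.get()
        if item is None:  # sentinel: all producers are done
            break
        store.append(item)

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    store: list = []
    writer = asyncio.create_task(status_writer(queue, store))
    await asyncio.gather(*(pipeline_worker(f"pipeline_{i}", queue) for i in range(3)))
    await queue.put(None)
    await writer
    return store

updates = asyncio.run(main())
print(len(updates))  # 6 status updates, written by one serial consumer
```

In the sketch the "database" is just a list; the design point is that a single serialized writer behind a queue removes the need for locking across concurrently running pipelines.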