Datasets
Project-level containers for organization, permissions, and processing
What is a dataset in Cognee?
A dataset is a named container that groups documents and their metadata. It is the main boundary for:
- Organizing content
- Running pipelines
- Applying permissions
- Scoping search
Add:
- Direct new content into a specific dataset (by name or ID)
- If it doesn’t exist, Cognee creates it and associates your permissions
- Items ingested are linked to that dataset and deduplicated within it
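The Add behavior above — create a dataset on first use, then deduplicate new items within it — can be sketched with a toy model. This is an illustrative sketch, not Cognee's implementation; the `DatasetStore` class and content-hash dedup are assumptions for demonstration only.

```python
import hashlib


class DatasetStore:
    """Toy model of Add semantics: create-on-first-use, dedup per dataset."""

    def __init__(self):
        self.datasets = {}  # dataset name -> set of content hashes

    def add(self, dataset_name, content):
        # Create the dataset the first time its name is used.
        items = self.datasets.setdefault(dataset_name, set())
        digest = hashlib.sha256(content.encode()).hexdigest()
        if digest in items:
            return False  # duplicate within this dataset: skipped
        items.add(digest)
        return True       # new item: linked to the dataset


store = DatasetStore()
assert store.add("research", "Paper on graphs")      # ingested
assert not store.add("research", "Paper on graphs")  # deduplicated
assert store.add("notes", "Paper on graphs")         # other dataset: kept
```

Note that dedup is scoped to one dataset: the same content added to a different dataset is stored again, because datasets are independent containers.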
Cognify:
- Choose which dataset(s) to transform into a knowledge graph
- Cognify loads each dataset’s content, checks access rights, and runs the pipeline per dataset
- If none are specified, processes all datasets you’re authorized to use
- Progress is tracked per dataset for reliable re-runs
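The dataset-selection rule above (named datasets if given and authorized, otherwise everything you may use) can be modeled in a few lines. This is a hedged sketch; `select_datasets` is a hypothetical helper, not a Cognee API.

```python
def select_datasets(requested, authorized):
    """Return the datasets a Cognify-style run would process: the requested
    ones if the caller is authorized for all of them, or every authorized
    dataset when none are named."""
    if not requested:
        return sorted(authorized)
    missing = set(requested) - set(authorized)
    if missing:
        raise PermissionError(f"not authorized for: {sorted(missing)}")
    return list(requested)


assert select_datasets([], {"research", "notes"}) == ["notes", "research"]
assert select_datasets(["research"], {"research", "notes"}) == ["research"]
```

Raising on an unauthorized dataset mirrors the rights check described above: processing never silently falls back to a dataset you cannot read.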
Search:
- Queries can be scoped by dataset
- Results and metrics remain separated by dataset
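Dataset-scoped querying can be illustrated with a minimal in-memory search over per-dataset document lists. This is a toy model under stated assumptions (substring matching, a plain dict index), not Cognee's search implementation.

```python
def search(index, query, datasets=None):
    """Toy scoped search: match a query substring, optionally limited to a
    set of dataset names. Results stay paired with their dataset."""
    return [
        (dataset, text)
        for dataset, docs in index.items()
        if datasets is None or dataset in datasets
        for text in docs
        if query.lower() in text.lower()
    ]


index = {
    "research": ["Graph theory notes"],
    "personal": ["Shopping list"],
}
assert search(index, "graph") == [("research", "Graph theory notes")]
assert search(index, "list", datasets={"research"}) == []  # out of scope
```

Because each result carries its dataset, downstream metrics can be aggregated per dataset without mixing sources.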
Access control
- Permissions (read, write, share, delete) are enforced at the dataset level
- Share one dataset with a team, keep another private
- Independently manage who can modify or distribute content
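A minimal per-dataset access-control model makes the grant semantics above concrete: read, write, share, and delete are granted per (principal, dataset) pair. The `DatasetACL` class is a hypothetical sketch, not Cognee's permission system.

```python
class DatasetACL:
    """Toy dataset-level ACL: permissions are granted per principal and
    dataset, so one dataset can be shared while another stays private."""

    def __init__(self):
        self.grants = {}  # (principal, dataset) -> set of permissions

    def grant(self, principal, dataset, *perms):
        self.grants.setdefault((principal, dataset), set()).update(perms)

    def allowed(self, principal, dataset, perm):
        return perm in self.grants.get((principal, dataset), set())


acl = DatasetACL()
acl.grant("team", "shared_docs", "read")
acl.grant("alice", "private_notes", "read", "write", "share", "delete")
assert acl.allowed("team", "shared_docs", "read")
assert not acl.allowed("team", "private_notes", "read")  # stays private
assert not acl.allowed("team", "shared_docs", "write")   # read-only share
```

Keeping write and share as separate grants is what lets you manage "who can modify" independently from "who can distribute".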
Incremental processing
- Processing status is tracked per dataset
- After you add more data, Cognify focuses on new or changed items
- Skips what’s already completed for that dataset
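The incremental behavior above reduces to: diff the dataset's items against its completion record, process only the difference, and mark it done. A minimal sketch (the `incremental_run` helper is hypothetical):

```python
def incremental_run(items, completed):
    """Process only items not yet marked complete for this dataset,
    then record them so a re-run over the same items is a no-op."""
    todo = [item for item in items if item not in completed]
    completed.update(todo)
    return todo


done = set()  # one completion record per dataset
assert incremental_run(["a.txt", "b.txt"], done) == ["a.txt", "b.txt"]
# After adding c.txt, only the new item is processed:
assert incremental_run(["a.txt", "b.txt", "c.txt"], done) == ["c.txt"]
```

Because the completion record is kept per dataset, re-running Cognify after a partial failure resumes where that dataset left off.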
Datasets vs NodeSets
Datasets scope storage, permissions, and pipeline execution; NodeSets are semantic tags within a dataset.
- During Add, you can label items with one or more NodeSet names (e.g., "AI", "FinTech")
- Cognify propagates those labels into the graph by creating `NodeSet` nodes and linking derived chunks and entities via `belongs_to_set` relationships
- This lets you slice a single dataset’s graph by topic or team without creating new datasets, while dataset-level permissions still control overall access
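Slicing a graph by NodeSet amounts to following `belongs_to_set` edges back to their sources. This sketch models the graph as a plain list of edge triples; the `members` helper and the edge representation are illustrative assumptions, not Cognee's graph API.

```python
# Graph modeled as (source, relationship, target) triples, where targets of
# belongs_to_set edges are NodeSet labels applied at Add time.
edges = [
    ("chunk_1", "belongs_to_set", "AI"),
    ("entity_llm", "belongs_to_set", "AI"),
    ("chunk_2", "belongs_to_set", "FinTech"),
]


def members(edges, nodeset):
    """All nodes linked to a NodeSet via belongs_to_set relationships."""
    return {
        src for src, rel, dst in edges
        if rel == "belongs_to_set" and dst == nodeset
    }


assert members(edges, "AI") == {"chunk_1", "entity_llm"}
assert members(edges, "FinTech") == {"chunk_2"}
```

All three nodes live in one dataset, so one permission boundary covers them, while the NodeSet labels give a finer topical slice.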