# DataPoints > Atomic units of knowledge in Cognee # DataPoints: Atomic Units of Knowledge DataPoints are the smallest building blocks in Cognee.\ They represent **atomic units of knowledge** — carrying both your actual content and the context needed to process, index, and connect it. They're the reason Cognee can turn raw documents into something that's both **searchable** (via vectors) and **connected** (via graphs). ## What are DataPoints * **Atomic** — each DataPoint represents one concept or unit of information. * **Structured** — implemented as [Pydantic](https://docs.pydantic.dev/) models for validation and serialization. * **Contextual** — carry provenance, versioning, and indexing hints so every step downstream knows where data came from and how to use it. ## Core Structure A DataPoint is just a Pydantic model with a set of standard fields. ```python theme={null} class DataPoint(BaseModel): id: UUID = Field(default_factory=uuid4) created_at: int = ... updated_at: int = ... version: int = 1 topological_rank: Optional[int] = 0 metadata: Optional[dict] = {"index_fields": []} type: str = "DataPoint" belongs_to_set: Optional[List["DataPoint"]] = None ``` Key fields: * `id` — unique identifier * `created_at`, `updated_at` — timestamps (ms since epoch) * `version` — for tracking changes and schema evolution * `metadata.index_fields` — critical: determines which fields are embedded for vector search * `type` — class name * `belongs_to_set` — groups related DataPoints ## Indexing & Embeddings The `metadata.index_fields` tells Cognee which fields to embed into the vector store. This is the mechanism behind semantic search. * Fields in `index_fields` → converted into embeddings * Each indexed field → its own vector collection (`Class_field`) * Non-indexed fields → stay as regular properties * Choosing what to index controls search granularity ## From DataPoints to the Graph When you call `add_data_points()`, Cognee automatically: * Embeds the indexed fields into vectors * Converts the object into **nodes** and **edges** in the knowledge graph * Stores provenance in the relational store This is how Cognee creates both **semantic similarity** (vector) and **structural reasoning** (graph) from the same unit. ## Examples and details ```python theme={null} class Person(DataPoint): name: str age: int metadata: dict = {"index_fields": ["name"]} ``` Only `"name"` is semantically searchable ```python theme={null} class Book(DataPoint): title: str author: Author metadata: dict = {"index_fields": ["title"]} # Produces: # `Node(Book)` with `{title, type, ...}` # Node(Author) with {name, type, ...} # Edge(Book → Author, type="author") ``` ```python theme={null} # Simple relationship `author: Author` # With edge metadata `has_items: (Edge(weight=0.8), list[Item])` # List relationship `chapters: list[Chapter]` ``` Cognee ships with several built-in DataPoint types: * **Documents** — wrappers for source files (Text, PDF, Audio, Image) * `Document` (`metadata.index_fields=["name"]`) * **Chunks** — segmented portions of documents * `DocumentChunk` (`metadata.index_fields=["text"]`) * **Summaries** — generated text or code summaries * `TextSummary` / `CodeSummary` (`metadata.index_fields=["text"]`) * **Entities** — named objects (people, places, concepts) * `Entity`, `EntityType` (`metadata.index_fields=["name"]`) * **Edges** — relationships between DataPoints * `Edge` — links between DataPoints ```python theme={null} class Product(DataPoint): name: str description: str price: float category: Category # Index name + description for search metadata: dict = {"index_fields": ["name", "description"]} ``` **Best Practices:** * **Keep it small** — one concept per DataPoint * **Index carefully** — only fields that matter for semantic search * **Use built-in types first** — extend with custom subclasses when needed * **Version deliberately** — track changes with `version` * **Group related points** — with `belongs_to_set` Learn how DataPoints are created and processed See how DataPoints flow through processing workflows Understand how DataPoints are used in Add, Cognify, and Search --- > To find navigation and other pages in this documentation, fetch the llms.txt file at: https://docs.cognee.ai/llms.txt