Code Graph
Step-by-step guide to building code-level graphs from repositories
A minimal guide to building a code-level graph from a repository and searching it. The pipeline parses your repo, extracts code entities and dependencies, and optionally processes non-code docs alongside.
Before you start:
- Complete Quickstart to understand basic operations
- Ensure you have LLM Providers configured
- Have a local repository path (absolute or relative)
What Code Graph Does
- Scans a repo for supported languages and builds code nodes/edges (files, symbols, imports, call/dependency links)
- Optional: includes non-code files (markdown, docs) as a standard knowledge graph
- Enables SearchType.CODE for code-aware queries
Code in Action
import asyncio
import cognee
from cognee import SearchType
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline

async def main():
    repo_path = "/path/to/your/repo"  # folder root

    # Build the code graph (code only)
    async for _ in run_code_graph_pipeline(repo_path, include_docs=False):
        pass

    # Ask a code question
    results = await cognee.search(query_type=SearchType.CODE, query_text="Where is Foo used?")
    print(results)

asyncio.run(main())
This simple example uses a basic repository for demonstration. In practice, you can process large codebases with multiple languages and complex dependency structures.
What Just Happened
Step 1: Build the Code Graph
async for _ in run_code_graph_pipeline(repo_path, include_docs=False):
    pass
This scans your repository for supported languages and builds code nodes/edges. The pipeline handles file parsing, symbol extraction, and dependency analysis automatically.
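The `async for _ in ...: pass` loop in Step 1 can look odd at first. Because the pipeline is consumed with `async for`, its work advances as the loop pulls each item, so the loop has to run to the end even when the yielded values are discarded. The sketch below shows the same pattern with a stand-in generator. `fake_pipeline`, its stage names, and its status strings are hypothetical; what the real pipeline yields is not described in this guide.

```python
import asyncio
from typing import AsyncIterator


async def fake_pipeline(steps: list[str]) -> AsyncIterator[str]:
    """Hypothetical stand-in for a pipeline that yields one status per stage."""
    for step in steps:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield f"{step}: done"


async def drain(pipeline: AsyncIterator[str]) -> list[str]:
    """Run the generator to completion, keeping each yielded status."""
    statuses = []
    async for status in pipeline:
        statuses.append(status)
    return statuses


statuses = asyncio.run(drain(fake_pipeline(["parse", "extract", "link"])))
print(statuses)
```

If you want progress output from the real pipeline, one option is to replace `_` with a named variable and print it inside the loop, then inspect what comes back.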
Step 2: Search Your Code
results = await cognee.search(query_type=SearchType.CODE, query_text="Where is Foo used?")
Use SearchType.CODE to ask code-aware questions about your repository. This searches through the extracted code structure, not just text content.
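To make the difference between structural search and text search concrete, here is a toy sketch. The edge list, file contents, and helper functions are all hypothetical and are not cognee's data model. They only illustrate why a lookup over extracted relationships can answer "Where is Foo used?" more precisely than a substring match.

```python
# Hypothetical edge list: (source symbol, relation, target symbol).
EDGES = [
    ("app.main", "calls", "models.Foo"),
    ("services.build_report", "calls", "models.Foo"),
    ("services.build_report", "imports", "utils.format_date"),
    ("tests.test_foo", "calls", "models.Foo"),
]

# Raw file text, where "Foo" also shows up in a comment and inside another name.
FILES = {
    "app.py": "from models import Foo\nFoo()\n",
    "notes.py": "# TODO: rename FooBar, unrelated to Foo\n",
}


def find_usages(symbol: str) -> list[str]:
    """Return symbols with a 'calls' edge pointing at `symbol`."""
    return sorted(src for src, rel, dst in EDGES if rel == "calls" and dst == symbol)


def text_matches(needle: str) -> list[str]:
    """Return files whose raw text contains `needle` anywhere."""
    return sorted(name for name, text in FILES.items() if needle in text)


usages = find_usages("models.Foo")
text_hits = text_matches("Foo")
print(usages)
print(text_hits)
```

The structural lookup returns only the three callers, while the text match also flags `notes.py`, which mentions the name without using it.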
Include Documentation (Optional)
To also process non-code files from the repo, set include_docs=True (slower, because text is processed with an LLM):
async for _ in run_code_graph_pipeline(repo_path, include_docs=True):
    pass
This processes markdown files, documentation, and other text files alongside your code, creating a comprehensive knowledge graph.
Advanced Options
async for _ in run_code_graph_pipeline(
    repo_path,
    include_docs=False,
    excluded_paths=["**/node_modules/**", "**/dist/**"],
    supported_languages=["python", "typescript"],
):
    pass
- excluded_paths: List of paths (globs) to skip, e.g., tests, build folders
- supported_languages: Narrow to certain languages to speed up processing
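Before a long run, it can help to preview which paths a set of glob patterns would skip. The sketch below uses Python's standard `fnmatch` rules as a stand-in. This is an assumption: this guide does not say how the pipeline matches `excluded_paths`, so treat the result as an approximation. The `is_excluded` helper and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

EXCLUDED = ["**/node_modules/**", "**/dist/**"]


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if `path` matches any pattern under fnmatch rules (an assumption)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


paths = [
    "packages/web/node_modules/lib/index.js",
    "packages/web/dist/bundle.js",
    "packages/web/src/app.ts",
    "node_modules/lib/index.js",  # no leading directory
]
kept = [p for p in paths if not is_excluded(p, EXCLUDED)]
print(kept)
```

Under `fnmatch` rules, `**/node_modules/**` requires a slash before `node_modules`, so the root-level `node_modules/lib/index.js` is not matched. Whether the pipeline treats root-level folders the same way is worth checking on your own repository.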
Visualize Your Graph (Optional)
from cognee.api.v1.visualize.visualize import visualize_graph

# Call from inside an async function, such as main() above
await visualize_graph("./graph_code.html")
Generate an HTML visualization of your code graph to explore the structure and relationships.
What Happens Under the Hood
run_code_graph_pipeline(...) automatically handles:
- Repository scanning and file parsing
- Code entity extraction (functions, classes, imports, calls)
- Dependency analysis and relationship mapping
- Database initialization and setup
- Optional documentation processing with LLM
Once complete, your code graph is ready for search and analysis.
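To give a feel for the code entity extraction step listed above, here is a minimal sketch that uses Python's standard `ast` module to pull classes, functions, imports, and calls out of a source string. It is an illustration only and not how cognee parses code. The `extract_entities` helper and the sample source are hypothetical.

```python
import ast

SOURCE = '''
import os
from pathlib import Path

class Foo:
    def run(self):
        return Path(os.getcwd())

def main():
    Foo().run()
'''


def extract_entities(source: str) -> dict[str, list[str]]:
    """Collect class, function, import, and call names from Python source."""
    found = {"classes": [], "functions": [], "imports": [], "calls": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.Import):
            found["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            found["imports"].append(node.module or "")
        elif isinstance(node, ast.Call):
            func = node.func
            # Plain names (Foo) and attribute access (os.getcwd) use different fields.
            if isinstance(func, ast.Name):
                found["calls"].append(func.id)
            elif isinstance(func, ast.Attribute):
                found["calls"].append(func.attr)
    return {kind: sorted(names) for kind, names in found.items()}


entities = extract_entities(SOURCE)
print(entities)
```

A real pipeline would also need to resolve each name to the file and symbol it refers to, which is what turns a flat list like this into graph edges.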