Main merge vol9 (#1994)

<!-- .github/pull_request_template.md -->

## Description
Resolve conflicts and merge commits from main into dev

## Acceptance Criteria
<!--
* Key requirements for the new feature or modification;
* Proof that the changes work and meet the requirements;
* Instructions on how to verify the changes, including how to test them
locally;
* Proof that they are sufficiently tested.
-->

## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->

## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Add top_k to control number of search results
  * Add verbose option to include/exclude detailed graphs in search output

* **Improvements**
  * Examples now use pretty-printed output for clearer readability
  * Startup handles migration failures more gracefully with a fallback initialization path

* **Documentation**
  * Updated contributing guidance and added explicit run instructions for examples

* **Chores**
  * Project version bumped to 0.5.1
  * Adjusted frontend framework version constraint

* **Tests**
  * Updated tests to exercise verbose search behavior

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Committed by Igor Ilic on 2026-01-13 17:28:03 +01:00 via GitHub, commit dd16ba89c3.
No known key found for this signature in database; GPG key ID: B5690EEEBB952194.
28 changed files with 5725 additions and 7372 deletions.


@ -76,7 +76,7 @@ git clone https://github.com/<your-github-username>/cognee.git
cd cognee
```
If you are working on Vector and Graph Adapters:
1. Fork the [**cognee**](https://github.com/topoteretes/cognee-community) repository
1. Fork the [**cognee-community**](https://github.com/topoteretes/cognee-community) repository
2. Clone your fork:
```shell
git clone https://github.com/<your-github-username>/cognee-community.git
@ -120,6 +120,21 @@ or
uv run python examples/python/simple_example.py
```
### Running the Simple Example
Rename `.env.example` to `.env` and provide your `OPENAI_API_KEY` as `LLM_API_KEY`.
Make sure to run `uv sync` in the root of the cloned folder, or set up a virtual environment, before running cognee:
```shell
python cognee/cognee/examples/python/simple_example.py
```
or
```shell
uv run python cognee/cognee/examples/python/simple_example.py
```
## 4. 📤 Submitting Changes
1. Make sure that `pre-commit` and its hooks are installed; see the `Required tools` section for more information. If you are not sure, try executing `pre-commit run`.


@ -126,6 +126,7 @@ Now, run a minimal pipeline:
```python
import cognee
import asyncio
from pprint import pprint
async def main():
@ -143,7 +144,7 @@ async def main():
# Display the results
for result in results:
print(result)
pprint(result)
if __name__ == '__main__':
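The switch from `print` to `pprint` matters once results are nested; a standalone sketch with a hypothetical result shape (not cognee's exact schema):

```python
from pprint import pformat, pprint

# Hypothetical search payload: a list of nested dicts.
results = [
    {
        "search_result": ["Cognee builds knowledge graphs from your data."],
        "dataset_name": "main_dataset",
        "dataset_id": "c1a2b3",
    }
]

for result in results:
    pprint(result)  # wraps and indents nested structures instead of one long line

formatted = pformat(results[0])
```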

File diff suppressed because it is too large.


@ -13,7 +13,7 @@
"classnames": "^2.5.1",
"culori": "^4.0.1",
"d3-force-3d": "^3.0.6",
"next": "^16.1.7",
"next": "^16.1.1",
"react": "^19.2.3",
"react-dom": "^19.2.3",
"react-force-graph-2d": "^1.27.1",


@ -192,7 +192,7 @@ class CogneeClient:
with redirect_stdout(sys.stderr):
results = await self.cognee.search(
query_type=SearchType[query_type.upper()], query_text=query_text
query_type=SearchType[query_type.upper()], query_text=query_text, top_k=top_k
)
return results


@ -316,7 +316,7 @@ async def save_interaction(data: str) -> list:
@mcp.tool()
async def search(search_query: str, search_type: str) -> list:
async def search(search_query: str, search_type: str, top_k: int = 10) -> list:
"""
Search and query the knowledge graph for insights, information, and connections.
@ -389,6 +389,13 @@ async def search(search_query: str, search_type: str) -> list:
The search_type is case-insensitive and will be converted to uppercase.
top_k : int, optional
Maximum number of results to return (default: 10).
Controls the amount of context retrieved from the knowledge graph.
- Lower values (3-5): Faster, more focused results
- Higher values (10-20): More comprehensive, but slower and more context-heavy
Helps manage response size and context window usage in MCP clients.
Returns
-------
list
@ -425,13 +432,32 @@ async def search(search_query: str, search_type: str) -> list:
"""
async def search_task(search_query: str, search_type: str) -> str:
"""Search the knowledge graph"""
async def search_task(search_query: str, search_type: str, top_k: int) -> str:
"""
Internal task to execute knowledge graph search with result formatting.
Handles the actual search execution and formats results appropriately
for MCP clients based on the search type and execution mode (API vs direct).
Parameters
----------
search_query : str
The search query in natural language
search_type : str
Type of search to perform (GRAPH_COMPLETION, CHUNKS, etc.)
top_k : int
Maximum number of results to return
Returns
-------
str
Formatted search results as a string, with format depending on search_type
"""
# NOTE: MCP uses stdout to communicate, we must redirect all output
# going to stdout ( like the print function ) to stderr.
with redirect_stdout(sys.stderr):
search_results = await cognee_client.search(
query_text=search_query, query_type=search_type
query_text=search_query, query_type=search_type, top_k=top_k
)
# Handle different result formats based on API vs direct mode
@ -465,7 +491,7 @@ async def search(search_query: str, search_type: str) -> list:
else:
return str(search_results)
search_results = await search_task(search_query, search_type)
search_results = await search_task(search_query, search_type, top_k)
return [types.TextContent(type="text", text=search_results)]
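The trade-off the docstring describes amounts to truncating retrieved context; a hypothetical sketch (not cognee's actual retriever):

```python
def limit_results(results: list, top_k: int = 10) -> list:
    # Lower top_k (3-5): faster, more focused results; higher (10-20): more
    # comprehensive but heavier on the MCP client's context window.
    return results[:top_k]

hits = [f"triplet_{i}" for i in range(25)]
focused = limit_results(hits, top_k=3)
broad = limit_results(hits, top_k=20)
```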


@ -36,6 +36,7 @@ async def search(
session_id: Optional[str] = None,
wide_search_top_k: Optional[int] = 100,
triplet_distance_penalty: Optional[float] = 3.5,
verbose: bool = False,
) -> Union[List[SearchResult], CombinedSearchResult]:
"""
Search and query the knowledge graph for insights, information, and connections.
@ -126,6 +127,8 @@ async def search(
session_id: Optional session identifier for caching Q&A interactions. Defaults to 'default_session' if None.
verbose: If True, returns detailed result information including graph representation (when possible).
Returns:
list: Search results in format determined by query_type:
@ -218,6 +221,7 @@ async def search(
session_id=session_id,
wide_search_top_k=wide_search_top_k,
triplet_distance_penalty=triplet_distance_penalty,
verbose=verbose,
)
return filtered_search_results


@ -15,3 +15,9 @@ async def setup():
"""
await create_relational_db_and_tables()
await create_pgvector_db_and_tables()
if __name__ == "__main__":
import asyncio
asyncio.run(setup())
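The added `__main__` guard makes the setup module directly runnable; the same entrypoint pattern in isolation, with a stub coroutine standing in for the real table creation:

```python
import asyncio


async def setup() -> str:
    # Stand-in for create_relational_db_and_tables() and
    # create_pgvector_db_and_tables(); the real functions touch a database.
    return "tables created"


if __name__ == "__main__":
    print(asyncio.run(setup()))
```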


@ -49,6 +49,7 @@ async def search(
session_id: Optional[str] = None,
wide_search_top_k: Optional[int] = 100,
triplet_distance_penalty: Optional[float] = 3.5,
verbose: bool = False,
) -> Union[CombinedSearchResult, List[SearchResult]]:
"""
@ -140,6 +141,7 @@ async def search(
)
if use_combined_context:
# Note: combined context search must always be verbose and return a CombinedSearchResult with graphs info
prepared_search_results = await prepare_search_result(
search_results[0] if isinstance(search_results, list) else search_results
)
@ -173,25 +175,30 @@ async def search(
datasets = prepared_search_results["datasets"]
if only_context:
return_value.append(
{
"search_result": [context] if context else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
"graphs": graphs,
}
)
search_result_dict = {
"search_result": [context] if context else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
}
if verbose:
# Include graphs only in verbose mode
search_result_dict["graphs"] = graphs
return_value.append(search_result_dict)
else:
return_value.append(
{
"search_result": [result] if result else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
"graphs": graphs,
}
)
search_result_dict = {
"search_result": [result] if result else None,
"dataset_id": datasets[0].id,
"dataset_name": datasets[0].name,
"dataset_tenant_id": datasets[0].tenant_id,
}
if verbose:
# Include graphs only in verbose mode
search_result_dict["graphs"] = graphs
return_value.append(search_result_dict)
return return_value
else:
return_value = []
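The refactor above follows a common pattern: build the base dict, then attach optional keys only when requested. A minimal standalone sketch (names hypothetical, not cognee's signatures):

```python
def build_search_result(result, dataset_name, graphs=None, verbose: bool = False) -> dict:
    search_result_dict = {
        "search_result": [result] if result else None,
        "dataset_name": dataset_name,
    }
    if verbose:
        # Graphs are attached only when the caller asks for detailed output,
        # keeping the default response small.
        search_result_dict["graphs"] = graphs
    return search_result_dict


lean = build_search_result("answer", "docs")
rich = build_search_result("answer", "docs", graphs={"nodes": []}, verbose=True)
```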


@ -92,7 +92,7 @@ async def cognee_network_visualization(graph_data, destination_file_path: str =
}
links_list.append(link_data)
html_template = """
html_template = r"""
<!DOCTYPE html>
<html>
<head>
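The `r"""` prefix stops Python from interpreting backslash sequences inside the embedded HTML/JavaScript template; a minimal illustration:

```python
plain = "console.log('a\nb')"  # \n is interpreted: the string contains a real newline
raw = r"console.log('a\nb')"   # raw string keeps the two characters backslash and n
```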


@ -12,7 +12,7 @@ logger = get_logger("extract_usage_frequency")
async def extract_usage_frequency(
subgraphs: List[CogneeGraph],
time_window: timedelta = timedelta(days=7),
min_interaction_threshold: int = 1
min_interaction_threshold: int = 1,
) -> Dict[str, Any]:
"""
Extract usage frequency from CogneeUserInteraction nodes.
@ -48,11 +48,13 @@ async def extract_usage_frequency(
# Find all CogneeUserInteraction nodes
interaction_nodes = {}
for node_id, node in subgraph.nodes.items():
node_type = node.attributes.get('type') or node.attributes.get('node_type')
node_type = node.attributes.get("type") or node.attributes.get("node_type")
if node_type == 'CogneeUserInteraction':
if node_type == "CogneeUserInteraction":
# Parse and validate timestamp
timestamp_value = node.attributes.get('timestamp') or node.attributes.get('created_at')
timestamp_value = node.attributes.get("timestamp") or node.attributes.get(
"created_at"
)
if timestamp_value is not None:
try:
# Handle various timestamp formats
@ -81,20 +83,20 @@ async def extract_usage_frequency(
else:
# ISO format string
interaction_time = datetime.fromisoformat(timestamp_value)
elif hasattr(timestamp_value, 'to_native'):
elif hasattr(timestamp_value, "to_native"):
# Neo4j datetime object - convert to Python datetime
interaction_time = timestamp_value.to_native()
elif hasattr(timestamp_value, 'year') and hasattr(timestamp_value, 'month'):
elif hasattr(timestamp_value, "year") and hasattr(timestamp_value, "month"):
# Datetime-like object - extract components
try:
interaction_time = datetime(
year=timestamp_value.year,
month=timestamp_value.month,
day=timestamp_value.day,
hour=getattr(timestamp_value, 'hour', 0),
minute=getattr(timestamp_value, 'minute', 0),
second=getattr(timestamp_value, 'second', 0),
microsecond=getattr(timestamp_value, 'microsecond', 0)
hour=getattr(timestamp_value, "hour", 0),
minute=getattr(timestamp_value, "minute", 0),
second=getattr(timestamp_value, "second", 0),
microsecond=getattr(timestamp_value, "microsecond", 0),
)
except (AttributeError, ValueError):
pass
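The branching above normalizes several timestamp shapes; the core cases can be sketched independently (simplified, without the full adapter handling):

```python
from datetime import datetime


def parse_timestamp(value):
    """Normalize epoch-milliseconds, ISO strings, or datetime-like objects."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value / 1000)  # epoch in milliseconds
    if isinstance(value, str):
        return datetime.fromisoformat(value)  # ISO-8601 string
    if hasattr(value, "to_native"):
        return value.to_native()  # e.g. a Neo4j datetime object
    raise TypeError(f"unsupported timestamp: {value!r}")
```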
@ -119,23 +121,27 @@ async def extract_usage_frequency(
interaction_time = interaction_time.replace(tzinfo=None)
interaction_nodes[node_id] = {
'node': node,
'timestamp': interaction_time,
'in_window': interaction_time >= cutoff_time
"node": node,
"timestamp": interaction_time,
"in_window": interaction_time >= cutoff_time,
}
interaction_count += 1
if interaction_time >= cutoff_time:
interactions_in_window += 1
except (ValueError, TypeError, AttributeError, OSError) as e:
logger.warning(f"Failed to parse timestamp for interaction node {node_id}: {e}")
logger.debug(f"Timestamp value type: {type(timestamp_value)}, value: {timestamp_value}")
logger.warning(
f"Failed to parse timestamp for interaction node {node_id}: {e}"
)
logger.debug(
f"Timestamp value type: {type(timestamp_value)}, value: {timestamp_value}"
)
# Process edges to find graph elements used in interactions
for edge in subgraph.edges:
relationship_type = edge.attributes.get('relationship_type')
relationship_type = edge.attributes.get("relationship_type")
# Look for 'used_graph_element_to_answer' edges
if relationship_type == 'used_graph_element_to_answer':
if relationship_type == "used_graph_element_to_answer":
# node1 should be the CogneeUserInteraction, node2 is the graph element
source_id = str(edge.node1.id)
target_id = str(edge.node2.id)
@ -144,19 +150,23 @@ async def extract_usage_frequency(
if source_id in interaction_nodes:
interaction_data = interaction_nodes[source_id]
if interaction_data['in_window']:
if interaction_data["in_window"]:
# Count the graph element (target node) being used
node_frequencies[target_id] = node_frequencies.get(target_id, 0) + 1
# Also track what type of element it is for analytics
target_node = subgraph.get_node(target_id)
if target_node:
element_type = target_node.attributes.get('type') or target_node.attributes.get('node_type')
element_type = target_node.attributes.get(
"type"
) or target_node.attributes.get("node_type")
if element_type:
relationship_type_frequencies[element_type] = relationship_type_frequencies.get(element_type, 0) + 1
relationship_type_frequencies[element_type] = (
relationship_type_frequencies.get(element_type, 0) + 1
)
# Also track general edge usage patterns
elif relationship_type and relationship_type != 'used_graph_element_to_answer':
elif relationship_type and relationship_type != "used_graph_element_to_answer":
# Check if either endpoint is referenced in a recent interaction
source_id = str(edge.node1.id)
target_id = str(edge.node2.id)
@ -168,12 +178,14 @@ async def extract_usage_frequency(
# Filter frequencies above threshold
filtered_node_frequencies = {
node_id: freq for node_id, freq in node_frequencies.items()
node_id: freq
for node_id, freq in node_frequencies.items()
if freq >= min_interaction_threshold
}
filtered_edge_frequencies = {
edge_key: freq for edge_key, freq in edge_frequencies.items()
edge_key: freq
for edge_key, freq in edge_frequencies.items()
if freq >= min_interaction_threshold
}
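The threshold filter reformatted above is a plain dict comprehension; in isolation:

```python
node_frequencies = {"n1": 5, "n2": 1, "n3": 3}
min_interaction_threshold = 3

# Keep only elements accessed at least min_interaction_threshold times.
filtered = {
    node_id: freq
    for node_id, freq in node_frequencies.items()
    if freq >= min_interaction_threshold
}
```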
@ -187,20 +199,19 @@ async def extract_usage_frequency(
logger.info(f"Element type distribution: {relationship_type_frequencies}")
return {
'node_frequencies': filtered_node_frequencies,
'edge_frequencies': filtered_edge_frequencies,
'element_type_frequencies': relationship_type_frequencies,
'total_interactions': interaction_count,
'interactions_in_window': interactions_in_window,
'time_window_days': time_window.days,
'last_processed_timestamp': current_time.isoformat(),
'cutoff_timestamp': cutoff_time.isoformat()
"node_frequencies": filtered_node_frequencies,
"edge_frequencies": filtered_edge_frequencies,
"element_type_frequencies": relationship_type_frequencies,
"total_interactions": interaction_count,
"interactions_in_window": interactions_in_window,
"time_window_days": time_window.days,
"last_processed_timestamp": current_time.isoformat(),
"cutoff_timestamp": cutoff_time.isoformat(),
}
async def add_frequency_weights(
graph_adapter: GraphDBInterface,
usage_frequencies: Dict[str, Any]
graph_adapter: GraphDBInterface, usage_frequencies: Dict[str, Any]
) -> None:
"""
Add frequency weights to graph nodes and edges using the graph adapter.
@ -214,8 +225,8 @@ async def add_frequency_weights(
:param graph_adapter: Graph database adapter interface
:param usage_frequencies: Calculated usage frequencies from extract_usage_frequency
"""
node_frequencies = usage_frequencies.get('node_frequencies', {})
edge_frequencies = usage_frequencies.get('edge_frequencies', {})
node_frequencies = usage_frequencies.get("node_frequencies", {})
edge_frequencies = usage_frequencies.get("edge_frequencies", {})
logger.info(f"Adding frequency weights to {len(node_frequencies)} nodes")
@ -227,15 +238,17 @@ async def add_frequency_weights(
nodes_failed = 0
# Determine which method to use based on adapter type
use_neo4j_cypher = adapter_type == 'Neo4jAdapter' and hasattr(graph_adapter, 'query')
use_kuzu_query = adapter_type == 'KuzuAdapter' and hasattr(graph_adapter, 'query')
use_get_update = hasattr(graph_adapter, 'get_node_by_id') and hasattr(graph_adapter, 'update_node_properties')
use_neo4j_cypher = adapter_type == "Neo4jAdapter" and hasattr(graph_adapter, "query")
use_kuzu_query = adapter_type == "KuzuAdapter" and hasattr(graph_adapter, "query")
use_get_update = hasattr(graph_adapter, "get_node_by_id") and hasattr(
graph_adapter, "update_node_properties"
)
# Method 1: Neo4j Cypher with SET (creates properties on the fly)
if use_neo4j_cypher:
try:
logger.info("Using Neo4j Cypher SET method")
last_updated = usage_frequencies.get('last_processed_timestamp')
last_updated = usage_frequencies.get("last_processed_timestamp")
for node_id, frequency in node_frequencies.items():
try:
@ -250,10 +263,10 @@ async def add_frequency_weights(
result = await graph_adapter.query(
query,
params={
'node_id': node_id,
'frequency': frequency,
'updated_at': last_updated
}
"node_id": node_id,
"frequency": frequency,
"updated_at": last_updated,
},
)
if result and len(result) > 0:
@ -273,9 +286,11 @@ async def add_frequency_weights(
use_neo4j_cypher = False
# Method 2: Kuzu - use get_node + add_node (updates via re-adding with same ID)
elif use_kuzu_query and hasattr(graph_adapter, 'get_node') and hasattr(graph_adapter, 'add_node'):
elif (
use_kuzu_query and hasattr(graph_adapter, "get_node") and hasattr(graph_adapter, "add_node")
):
logger.info("Using Kuzu get_node + add_node method")
last_updated = usage_frequencies.get('last_processed_timestamp')
last_updated = usage_frequencies.get("last_processed_timestamp")
for node_id, frequency in node_frequencies.items():
try:
@ -284,8 +299,8 @@ async def add_frequency_weights(
if existing_node_dict:
# Update the dict with new properties
existing_node_dict['frequency_weight'] = frequency
existing_node_dict['frequency_updated_at'] = last_updated
existing_node_dict["frequency_weight"] = frequency
existing_node_dict["frequency_updated_at"] = last_updated
# Kuzu's add_node likely just takes the dict directly, not a Node object
# Try passing the dict directly first
@ -298,15 +313,16 @@ async def add_frequency_weights(
try:
from cognee.infrastructure.engine import Node
# Try different Node constructor patterns
try:
# Pattern 1: Just properties
node_obj = Node(existing_node_dict)
except:
except Exception:
# Pattern 2: Type and properties
node_obj = Node(
type=existing_node_dict.get('type', 'Unknown'),
**existing_node_dict
type=existing_node_dict.get("type", "Unknown"),
**existing_node_dict,
)
await graph_adapter.add_node(node_obj)
@ -335,13 +351,15 @@ async def add_frequency_weights(
if node_data:
# Tweak the properties dict - add frequency_weight
if isinstance(node_data, dict):
properties = node_data.get('properties', {})
properties = node_data.get("properties", {})
else:
properties = getattr(node_data, 'properties', {}) or {}
properties = getattr(node_data, "properties", {}) or {}
# Update with frequency weight
properties['frequency_weight'] = frequency
properties['frequency_updated_at'] = usage_frequencies.get('last_processed_timestamp')
properties["frequency_weight"] = frequency
properties["frequency_updated_at"] = usage_frequencies.get(
"last_processed_timestamp"
)
# Write back via adapter
await graph_adapter.update_node_properties(node_id, properties)
@ -363,13 +381,15 @@ async def add_frequency_weights(
if node_data:
# Tweak the properties dict - add frequency_weight
if isinstance(node_data, dict):
properties = node_data.get('properties', {})
properties = node_data.get("properties", {})
else:
properties = getattr(node_data, 'properties', {}) or {}
properties = getattr(node_data, "properties", {}) or {}
# Update with frequency weight
properties['frequency_weight'] = frequency
properties['frequency_updated_at'] = usage_frequencies.get('last_processed_timestamp')
properties["frequency_weight"] = frequency
properties["frequency_updated_at"] = usage_frequencies.get(
"last_processed_timestamp"
)
# Write back via adapter
await graph_adapter.update_node_properties(node_id, properties)
@ -385,7 +405,9 @@ async def add_frequency_weights(
# If no method is available
if not use_neo4j_cypher and not use_kuzu_query and not use_get_update:
logger.error(f"Adapter {adapter_type} does not support required update methods")
logger.error("Required: either 'query' method or both 'get_node_by_id' and 'update_node_properties'")
logger.error(
"Required: either 'query' method or both 'get_node_by_id' and 'update_node_properties'"
)
return
# Update edge frequencies
@ -399,22 +421,21 @@ async def add_frequency_weights(
for edge_key, frequency in edge_frequencies.items():
try:
# Parse edge key: "relationship_type:source_id:target_id"
parts = edge_key.split(':', 2)
parts = edge_key.split(":", 2)
if len(parts) == 3:
relationship_type, source_id, target_id = parts
# Try to update edge if adapter supports it
if hasattr(graph_adapter, 'update_edge_properties'):
if hasattr(graph_adapter, "update_edge_properties"):
edge_properties = {
'frequency_weight': frequency,
'frequency_updated_at': usage_frequencies.get('last_processed_timestamp')
"frequency_weight": frequency,
"frequency_updated_at": usage_frequencies.get(
"last_processed_timestamp"
),
}
await graph_adapter.update_edge_properties(
source_id,
target_id,
relationship_type,
edge_properties
source_id, target_id, relationship_type, edge_properties
)
edges_updated += 1
else:
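The `"relationship_type:source_id:target_id"` edge key is parsed with `maxsplit=2`, so only the first two colons delimit fields:

```python
edge_key = "used_graph_element_to_answer:interaction_1:element_2"

# maxsplit=2 splits at the first two colons only; any further colons
# would remain inside the third field (the target id).
parts = edge_key.split(":", 2)
relationship_type, source_id, target_id = parts
```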
@ -436,15 +457,15 @@ async def add_frequency_weights(
)
# Store aggregate statistics as metadata if supported
if hasattr(graph_adapter, 'set_metadata'):
if hasattr(graph_adapter, "set_metadata"):
try:
metadata = {
'element_type_frequencies': usage_frequencies.get('element_type_frequencies', {}),
'total_interactions': usage_frequencies.get('total_interactions', 0),
'interactions_in_window': usage_frequencies.get('interactions_in_window', 0),
'last_frequency_update': usage_frequencies.get('last_processed_timestamp')
"element_type_frequencies": usage_frequencies.get("element_type_frequencies", {}),
"total_interactions": usage_frequencies.get("total_interactions", 0),
"interactions_in_window": usage_frequencies.get("interactions_in_window", 0),
"last_frequency_update": usage_frequencies.get("last_processed_timestamp"),
}
await graph_adapter.set_metadata('usage_frequency_stats', metadata)
await graph_adapter.set_metadata("usage_frequency_stats", metadata)
logger.info("Stored usage frequency statistics as metadata")
except Exception as e:
logger.warning(f"Could not store usage statistics as metadata: {e}")
@ -454,7 +475,7 @@ async def create_usage_frequency_pipeline(
graph_adapter: GraphDBInterface,
time_window: timedelta = timedelta(days=7),
min_interaction_threshold: int = 1,
batch_size: int = 100
batch_size: int = 100,
) -> tuple:
"""
Create memify pipeline entry for usage frequency tracking.
@ -486,7 +507,7 @@ async def create_usage_frequency_pipeline(
Task(
extract_usage_frequency,
time_window=time_window,
min_interaction_threshold=min_interaction_threshold
min_interaction_threshold=min_interaction_threshold,
)
]
@ -494,7 +515,7 @@ async def create_usage_frequency_pipeline(
Task(
add_frequency_weights,
graph_adapter=graph_adapter,
task_config={"batch_size": batch_size}
task_config={"batch_size": batch_size},
)
]
@ -505,7 +526,7 @@ async def run_usage_frequency_update(
graph_adapter: GraphDBInterface,
subgraphs: List[CogneeGraph],
time_window: timedelta = timedelta(days=7),
min_interaction_threshold: int = 1
min_interaction_threshold: int = 1,
) -> Dict[str, Any]:
"""
Convenience function to run the complete usage frequency update pipeline.
@ -543,13 +564,12 @@ async def run_usage_frequency_update(
usage_frequencies = await extract_usage_frequency(
subgraphs=subgraphs,
time_window=time_window,
min_interaction_threshold=min_interaction_threshold
min_interaction_threshold=min_interaction_threshold,
)
# Add frequency weights back to the graph
await add_frequency_weights(
graph_adapter=graph_adapter,
usage_frequencies=usage_frequencies
graph_adapter=graph_adapter, usage_frequencies=usage_frequencies
)
logger.info("Usage frequency update completed successfully")
@ -566,9 +586,7 @@ async def run_usage_frequency_update(
async def get_most_frequent_elements(
graph_adapter: GraphDBInterface,
top_n: int = 10,
element_type: Optional[str] = None
graph_adapter: GraphDBInterface, top_n: int = 10, element_type: Optional[str] = None
) -> List[Dict[str, Any]]:
"""
Retrieve the most frequently accessed graph elements.


@ -23,8 +23,9 @@ try:
from cognee.tasks.memify.extract_usage_frequency import (
extract_usage_frequency,
add_frequency_weights,
run_usage_frequency_update
run_usage_frequency_update,
)
COGNEE_AVAILABLE = True
except ImportError:
COGNEE_AVAILABLE = False
@ -50,10 +51,10 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
id=f"interaction_{i}",
node_type="CogneeUserInteraction",
attributes={
'type': 'CogneeUserInteraction',
'query_text': f'Test query {i}',
'timestamp': int((current_time - timedelta(hours=i)).timestamp() * 1000)
}
"type": "CogneeUserInteraction",
"query_text": f"Test query {i}",
"timestamp": int((current_time - timedelta(hours=i)).timestamp() * 1000),
},
)
graph.add_node(interaction_node)
@ -62,10 +63,7 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
element_node = Node(
id=f"element_{i}",
node_type="DocumentChunk",
attributes={
'type': 'DocumentChunk',
'text': f'Element content {i}'
}
attributes={"type": "DocumentChunk", "text": f"Element content {i}"},
)
graph.add_node(element_node)
@ -78,7 +76,7 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
node1=graph.get_node(f"interaction_{i}"),
node2=graph.get_node(f"element_{element_idx}"),
edge_type="used_graph_element_to_answer",
attributes={'relationship_type': 'used_graph_element_to_answer'}
attributes={"relationship_type": "used_graph_element_to_answer"},
)
graph.add_edge(edge)
@ -89,15 +87,13 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
graph = self.create_mock_graph(num_interactions=3, num_elements=5)
result = await extract_usage_frequency(
subgraphs=[graph],
time_window=timedelta(days=7),
min_interaction_threshold=1
subgraphs=[graph], time_window=timedelta(days=7), min_interaction_threshold=1
)
self.assertIn('node_frequencies', result)
self.assertIn('total_interactions', result)
self.assertEqual(result['total_interactions'], 3)
self.assertGreater(len(result['node_frequencies']), 0)
self.assertIn("node_frequencies", result)
self.assertIn("total_interactions", result)
self.assertEqual(result["total_interactions"], 3)
self.assertGreater(len(result["node_frequencies"]), 0)
async def test_time_window_filtering(self):
"""Test that time window correctly filters old interactions."""
@ -110,9 +106,9 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
id="recent_interaction",
node_type="CogneeUserInteraction",
attributes={
'type': 'CogneeUserInteraction',
'timestamp': int(current_time.timestamp() * 1000)
}
"type": "CogneeUserInteraction",
"timestamp": int(current_time.timestamp() * 1000),
},
)
graph.add_node(recent_node)
@ -121,38 +117,44 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
id="old_interaction",
node_type="CogneeUserInteraction",
attributes={
'type': 'CogneeUserInteraction',
'timestamp': int((current_time - timedelta(days=10)).timestamp() * 1000)
}
"type": "CogneeUserInteraction",
"timestamp": int((current_time - timedelta(days=10)).timestamp() * 1000),
},
)
graph.add_node(old_node)
# Add element
element = Node(id="element_1", node_type="DocumentChunk", attributes={'type': 'DocumentChunk'})
element = Node(
id="element_1", node_type="DocumentChunk", attributes={"type": "DocumentChunk"}
)
graph.add_node(element)
# Add edges
graph.add_edge(Edge(
node1=recent_node, node2=element,
edge_type="used_graph_element_to_answer",
attributes={'relationship_type': 'used_graph_element_to_answer'}
))
graph.add_edge(Edge(
node1=old_node, node2=element,
edge_type="used_graph_element_to_answer",
attributes={'relationship_type': 'used_graph_element_to_answer'}
))
graph.add_edge(
Edge(
node1=recent_node,
node2=element,
edge_type="used_graph_element_to_answer",
attributes={"relationship_type": "used_graph_element_to_answer"},
)
)
graph.add_edge(
Edge(
node1=old_node,
node2=element,
edge_type="used_graph_element_to_answer",
attributes={"relationship_type": "used_graph_element_to_answer"},
)
)
# Extract with 7-day window
result = await extract_usage_frequency(
subgraphs=[graph],
time_window=timedelta(days=7),
min_interaction_threshold=1
subgraphs=[graph], time_window=timedelta(days=7), min_interaction_threshold=1
)
# Should only count recent interaction
self.assertEqual(result['interactions_in_window'], 1)
self.assertEqual(result['total_interactions'], 2)
self.assertEqual(result["interactions_in_window"], 1)
self.assertEqual(result["total_interactions"], 2)
async def test_threshold_filtering(self):
"""Test that minimum threshold filters low-frequency nodes."""
@ -160,13 +162,11 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
# Extract with threshold of 3
result = await extract_usage_frequency(
subgraphs=[graph],
time_window=timedelta(days=7),
min_interaction_threshold=3
subgraphs=[graph], time_window=timedelta(days=7), min_interaction_threshold=3
)
# Only nodes with 3+ accesses should be included
for node_id, freq in result['node_frequencies'].items():
for node_id, freq in result["node_frequencies"].items():
self.assertGreaterEqual(freq, 3)
async def test_element_type_tracking(self):
@ -178,49 +178,46 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
id="interaction_1",
node_type="CogneeUserInteraction",
attributes={
'type': 'CogneeUserInteraction',
'timestamp': int(datetime.now().timestamp() * 1000)
}
"type": "CogneeUserInteraction",
"timestamp": int(datetime.now().timestamp() * 1000),
},
)
graph.add_node(interaction)
# Create elements of different types
chunk = Node(id="chunk_1", node_type="DocumentChunk", attributes={'type': 'DocumentChunk'})
entity = Node(id="entity_1", node_type="Entity", attributes={'type': 'Entity'})
chunk = Node(id="chunk_1", node_type="DocumentChunk", attributes={"type": "DocumentChunk"})
entity = Node(id="entity_1", node_type="Entity", attributes={"type": "Entity"})
graph.add_node(chunk)
graph.add_node(entity)
# Add edges
for element in [chunk, entity]:
graph.add_edge(Edge(
node1=interaction, node2=element,
edge_type="used_graph_element_to_answer",
attributes={'relationship_type': 'used_graph_element_to_answer'}
))
graph.add_edge(
Edge(
node1=interaction,
node2=element,
edge_type="used_graph_element_to_answer",
attributes={"relationship_type": "used_graph_element_to_answer"},
)
)
result = await extract_usage_frequency(
subgraphs=[graph],
time_window=timedelta(days=7)
)
result = await extract_usage_frequency(subgraphs=[graph], time_window=timedelta(days=7))
# Check element types were tracked
self.assertIn('element_type_frequencies', result)
types = result['element_type_frequencies']
self.assertIn('DocumentChunk', types)
self.assertIn('Entity', types)
self.assertIn("element_type_frequencies", result)
types = result["element_type_frequencies"]
self.assertIn("DocumentChunk", types)
self.assertIn("Entity", types)
async def test_empty_graph(self):
"""Test handling of empty graph."""
graph = CogneeGraph()
result = await extract_usage_frequency(subgraphs=[graph], time_window=timedelta(days=7))
self.assertEqual(result["total_interactions"], 0)
self.assertEqual(len(result["node_frequencies"]), 0)
async def test_no_interactions_in_window(self):
"""Test handling when all interactions are outside time window."""
@ -232,19 +229,16 @@ class TestUsageFrequencyExtraction(unittest.TestCase):
id="old_interaction",
node_type="CogneeUserInteraction",
attributes={
"type": "CogneeUserInteraction",
"timestamp": int(old_time.timestamp() * 1000),
},
)
graph.add_node(old_interaction)
result = await extract_usage_frequency(subgraphs=[graph], time_window=timedelta(days=7))
self.assertEqual(result["interactions_in_window"], 0)
self.assertEqual(result["total_interactions"], 1)
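The tests above store interaction timestamps as epoch milliseconds and compare them against a `timedelta` window. A minimal sketch of that cutoff logic (the helper name here is illustrative, not cognee's API):

```python
from datetime import datetime, timedelta

def in_window(timestamp_ms: int, window: timedelta, now: datetime) -> bool:
    """Return True when an epoch-millisecond timestamp falls inside the window."""
    cutoff_ms = int((now - window).timestamp() * 1000)
    return timestamp_ms >= cutoff_ms

now = datetime.now()
recent_ms = int(now.timestamp() * 1000)
old_ms = int((now - timedelta(days=30)).timestamp() * 1000)

print(in_window(recent_ms, timedelta(days=7), now))  # True
print(in_window(old_ms, timedelta(days=7), now))     # False
```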
class TestIntegration(unittest.TestCase):
@ -266,6 +260,7 @@ class TestIntegration(unittest.TestCase):
# Test Runner
# ============================================================================
def run_async_test(test_func):
"""Helper to run async test functions."""
asyncio.run(test_func())
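The helper above is the usual way to drive coroutine-based checks from synchronous `unittest` code: each call spins up a fresh event loop via `asyncio.run`. A self-contained sketch:

```python
import asyncio

def run_async_test(test_func):
    """Run an async test function to completion on a fresh event loop."""
    asyncio.run(test_func())

calls = []

async def sample_check():
    # Yield control once, then record that the check actually executed.
    await asyncio.sleep(0)
    calls.append("ran")

run_async_test(sample_check)
print(calls)  # → ['ran']
```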


@ -149,7 +149,9 @@ async def e2e_state():
vector_engine = get_vector_engine()
collection = await vector_engine.search(
collection_name="Triplet_text",
query_text="Test",
limit=None,
)
# --- Retriever contexts ---
@ -188,57 +190,70 @@ async def e2e_state():
query_type=SearchType.GRAPH_COMPLETION,
query_text="Where is germany located, next to which country?",
save_interaction=True,
verbose=True,
)
completion_cot = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION_COT,
query_text="What is the country next to germany??",
save_interaction=True,
verbose=True,
)
completion_ext = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION_CONTEXT_EXTENSION,
query_text="What is the name of the country next to germany",
save_interaction=True,
verbose=True,
)
await cognee.search(
query_type=SearchType.FEEDBACK,
query_text="This was not the best answer",
last_k=1,
verbose=True,
)
completion_sum = await cognee.search(
query_type=SearchType.GRAPH_SUMMARY_COMPLETION,
query_text="Next to which country is Germany located?",
save_interaction=True,
verbose=True,
)
completion_triplet = await cognee.search(
query_type=SearchType.TRIPLET_COMPLETION,
query_text="Next to which country is Germany located?",
save_interaction=True,
verbose=True,
)
completion_chunks = await cognee.search(
query_type=SearchType.CHUNKS,
query_text="Germany",
save_interaction=False,
verbose=True,
)
completion_summaries = await cognee.search(
query_type=SearchType.SUMMARIES,
query_text="Germany",
save_interaction=False,
verbose=True,
)
completion_rag = await cognee.search(
query_type=SearchType.RAG_COMPLETION,
query_text="Next to which country is Germany located?",
save_interaction=False,
verbose=True,
)
completion_temporal = await cognee.search(
query_type=SearchType.TEMPORAL,
query_text="Next to which country is Germany located?",
save_interaction=False,
verbose=True,
)
await cognee.search(
query_type=SearchType.FEEDBACK,
query_text="This answer was great",
last_k=1,
verbose=True,
)
# Snapshot after all E2E operations above (used by assertion-only tests).


@ -129,14 +129,32 @@ async def test_search_access_control_returns_dataset_shaped_dicts(monkeypatch, s
monkeypatch.setattr(search_mod, "backend_access_control_enabled", lambda: True)
monkeypatch.setattr(search_mod, "authorized_search", dummy_authorized_search)
out_non_verbose = await search_mod.search(
query_text="q",
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=False,
)
assert out_non_verbose == [
{
"search_result": ["r"],
"dataset_id": ds.id,
"dataset_name": "ds1",
"dataset_tenant_id": "t1",
}
]
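Based on the dict shapes asserted in these tests, the access-control path wraps raw results into dataset-shaped dicts, with `verbose` controlling whether graph details are attached. A hypothetical sketch (field names are taken from the assertions; the function itself is illustrative, not cognee's implementation):

```python
def shape_dataset_result(search_result, dataset, verbose=False):
    """Wrap raw results into a dataset-shaped dict (sketch, assumed shape)."""
    shaped = {
        "search_result": search_result,
        "dataset_id": dataset["id"],
        "dataset_name": dataset["name"],
        "dataset_tenant_id": dataset["tenant_id"],
    }
    if verbose:
        # Verbose output additionally carries a graphs field (None when absent).
        shaped["graphs"] = None
    return shaped

ds = {"id": "ds-uuid", "name": "ds1", "tenant_id": "t1"}
print(shape_dataset_result(["r"], ds))
print(shape_dataset_result(["r"], ds, verbose=True))
```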
out_verbose = await search_mod.search(
query_text="q",
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out_verbose == [
{
"search_result": ["r"],
"dataset_id": ds.id,
@ -166,6 +184,7 @@ async def test_search_access_control_only_context_returns_dataset_shaped_dicts(
dataset_ids=[ds.id],
user=user,
only_context=True,
verbose=True,
)
assert out == [


@ -90,6 +90,7 @@ async def test_search_access_control_edges_context_produces_graphs_and_context_m
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out[0]["dataset_name"] == "ds1"
@ -126,6 +127,7 @@ async def test_search_access_control_insights_context_produces_graphs_and_null_r
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out[0]["graphs"] is not None
@ -150,6 +152,7 @@ async def test_search_access_control_only_context_returns_context_text_map(monke
dataset_ids=[ds.id],
user=user,
only_context=True,
verbose=True,
)
assert out[0]["search_result"] == [{"ds1": "a\nb"}]
@ -172,6 +175,7 @@ async def test_search_access_control_results_edges_become_graph_result(monkeypat
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert isinstance(out[0]["search_result"][0], dict)
@ -195,6 +199,7 @@ async def test_search_use_combined_context_defaults_empty_datasets(monkeypatch,
dataset_ids=None,
user=user,
use_combined_context=True,
verbose=True,
)
assert out.result == "answer"
@ -219,6 +224,7 @@ async def test_search_access_control_context_str_branch(monkeypatch, search_mod)
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out[0]["graphs"] is None
@ -242,6 +248,7 @@ async def test_search_access_control_context_empty_list_branch(monkeypatch, sear
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out[0]["graphs"] is None
@ -265,6 +272,7 @@ async def test_search_access_control_multiple_results_list_branch(monkeypatch, s
query_type=SearchType.CHUNKS,
dataset_ids=[ds.id],
user=user,
verbose=True,
)
assert out[0]["search_result"] == [["r1", "r2"]]
@ -293,4 +301,5 @@ async def test_search_access_control_defaults_empty_datasets(monkeypatch, search
query_type=SearchType.CHUNKS,
dataset_ids=None,
user=user,
verbose=True,
)


@ -20,19 +20,29 @@ echo "HTTP port: $HTTP_PORT"
# smooth redeployments and container restarts while maintaining data integrity.
echo "Running database migrations..."
set +e # Disable exit on error to handle specific migration errors
MIGRATION_OUTPUT=$(alembic upgrade head)
MIGRATION_EXIT_CODE=$?
set -e
if [[ $MIGRATION_EXIT_CODE -ne 0 ]]; then
if [[ "$MIGRATION_OUTPUT" == *"UserAlreadyExists"* ]] || [[ "$MIGRATION_OUTPUT" == *"User default_user@example.com already exists"* ]]; then
echo "Warning: Default user already exists, continuing startup..."
else
echo "Migration failed with unexpected error. Trying to run Cognee without migrations."
echo "Initializing database tables..."
python /app/cognee/modules/engine/operations/setup.py
INIT_EXIT_CODE=$?
if [[ $INIT_EXIT_CODE -ne 0 ]]; then
echo "Database initialization failed!"
exit 1
fi
fi
else
echo "Database migrations done."
fi
echo "Starting server..."
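The entrypoint's capture-then-branch pattern (temporarily disable `set -e`, record the exit code, match known error strings before deciding whether to continue) can be sketched in Python for illustration; the command below is a stand-in that fails with a known message, not the real alembic migration:

```python
import subprocess

# Stand-in command that fails with a known error message on stdout.
proc = subprocess.run(
    ["sh", "-c", "echo 'User default_user@example.com already exists'; exit 1"],
    capture_output=True,
    text=True,
)
if proc.returncode != 0:
    if "already exists" in proc.stdout:
        # Known, tolerable failure: warn and keep starting up.
        print("Warning: default user already exists, continuing startup...")
    else:
        raise SystemExit("Migration failed with unexpected error.")
```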


@ -1,8 +1,9 @@
import asyncio
import cognee
import os
from pprint import pprint
# By default cognee uses OpenAI's gpt-5-mini LLM model
# Provide your OpenAI LLM API KEY
os.environ["LLM_API_KEY"] = ""
@ -24,13 +25,13 @@ async def cognee_demo():
# Query Cognee for information from provided document
answer = await cognee.search("List me all the important characters in Alice in Wonderland.")
pprint(answer)
answer = await cognee.search("How did Alice end up in Wonderland?")
pprint(answer)
answer = await cognee.search("Tell me about Alice's personality.")
pprint(answer)
# Cognee is an async library, it has to be called in an async context
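The `print` to `pprint` switch throughout these examples is purely cosmetic: `pprint` splits nested search results across indented lines once they exceed the line width. A quick sketch with an illustrative result shape:

```python
from pprint import pformat

# Illustrative result shape, not cognee's actual output.
answer = [
    {
        "search_result": ["Alice", "The Queen of Hearts", "The White Rabbit"],
        "dataset_name": "alice_demo",
    }
]
formatted = pformat(answer)
print(formatted)  # nested structure is split across indented lines
```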


@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.api.v1.search import SearchType
@ -187,7 +188,7 @@ async def main(enable_steps):
search_results = await cognee.search(
query_type=SearchType.GRAPH_COMPLETION, query_text="Who has experience in design tools?"
)
pprint(search_results)
if __name__ == "__main__":


@ -39,6 +39,7 @@ load_dotenv()
# STEP 1: Setup and Configuration
# ============================================================================
async def setup_knowledge_base():
"""
Create a fresh knowledge base with sample content.
@ -104,6 +105,7 @@ async def setup_knowledge_base():
# STEP 2: Simulate User Searches with Interaction Tracking
# ============================================================================
async def simulate_user_searches(queries: List[str]):
"""
Simulate users searching the knowledge base.
@ -131,7 +133,7 @@ async def simulate_user_searches(queries: List[str]):
query_type=SearchType.GRAPH_COMPLETION,
query_text=query,
save_interaction=True, # ← THIS IS CRITICAL!
top_k=5
top_k=5,
)
successful_searches += 1
@ -152,9 +154,9 @@ async def simulate_user_searches(queries: List[str]):
# STEP 3: Extract and Apply Usage Frequencies
# ============================================================================
async def extract_and_apply_frequencies(
time_window_days: int = 7, min_threshold: int = 1
) -> Dict[str, Any]:
"""
Extract usage frequencies from interactions and apply them to the graph.
@ -184,8 +186,14 @@ async def extract_and_apply_frequencies(
await graph.project_graph_from_db(
adapter=graph_engine,
node_properties_to_project=[
"type",
"node_type",
"timestamp",
"created_at",
"text",
"name",
"query_text",
"frequency_weight",
],
edge_properties_to_project=["relationship_type", "timestamp"],
directed=True,
@ -195,9 +203,10 @@ async def extract_and_apply_frequencies(
# Count interaction nodes
interaction_nodes = [
n
for n in graph.nodes.values()
if n.attributes.get("type") == "CogneeUserInteraction"
or n.attributes.get("node_type") == "CogneeUserInteraction"
]
print(f"✓ Found {len(interaction_nodes)} interaction nodes")
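The comprehension above checks both `type` and `node_type`, since either attribute can carry the label after projection. With plain dicts standing in for projected graph nodes:

```python
# Plain dicts stand in for graph node attributes (illustrative data).
nodes = {
    "i1": {"type": "CogneeUserInteraction"},
    "i2": {"node_type": "CogneeUserInteraction"},
    "c1": {"type": "DocumentChunk"},
}
interaction_ids = [
    node_id
    for node_id, attrs in nodes.items()
    if attrs.get("type") == "CogneeUserInteraction"
    or attrs.get("node_type") == "CogneeUserInteraction"
]
print(interaction_ids)  # → ['i1', 'i2']
```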
@ -207,11 +216,13 @@ async def extract_and_apply_frequencies(
graph_adapter=graph_engine,
subgraphs=[graph],
time_window=timedelta(days=time_window_days),
min_interaction_threshold=min_threshold,
)
print("\n✓ Frequency extraction complete!")
print(
f" - Interactions processed: {stats['interactions_in_window']}/{stats['total_interactions']}"
)
print(f" - Nodes weighted: {len(stats['node_frequencies'])}")
print(f" - Element types tracked: {stats.get('element_type_frequencies', {})}")
@ -224,6 +235,7 @@ async def extract_and_apply_frequencies(
# STEP 4: Analyze and Display Results
# ============================================================================
async def analyze_results(stats: Dict[str, Any]):
"""
Analyze and display the frequency tracking results.
@ -241,15 +253,11 @@ async def analyze_results(stats: Dict[str, Any]):
print("=" * 80)
# Display top nodes by frequency
if stats["node_frequencies"]:
print("\n📊 Top 10 Most Frequently Accessed Elements:")
print("-" * 80)
sorted_nodes = sorted(stats["node_frequencies"].items(), key=lambda x: x[1], reverse=True)
# Get graph to display node details
graph_engine = await get_graph_engine()
@ -264,8 +272,8 @@ async def analyze_results(stats: Dict[str, Any]):
for i, (node_id, frequency) in enumerate(sorted_nodes[:10], 1):
node = graph.get_node(node_id)
if node:
node_type = node.attributes.get("type", "Unknown")
text = node.attributes.get("text") or node.attributes.get("name") or ""
text_preview = text[:60] + "..." if len(text) > 60 else text
print(f"\n{i}. Frequency: {frequency} accesses")
@ -276,10 +284,10 @@ async def analyze_results(stats: Dict[str, Any]):
print(f" Node ID: {node_id[:50]}...")
# Display element type distribution
if stats.get("element_type_frequencies"):
print("\n\n📈 Element Type Distribution:")
print("-" * 80)
type_dist = stats["element_type_frequencies"]
for elem_type, count in sorted(type_dist.items(), key=lambda x: x[1], reverse=True):
print(f" {elem_type}: {count} accesses")
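Ranking by descending count, as done here for both node frequencies and the type distribution, is a one-liner over dict items:

```python
# Illustrative counts; the real values come from extract_usage_frequency stats.
element_type_frequencies = {"DocumentChunk": 7, "Entity": 12, "TextSummary": 3}
ranked = sorted(element_type_frequencies.items(), key=lambda x: x[1], reverse=True)
print(ranked)  # → [('Entity', 12), ('DocumentChunk', 7), ('TextSummary', 3)]
```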
@ -290,7 +298,7 @@ async def analyze_results(stats: Dict[str, Any]):
graph_engine = await get_graph_engine()
adapter_type = type(graph_engine).__name__
if adapter_type == "Neo4jAdapter":
try:
result = await graph_engine.query("""
MATCH (n)
@ -298,7 +306,7 @@ async def analyze_results(stats: Dict[str, Any]):
RETURN count(n) as weighted_count
""")
count = result[0]["weighted_count"] if result else 0
if count > 0:
print(f"{count} nodes have frequency_weight in Neo4j database")
@ -328,6 +336,7 @@ async def analyze_results(stats: Dict[str, Any]):
# STEP 5: Demonstrate Usage in Retrieval
# ============================================================================
async def demonstrate_retrieval_usage():
"""
Demonstrate how frequency weights can be used in retrieval.
@ -379,6 +388,7 @@ async def demonstrate_retrieval_usage():
# MAIN: Run Complete Example
# ============================================================================
async def main():
"""
Run the complete end-to-end usage frequency tracking example.
@ -398,7 +408,7 @@ async def main():
print(f" LLM Provider: {os.getenv('LLM_PROVIDER')}")
# Verify LLM key is set
if not os.getenv("LLM_API_KEY") or os.getenv("LLM_API_KEY") == "sk-your-key-here":
print("\n⚠ WARNING: LLM_API_KEY not set in .env file")
print(" Set your API key to run searches")
return
@ -430,10 +440,7 @@ async def main():
return
# Step 3: Extract frequencies
stats = await extract_and_apply_frequencies(time_window_days=7, min_threshold=1)
# Step 4: Analyze results
await analyze_results(stats)
@ -451,7 +458,7 @@ async def main():
print("\n")
print("Summary:")
print(" ✓ Documents added: 4")
print(f" ✓ Searches performed: {successful_searches}")
print(f" ✓ Interactions tracked: {stats['interactions_in_window']}")
print(f" ✓ Nodes weighted: {len(stats['node_frequencies'])}")
@ -467,6 +474,7 @@ async def main():
except Exception as e:
print(f"\n✗ Example failed: {e}")
import traceback
traceback.print_exc()


@ -1,6 +1,8 @@
import os
import asyncio
import pathlib
from pprint import pprint
from cognee.shared.logging_utils import setup_logging, ERROR
import cognee
@ -42,7 +44,7 @@ async def main():
# Display search results
for result_text in search_results:
pprint(result_text)
if __name__ == "__main__":


@ -1,5 +1,6 @@
import asyncio
import os
from pprint import pprint
import cognee
from cognee.api.v1.search import SearchType
@ -77,7 +78,7 @@ async def main():
query_type=SearchType.GRAPH_COMPLETION,
query_text="What are the exact cars and their types produced by Audi?",
)
pprint(search_results)
await visualize_graph()


@ -1,6 +1,7 @@
import os
import cognee
import pathlib
from pprint import pprint
from cognee.modules.users.exceptions import PermissionDeniedError
from cognee.modules.users.tenants.methods import select_tenant
@ -86,7 +87,7 @@ async def main():
)
print("\nSearch results as user_1 on dataset owned by user_1:")
for result in search_results:
pprint(result)
# But user_1 cant read the dataset owned by user_2 (QUANTUM dataset)
print("\nSearch result as user_1 on the dataset owned by user_2:")
@ -134,7 +135,7 @@ async def main():
dataset_ids=[quantum_dataset_id],
)
for result in search_results:
pprint(result)
# If we'd like for user_1 to add new documents to the QUANTUM dataset owned by user_2, user_1 would have to get
# "write" access permission, which user_1 currently does not have
@ -217,7 +218,7 @@ async def main():
dataset_ids=[quantum_cognee_lab_dataset_id],
)
for result in search_results:
pprint(result)
# Note: All of these function calls and permission system is available through our backend endpoints as well


@ -1,4 +1,6 @@
import asyncio
from pprint import pprint
import cognee
from cognee.modules.engine.operations.setup import setup
from cognee.modules.users.methods import get_default_user
@ -71,7 +73,7 @@ async def main():
print("Search results:")
# Display results
for result_text in search_results:
pprint(result_text)
if __name__ == "__main__":


@ -1,4 +1,6 @@
import asyncio
from pprint import pprint
import cognee
from cognee.shared.logging_utils import setup_logging, ERROR
from cognee.api.v1.search import SearchType
@ -54,7 +56,7 @@ async def main():
print("Search results:")
# Display results
for result_text in search_results:
pprint(result_text)
if __name__ == "__main__":


@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.shared.logging_utils import setup_logging, INFO
from cognee.api.v1.search import SearchType
@ -87,7 +88,8 @@ async def main():
top_k=15,
)
print(f"Query: {query_text}")
print("Results:")
pprint(search_results)
if __name__ == "__main__":


@ -1,4 +1,5 @@
import asyncio
from pprint import pprint
import cognee
from cognee.memify_pipelines.create_triplet_embeddings import create_triplet_embeddings
@ -65,7 +66,7 @@ async def main():
query_type=SearchType.TRIPLET_COMPLETION,
query_text="What are the models produced by Volkswagen based on the context?",
)
pprint(search_results)
if __name__ == "__main__":


@ -1,7 +1,7 @@
[project]
name = "cognee"
version = "0.5.1"
description = "Cognee - is a library for enriching LLM context with a semantic layer for better understanding and reasoning."
authors = [
{ name = "Vasilije Markovic" },

uv.lock generated — file diff suppressed because it is too large.