Commit graph

68 commits

Author SHA1 Message Date
Vasilije
bb7eaa017b
feat: Group DataPoints into NodeSets (#680)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-04-19 20:21:04 +02:00
Vasilije
8374e402a8
fix: Clean up pokemons (#746)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-19 10:51:51 +02:00
Daniel Molnar
9ba12b25ef
feat: add delete by document (#668)
<!-- .github/pull_request_template.md -->

## Description
Delete by document.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-04-17 15:42:10 +02:00
lxobr
d1eab97102
feature: tighten run_tasks_base (#730)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Extracted run_tasks_base function into a new file run_tasks_base.py.
- Extracted four executors that execute core logic based on the task
type.
- Extracted a task handler/wrapper that safely executes the core logic
with logging and telemetry.
- Fixed the inconsistency with the batches of size 1.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-16 09:19:03 +02:00
lxobr
e12242b9d0
fix: get default tasks (#700)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Fixed get_no_summary_tasks and get_just_chunks_tasks to work with the
new tasks and pipelines
- Chore: fixed the pokemon example to work with the new tasks and
pipelines

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-07 08:46:02 +02:00
Igor Ilic
b618e97f98
chore: Remove outdated nodejs example, add specific versioning for mcp (#698)
<!-- .github/pull_request_template.md -->

## Description
Remove outdated nodejs example, add specific versioning for mcp

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-03 10:53:44 +02:00
Igor Ilic
6898e8f766
Fix codify mcp (#696)
<!-- .github/pull_request_template.md -->

## Description
- Redirect all Cognee output to stderr for MCP ( as stdout is used to
communicate between MCP Client and server )
- Add test for CODE search type
- Resolve missing optional GUI dependency

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-02 06:38:17 +02:00
Boris
ebf1f81b35
fix: code cleanup [COG-781] (#667)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-26 18:32:43 +01:00
Igor Ilic
9f587a01a4
feat: Relational db to graph db [COG-1468] (#644)
<!-- .github/pull_request_template.md -->

## Description
Add ability to migrate relational database to graph database

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-26 11:40:06 +01:00
hajdul88
897a1f3081
Feat: Adds ontology scientific paper demo (#662)
<!-- .github/pull_request_template.md -->

## Description
Adds ontology demo 2

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-03-25 17:21:29 +01:00
Daniel Molnar
73db1a5a53
fix: human readable logs (#658)
<!-- .github/pull_request_template.md -->

## Description
Introducing scructlog.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-25 11:54:40 +01:00
Igor Ilic
9b9fe48843
chore: Temporarily remove embedding env vars for code graph action (#647)
<!-- .github/pull_request_template.md -->

## Description
Temporarily remove embedding env variables for code graph action so the
action can run

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Removed legacy secret configuration from the testing workflow to
streamline the CI process and enhance maintainability.
- **Improvements**
  - Updated the argument name in the code graph pipeline for clarity.
- Enhanced the handling of results in the example script to support
asynchronous processing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-17 14:58:03 +01:00
hajdul88
6fcfb3c398
feat: productionizing ontology solution [COG-1401] (#623)
<!-- .github/pull_request_template.md -->

## Description
This PR contains the ontology feature integrated into cognify

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced ontology management with the introduction of the
`OntologyResolver` class for improved data handling and querying.
- Expanded ontology framework now provides enriched coverage of
technology and automotive domains, including new entities and
relationships.
- Updated entity models now include a validation flag to support
improved data integrity.
- Added support for specifying an ontology file path in relevant
functions to enhance flexibility.

- **Refactor**
- Streamlined integration of ontology processing across data extraction
and workflow routines.

- **Chores**
- Updated project dependencies to include `owlready2` for advanced
ontology functionality.
  
- **Tests**
- Introduced a new test suite for the `OntologyResolver` class to
validate its functionality under various conditions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 14:31:19 +01:00
Boris Arzentar
2e4aab9a9a fix: example ruff errors 2025-03-11 16:44:00 +01:00
hibajamal
56427f287e
Demo for relational db with cognee (#620)
<!-- .github/pull_request_template.md -->

## Description
This demo uses pydantic models and dlt to pull data from the Pokémon API
and structure it into a relational format. By feeding this structured
data into cognee, it makes searching across multiple tables easier and
more intuitive, thanks to the relational model.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive Pokémon data processing pipeline, available
as both a Python script and an interactive Jupyter Notebook.
- Enabled asynchronous operations for efficient data collection and
querying, including an integrated search functionality.
- Improved error handling and data validation during the data fetching
and processing stages for a smoother user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-08 20:33:42 +01:00
Boris
5345626e6a
fix: add proper node labels (#607)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Improved backend data organization with automatic categorization of
stored items for enhanced search and retrieval.
- Launched a product recommendation system that analyzes customer data
and preferences to suggest top products.
- Introduced a sample dataset showcasing customer profiles, preferences,
and product interactions for demonstration purposes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 13:30:13 +01:00
lxobr
f033f733b5
feat: entity brute force triplet search [COG-1325] (#589)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Refactored `brute_force_triplet_search`, extracting memory projection.
- Built **TripletSearchContextProvider** (extends
**BaseContextProvider**) to create a single memory projection and
perform a triplet search for each entity.
- Refactored `entity_completion` into **EntityCompletionRetriever**
(extends **BaseRetriever**).
- Added **SummarizedTripletSearchContextProvider** (extends
**TripletSearchContextProvider**) for an alternative summarized output
format.
- Developed and tested an example showcasing both context providers,
comparing raw triplets, summaries, and standard search results.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced text summarization now delivers clearer, more concise
overviews of search results.
- Improved search performance with optimized context retrieval and
memory reuse for faster, more reliable results.
- Introduced advanced entity-based completion for generating more
relevant, context-aware responses.

- **Refactor**
- Streamlined internal workflows and error handling to ensure a smoother
overall experience.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-03-05 11:17:58 +01:00
alekszievr
6d7a68dbba
Feat: Store descriptive metrics identified by pipeline run id [cog-1260] (#582)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new analytic capability that calculates descriptive graph
metrics for pipeline runs when enabled.
- Updated the execution flow to include an option for activating the
graph metrics step.

- **Chores**
- Removed the previous mechanism for storing descriptive metrics to
streamline the system.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-03-03 19:09:35 +01:00
Daniel Molnar
d27f847753
Transition to new retrievers, update searches (#585)
<!-- .github/pull_request_template.md -->

## Description
Delete legacy search implementations after migrating to new retriever
classes

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced search and retrieval capabilities, providing improved context
resolution for code queries, completions, summaries, and graph
connections.
  
- **Refactor**
- Shifted to a modular, object-oriented approach that consolidates query
logic and streamlines error management for a more robust and scalable
experience.
  
- **Bug Fixes**
- Improved error handling for unsupported search types and retrieval
operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-27 15:25:24 +01:00
lxobr
9cc357ac1c
Feat/cog 1365 unify retrievers (#572)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Created the `BaseRetriever` class to unify all the retrievers and
searches.
- Implemented seven specialized retrievers (summaries, chunks,
completions, graph, graph-summary, insights, code) with consistent
get_context/get_completion interfaces.
- Added json context dumping feature in the current completion
implementations to enable context comparisons.
- Built a comparison framework to validate old vs new implementations.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced multiple retrieval classes for enhanced search
capabilities, including `BaseRetriever`, `ChunksRetriever`,
`CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`,
`GraphSummaryCompletionRetriever`, `InsightsRetriever`, and
`SummariesRetriever`.
- Enhanced query completions with optional context saving for improved
data persistence.
- Implemented advanced tools to compare retrieval outcomes across
different implementations.

- **Refactor**
- Streamlined internal module organization and updated references for
increased maintainability and consistency.
- Added comments indicating future maintenance tasks related to code
merging.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-02-27 12:13:21 +01:00
Boris
ada466879e
fix: add default params to run_tasks (#563)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced the task execution process by enabling default values for
certain parameters, allowing users to trigger task processing without
supplying every input explicitly.
  
- **Bug Fixes**
- Adjusted asynchronous handling for the `retrieved_edges_to_string`
function to ensure proper execution flow in various components.

- **Documentation**
- Updated markdown formatting in the Jupyter notebook for improved
readability and structure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-19 20:18:51 +01:00
alekszievr
05ba29af01
Feat: log pipeline status and pass it through pipeline [COG-1214] (#501)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced pipeline execution now provides consolidated status feedback
with improved telemetry for start, completion, and error events.
- Automatic generation of unique dataset identifiers offers clearer task
and pipeline run associations.

- **Refactor**
- Task execution has been streamlined with explicit parameter handling
for more structured pipeline processing.
- Interactive examples and demos now return results directly, making
integration and monitoring more accessible.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-02-11 16:41:40 +01:00
Igor Ilic
3850e9c7a1
Cognee simple document example (#521)
<!-- .github/pull_request_template.md -->

## Description
Notebook and python example for cognee simple example

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced an interactive demo showcasing asynchronous document
processing and querying for key insights from a sample text.
- **Documentation**
- Added an in-depth, step-by-step guide in a Jupyter Notebook that walks
users through setup, configuration, querying, and visualizing processed
data.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-11 13:58:35 +01:00
Igor Ilic
5fe7ff9883
refactor: Refactor search so graph completion is used by default (#505)
<!-- .github/pull_request_template.md -->

## Description
Refactor search so query type doesn't need to be provided to make it
simpler for new users

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved the search interface by standardizing parameter usage with
explicit keyword arguments for specifying search types, enhancing
clarity and consistency.
- **Tests**
- Updated test cases and example integrations to align with the revised
search parameters, ensuring consistent behavior and reliable validation
of search outcomes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-07 17:16:34 +01:00
hajdul88
08c22a542a fix: fixes typo in multimedia example 2025-01-17 09:31:48 +01:00
hajdul88
981f35c1e0 fix: fixes windows compatibility in examples 2025-01-17 09:28:10 +01:00
hajdul88
bd6aafe9b7 fix: fixes event loop handling on windows in dynamic steps example 2025-01-16 18:17:11 +01:00
hajdul88
1db44de7de feat: adds graphiti demo notebook 2025-01-15 11:45:06 +01:00
hajdul88
d0646a1694 feat: Implements generation and retrieval and adjusts imports 2025-01-14 13:59:27 +01:00
hajdul88
c351047c36 feat: adds cognee node and edge embeddings for graphiti graph 2025-01-13 17:22:59 +01:00
hajdul88
ea8628c527 Fix: Fixes logging setup 2025-01-13 09:49:56 +01:00
Rita Aleksziev
a11b914f39 Merge branch 'dev' into COG-949 2025-01-10 10:02:56 +01:00
hajdul88
341f30fcdc fix: Fixes ruff formatting 2025-01-09 12:00:49 +01:00
hajdul88
fe57eb69e7
Merge branch 'dev' into feature/cog-967-adding-graph-completion-feature-to-cognee 2025-01-09 11:07:19 +01:00
Rita Aleksziev
5635da6e38 Adjust unit tests 2025-01-09 10:53:03 +01:00
hajdul88
d39140f28b feat: implements the first version of graph based completion in search 2025-01-08 16:10:29 +01:00
vasilije
41b1486cff Fix visualization 2025-01-08 13:13:52 +01:00
Rita Aleksziev
f4397bf940 Remove setting envvars from arg 2025-01-08 12:33:14 +01:00
Rita Aleksziev
8ffef5034a Add clean logging to code graph example 2025-01-08 12:25:31 +01:00
hajdul88
18c8bc3c33
Merge branch 'dev' into COG-adding_html_graph_render 2025-01-08 10:44:11 +01:00
alekszievr
0dec704445
Merge branch 'dev' into COG-949 2025-01-08 10:21:07 +01:00
hajdul88
58da2d9e57 fix: Fixes faulty logging format and sets up error logging in dynamic steps example 2025-01-07 11:01:37 +01:00
lxobr
5e79dc53c5 feat: time code graph run and add mock support 2025-01-06 11:25:04 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
lxobr
262deee26e
Cog 813 source code chunks (#383)
* fix: pass the list of all CodeFiles to enrichment task

* feat: introduce SourceCodeChunk, update metadata

* feat: get_source_code_chunks code graph pipeline task

* feat: integrate get_source_code_chunks task, comment out summarize_code

* Fix code summarization (#387)

* feat: update data models

* feat: naive parse long strings in source code

* fix: get_non_py_files instead of get_non_code_files

* fix: limit recursion, add comment

* handle embedding empty input error (#398)

* feat: robustly handle CodeFile source code

* refactor: sort imports

* todo: add support for other embedding models

* feat: add custom logger

* feat: add robustness to get_source_code_chunks

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat: improve embedding exceptions

* refactor: format indents, rename module

---------

Co-authored-by: alekszievr <44192193+alekszievr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2024-12-26 13:53:38 +01:00
alekszievr
de2394c392
Ingest non-code files (#395)
* Ingest non-code files

* Fixing review findings
2024-12-20 14:06:40 +01:00
lxobr
da5e3ab24d
COG 870 Remove duplicate edges from the code graph (#293)
* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 12:02:25 +01:00
hajdul88
68c3f42ab8
Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases 2024-12-05 09:08:37 +01:00
Rita Aleksziev
dd94781033 Integrate graphiti's functionality as Tasks 2024-12-04 16:33:26 +01:00
hajdul88
46ee513f6c chore: deletes comment from dynamic_steps_example 2024-12-04 14:59:01 +01:00