Commit graph

2090 commits

Author SHA1 Message Date
hajdul88
bcd326518d
feat: implements graph visualization method for cognee (#493)

## Description
This PR improves the visualization endpoint.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Launched an enhanced interactive network visualization utility that
    renders dynamic, browser-based graphs. The feature generates an HTML
    file directly, with interactive elements and an on-screen
    confirmation.
2025-02-06 11:22:17 +01:00
Igor Ilic
d56fd8d925
chore: Change Code graph gh action to use OpenAI API (#499)

## Summary by CodeRabbit

- **Chores**
- Updated the CI configuration for integration tests to use revised
secret values, ensuring improved alignment with current external API
credential requirements and deprecating legacy references.
- Made several secrets optional in the workflow, enhancing flexibility
during execution.
- Removed several outdated secrets from multiple workflows, streamlining
the configuration.
- Improved error handling in the code processing logic by adding
exception management for `AttributeError` and `AssertionError`.
2025-02-05 22:20:07 +01:00
Boris Arzentar
6303501637 fix: use cognee from pypi in mcp server 2025-02-05 18:51:07 +01:00
Boris Arzentar
79a1b86161 version: v0.1.24 2025-02-05 17:53:27 +01:00
Igor Ilic
df163b0431
Add pydantic settings checker (#497)

## Description
- Test the embedding and LLM model connections when cognee starts.
- Fix an issue with asynchronous use of the relational database.
- Refactor the cache handling for all databases so that config changes
  are reflected in the get functions.


## Summary by CodeRabbit

- **New Features**
- Introduced connection testing for language and embedding services at
startup, ensuring improved reliability during data addition.
  
- **Refactor**
- Streamlined engine initialization across multiple database systems to
enhance performance and clarity.
- Improved parameter handling and caching strategies for faster, more
consistent operations.
  - Updated record identifiers for more robust and unique data storage.


---------

Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-04 23:18:27 +01:00
Hande
690d028928
fix: mcp CODE search error (#495)


## Summary by CodeRabbit

- **New Features**
- Enhanced search functionality now returns JSON formatted results for
code-based queries for clearer response presentation.
  
- **Refactor**
- Updated the search logic to differentiate result handling based on
query type, maintaining string output for non-code queries.

2025-02-04 19:32:04 +01:00
Igor Ilic
1260fc7db0
fix: Add reraising of general exception handling in cognee [COG-1062] (#490)

## Description
Add re-raising of errors in general exception handling 
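
The log-then-re-raise pattern this description refers to can be sketched as follows. This is a minimal illustration, not the code from the PR: `run_step` and its arguments are hypothetical names chosen for the example.

```python
import logging

logger = logging.getLogger(__name__)


def run_step(step, *args):
    """Run a callable, log any failure, then re-raise it to the caller.

    `run_step` is a hypothetical helper used only for illustration.
    """
    try:
        return step(*args)
    except Exception:
        # logger.exception records the message together with the traceback.
        logger.exception("Step %s failed", getattr(step, "__name__", step))
        # A bare `raise` re-raises the original exception unchanged,
        # so the caller still sees the failure instead of a silent None.
        raise
```

Without the final `raise`, the error would be logged but swallowed, and callers could not tell that the step failed.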

## Summary by CodeRabbit

- **Bug Fixes & Stability Improvements**
- Enhanced error handling throughout the system, ensuring issues during
operations like server startup, data processing, and graph management
are properly logged and reported.

- **Refactor**
- Standardized logging practices replace basic output statements,
improving traceability and providing better insights for
troubleshooting.

- **New Features**
- Updated search functionality now returns only unique results,
enhancing data consistency and the overall user experience.

---------

Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-04 10:51:05 +01:00
Vasilije
4d3acc358a
fix: mcp improvements (#472)

## Summary by CodeRabbit

- **Dependency Update**
  - Downgraded `mcp` package version from 1.2.0 to 1.1.3
  - Updated `cognee` dependency to include additional features with
    `cognee[codegraph]`

- **New Features**
  - Introduced a new tool, "codify", for transforming codebases into
    knowledge graphs
  - Enhanced the existing "search" tool to accept a new parameter for
    search type

- **Improvements**
  - Streamlined search functionality with a new modular approach
  - Added new asynchronous function for retrieving and formatting code
    parts

- **Documentation**
  - Updated import paths for `SearchType` in various modules and tests to
    reflect structural changes

- **Code Cleanup**
  - Removed legacy search module and associated classes/functions
  - Refined data transfer object classes for consistency and clarity

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-02-04 08:47:31 +01:00
alekszievr
2858a674f5
feat: Calculate graph metrics for networkx graph [COG-1082] (#484)

## Summary by CodeRabbit

- **New Features**
- Enabled an option to retrieve more detailed metrics, providing
comprehensive analytics for graph and descriptive data.

- **Refactor**
- Standardized the way metrics are obtained across components for
consistent behavior and improved data accuracy.
  
- **Chore**
- Made internal enhancements to support optional detailed metric
calculations, streamlining system performance and ensuring future
scalability.

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-02-03 18:05:53 +01:00
alekszievr
5119992fd8
feat: Add graph metrics getter in graph db interface and adapters [COG-1082] (#483)
Dummy implementation of graph metrics to demonstrate what the interface
will look like

## Summary by CodeRabbit

- **New Features**
- Introduced asynchronous functionality for retrieving comprehensive
graph metrics, including counts and connectivity details, across
different systems.
  
- **Refactor**
- Streamlined metrics processing and storage by shifting to direct
retrieval from the graph engine.
- Updated naming conventions for the `GraphMetrics` database table and
reorganized module imports to enhance internal consistency.
  
- **Chores**
- Removed dataset deletion functionalities while introducing the ability
to store descriptive metrics.

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-02-03 15:25:04 +01:00
Boris Arzentar
44a4f8fd0d version: v0.1.23 2025-02-01 15:23:43 +01:00
hajdul88
2fd6bfa44c
feat: implement unit tests and extensive checks around the get_graph_from_model [COG-754] (#491)

## Summary by CodeRabbit

## Release Notes

- **Tests**
  - Added comprehensive unit tests for graph model generation
  - Introduced new test scenarios covering various data structures and
    edge cases
  - Implemented tests for document, chunk, and entity relationships

- **Chores**
- Updated continuous deployment workflow to trigger only on `dev` branch

The release focuses on improving test coverage and refining the
deployment process.
2025-01-31 18:17:23 +01:00
Igor Ilic
8879f3fbbe
feat: Add gemini support [COG-1023] (#485)

## Description
PR to test Gemini PR from holchan

1. Add Gemini LLM and Gemini Embedding support 
2. Fix CodeGraph issue with chunks being bigger than maximum token value
3. Add Tokenizer adapters to CodeGraph

## Summary by CodeRabbit

- **New Features**
    - Added support for the Gemini LLM provider.
    - Expanded LLM configuration options.
    - Introduced a new GitHub Actions workflow for multimetric QA
      evaluation.
    - Added new environment variables for LLM and embedding configurations
      across various workflows.

- **Bug Fixes**
    - Improved error handling in various components.
    - Updated tokenization and embedding processes.
    - Removed warning related to missing `dict` method in data items.

- **Refactor**
    - Simplified token extraction and decoding methods.
    - Updated tokenizer interfaces.
    - Removed deprecated dependencies.
    - Enhanced retry logic and error handling in embedding processes.

- **Documentation**
    - Updated configuration comments and settings.

- **Chores**
    - Updated GitHub Actions workflows to accommodate new secrets and
      environment variables.
    - Modified evaluation parameters.
    - Adjusted dependency management for optional libraries.

---------

Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-01-31 18:03:23 +01:00
hajdul88
f843c256e4
feat: Use unwind for batch edge save and add unit tests for get_graph_from_model
* feat: adds some unit tests for get_graph_from_model

* feat: updates neo4j add_edges cypher and deletes shallow get_graph_from_model

* fix: fixing merge conflict false resolve

* chore: deletes old only_root unit test
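
For readers unfamiliar with the technique named in the title, a batch edge save with Cypher `UNWIND` might look roughly like the sketch below. It is not the query from this commit: the `CONNECTED` relationship type, the `id` and `name` properties, and the `build_edge_batch` helper are all assumptions made for illustration.

```python
# Hypothetical query: the property names and the CONNECTED relationship type
# are placeholders, not the project's actual schema.
BATCH_EDGE_QUERY = """
UNWIND $edges AS edge
MATCH (source {id: edge.source_id})
MATCH (target {id: edge.target_id})
MERGE (source)-[r:CONNECTED {name: edge.name}]->(target)
SET r += edge.properties
"""


def build_edge_batch(edges):
    """Turn (source_id, target_id, name, properties) tuples into UNWIND rows."""
    return [
        {
            "source_id": source_id,
            "target_id": target_id,
            "name": name,
            # Copy the properties so later mutation does not affect the batch.
            "properties": dict(properties or {}),
        }
        for source_id, target_id, name, properties in edges
    ]
```

The point of `UNWIND` is that the whole list travels as one `$edges` parameter, so a single query handles every edge instead of one round trip per edge. With the official Neo4j Python driver, the rows would typically be passed as `session.run(BATCH_EDGE_QUERY, edges=rows)`.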
2025-01-31 13:14:04 +01:00
alekszievr
a79f7133fd
Feat: add number of tokens and descriptive graph metrics to metric table [COG-1132] (#481)
* Count the number of tokens in documents

* save token count to relational db

* Add metrics to metric table

* Store list as json instead of array in relational db table

* Sum in sql instead of python

* Unify naming

* Return data_points in descriptive metric calculation task

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-01-30 12:39:14 +01:00
alekszievr
edae2771a5
Count the number of tokens in documents [COG-1071] (#476)
* Count the number of tokens in documents

* save token count to relational db

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-01-29 11:29:09 +01:00
Igor Ilic
d900060e2b
Merge pull request #468 from topoteretes/COG-970-refactor-tokenizing
Cog 970 refactor tokenizing
2025-01-29 09:02:23 +01:00
Igor Ilic
860218632f refactor: add suggestions from PR
Add suggestions made by CodeRabbit on pull request
2025-01-28 17:15:25 +01:00
Igor Ilic
a8644e0bd7 feat: Use litellm max token size as default for model, if model exists in litellm 2025-01-28 17:00:47 +01:00
Igor Ilic
710ca78d6e
Merge branch 'dev' into COG-970-refactor-tokenizing 2025-01-28 16:31:11 +01:00
alekszievr
98f0f60980
Feat: [cog-1089] Define pydantic models for descriptive graph metrics and input metrics (#466)
* feat: make tasks a configurable argument in the cognify function

* fix: add data points task

* Define pydantic models for descriptive graph metrics and input metrics

* remove to_json method

* Use just one MetricData class instead of two

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-01-28 16:11:31 +01:00
Igor Ilic
6f8cbdbf1c
Merge branch 'dev' into COG-970-refactor-tokenizing 2025-01-28 15:44:57 +01:00
Igor Ilic
3e29c3d8f2 docs: Update notebook to work with changes to max chunk tokens 2025-01-28 15:38:38 +01:00
Igor Ilic
4e56cd64a1 refactor: Add max chunk tokens to code graph pipeline 2025-01-28 15:33:34 +01:00
Igor Ilic
dc0450d30e test: Update document tests regarding max chunk tokens 2025-01-28 15:21:43 +01:00
Igor Ilic
e0b7be7cf0 Merge branch 'COG-970-refactor-tokenizing' of github.com:topoteretes/cognee into COG-970-refactor-tokenizing 2025-01-28 14:48:40 +01:00
Igor Ilic
41544369af test: Change test_by_paragraph tests to accommodate the change 2025-01-28 14:47:17 +01:00
Vasilije
8a50da8ff5
Merge pull request #475 from topoteretes/feat/COG-1060-code-pipeline-endpoints
feat: add codegraph related API endpoints
2025-01-28 14:46:52 +01:00
Igor Ilic
b6e21eadda
Merge branch 'dev' into COG-970-refactor-tokenizing 2025-01-28 14:33:14 +01:00
Igor Ilic
3db7f85c9c feat: Add max_chunk_tokens value to chunkers
Add formula and forwarding of max_chunk_tokens value through Cognee
2025-01-28 14:32:00 +01:00
alekszievr
5e076689ad
Feat: [COG-1074] fix multimetric eval bug (#463)
* feat: make tasks a configurable argument in the cognify function

* fix: add data points task

* Ugly hack for multi-metric eval bug

* some cleanup

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-01-28 13:05:22 +01:00
Igor Ilic
49f60971bb Merge branch 'dev' into COG-970-refactor-tokenizing 2025-01-28 10:12:55 +01:00
Boris Arzentar
f811ab44e0 Merge remote-tracking branch 'origin/dev' into feat/COG-1060-code-pipeline-endpoints 2025-01-28 10:10:38 +01:00
Igor Ilic
0a9f1349f2 refactor: Change variable and function names based on PR comments
2025-01-28 10:10:29 +01:00
Boris Arzentar
3320bc8f2c feat: add codegraph related API endpoints 2025-01-28 10:08:59 +01:00
Igor Ilic
d8bde5461a
Merge pull request #459 from topoteretes/pgvector-add-normalization
feat: Add normalization to PGVector search
2025-01-27 17:15:10 +01:00
Vasilije
f2c1875d5a
Update README.md 2025-01-27 14:49:35 +01:00
Vasilije
34262a17dd
Update README.md 2025-01-27 14:49:02 +01:00
Hande
0fb19ca21d
docs: update readme with "How Cognee Solves Real-World Pain Points" 2025-01-27 11:36:11 +01:00
Boris
8da81c1de3
Merge branch 'dev' into pgvector-add-normalization 2025-01-27 11:31:24 +01:00
Hande
bd4980c2e1
Merge pull request #464 from topoteretes/cog-1069-update-notebooks-evals
Cog 1069 update notebooks evals
2025-01-27 08:49:24 +01:00
vasilije
3e2ac3b331 fix modal 2025-01-26 17:40:15 +01:00
Boris
0c2c5870df
fix: use low_level server for cognee mcp server (#470)
* fix: revert to older mcp version

* fix: use low_level server for the mcp

* fix: styling errors

* fix: mcp cognify arguments

* fix: ruff errors
2025-01-26 12:52:48 +01:00
alekszievr
f4b45761ce
Merge branch 'dev' into cog-1069-update-notebooks-evals 2025-01-25 17:32:01 +01:00
Vasilije
a2998b70dd
Update README.md 2025-01-25 12:12:05 +01:00
Vasilije
459e93e8ee
Update README.md 2025-01-25 12:11:04 +01:00
Vasilije
05894e7af0
Update README.md 2025-01-25 12:09:53 +01:00
Igor Ilic
39df73b811
Merge branch 'dev' into cog-1069-update-notebooks-evals 2025-01-24 19:31:44 +01:00
Igor Ilic
89d4b7a5c4
Merge branch 'dev' into pgvector-add-normalization 2025-01-24 19:24:39 +01:00
Igor Ilic
23ecf245ed fix: Return string conversion to resolve traceback 2025-01-24 19:20:55 +01:00