<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Documentation**
- Improved grammatical correctness and enhanced clarity in the project
documentation.
- Refined descriptions for capabilities, user instructions, and
community engagement.
- Introduced a note about a starter repository to help users get up and
running quickly.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Graph visualizations now allow exporting to a user-specified file path
for more flexible output management.
- The text embedding process has been enhanced with an additional
tokenizer option for improved performance.
- A new `ExtendableDataPoint` class has been introduced for future
extensions.
- New JSON files for companies and individuals have been added to
facilitate testing and data processing.
- **Improvements**
- Search functionality now uses updated identifiers for more reliable
content retrieval.
- Metadata handling has been streamlined across various classes by
removing unnecessary type specifications.
- Enhanced serialization of properties in the Neo4j adapter for improved
handling of complex structures.
- The setup process for databases has been improved with a new
asynchronous setup function.
- **Chores**
- Dependency and configuration updates improve overall stability and
performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…and move duplicate edge information to debug log
<!-- .github/pull_request_template.md -->
## Description
Fix visualization bug
Handle ValueError for CodeGraph
Move debug information from print to debug logs
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Bug Fixes**
- Enhanced error handling across several modules to ensure smoother
operation when unexpected conditions occur.
- Updated diagnostic and logging mechanisms to provide more robust
system feedback and reduce potential disruptions.
- Improved robustness in the deletion of properties to prevent runtime
errors related to missing keys.
- Added additional exception handling for better analysis of code
entities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
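As a rough illustration of two of the fixes summarized above (tolerating missing keys when deleting properties, and sending diagnostics to the debug log instead of `print`), here is a minimal sketch using only the standard library. The helper name `strip_properties` and the sample keys are hypothetical and are not taken from the cognee codebase.

```python
import logging

logger = logging.getLogger(__name__)


def strip_properties(properties: dict, keys_to_drop: list) -> dict:
    """Return a copy of `properties` without the given keys (hypothetical helper)."""
    cleaned = dict(properties)
    for key in keys_to_drop:
        if key in cleaned:
            del cleaned[key]
        else:
            # Report at debug level instead of printing or raising KeyError.
            logger.debug("Property %r not present; nothing to delete", key)
    return cleaned
```

Calling `strip_properties({"id": 1, "updated_at": 2}, ["updated_at", "missing"])` returns `{"id": 1}` without raising, and the missing key is only visible when debug logging is enabled.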
…nnected graph
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Bug Fixes**
- Enhanced reliability of graph metric calculations to gracefully handle
unexpected inputs, ensuring smoother and uninterrupted graph analysis
for end-users.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
- Updated the default processing flow by removing a descriptive metrics
task.
- **New Features**
- Introduced asynchronous graph management capabilities including
checks, projection, and deletion.
- Enhanced graph metrics extraction with additional analytics.
- **Chores**
- Improved timestamp handling using database-driven defaults.
- **Tests**
- Added tests to verify graph metrics consistency and accuracy.
- Integrated a new CI workflow for automated testing of graph metrics.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Documentation**
- Updated README formatting for improved consistency.
- Adjusted image display size from 50% to 80% to enhance visual
presentation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
Refactor search so that the query type no longer needs to be provided,
making it simpler for new users
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
- Improved the search interface by standardizing parameter usage with
explicit keyword arguments for specifying search types, enhancing
clarity and consistency.
- **Tests**
- Updated test cases and example integrations to align with the revised
search parameters, ensuring consistent behavior and reliable validation
of search outcomes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enhanced graph management capabilities allow users to verify graph
existence, project complete graphs, and remove graphs, delivering more
comprehensive graph insights.
- **Refactor**
- Adjusted default task behavior for streamlined performance.
- Updated timestamp handling to ensure accurate and consistent record
tracking.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
<!-- .github/pull_request_template.md -->
## Description
The simplest cognee Docker setup (SQLite, NetworkX, LanceDB) should not
enable the Postgres configuration in `.env.template` by default. I think
leaving the Postgres details commented out is better than removing them
entirely, since this keeps the option visible to newcomers.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Updated configuration template to remove database parameters not used
in the default setup, with clearer guidance to ensure the intended
configuration is maintained.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
This PR improves the visualization endpoint
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Launched an enhanced interactive network visualization utility that
renders dynamic, browser-based graphs. The new feature simplifies
execution by directly generating an HTML file that shows the
visualization, complete with interactive elements and an on-screen
confirmation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Updated the CI configuration for integration tests to use revised
secret values, ensuring improved alignment with current external API
credential requirements and deprecating legacy references.
- Made several secrets optional in the workflow, enhancing flexibility
during execution.
- Removed several outdated secrets from multiple workflows, streamlining
the configuration.
- Improved error handling in the code processing logic by adding
exception management for `AttributeError` and `AssertionError`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
Add a test of the embedding and LLM models at the start of cognee use
Fix an issue with async use of the relational database
Refactor cache handling for all databases so that config changes are
reflected in the get functions
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Introduced connection testing for language and embedding services at
startup, ensuring improved reliability during data addition.
- **Refactor**
- Streamlined engine initialization across multiple database systems to
enhance performance and clarity.
- Improved parameter handling and caching strategies for faster, more
consistent operations.
- Updated record identifiers for more robust and unique data storage.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
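One common way to make cached `get_*` functions pick up config changes is to pass the relevant config values as arguments, so they become part of the cache key. The sketch below illustrates that idea only; it is an assumption about the general approach, not the code from this PR, and `create_engine_cached` and `get_engine` are hypothetical names.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def create_engine_cached(provider: str, url: str) -> dict:
    # Stand-in for an expensive engine object (hypothetical).
    return {"provider": provider, "url": url}


def get_engine(config: dict) -> dict:
    # The config values are the cache key, so a changed config yields a
    # new engine instead of a stale cached one.
    return create_engine_cached(config["provider"], config["url"])
```

With this shape, two calls with the same config return the same cached object, while changing `url` produces a fresh engine.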
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enhanced search functionality now returns JSON formatted results for
code-based queries for clearer response presentation.
- **Refactor**
- Updated the search logic to differentiate result handling based on
query type, maintaining string output for non-code queries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
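The behavior described above (JSON output for code queries, plain string output otherwise) could look roughly like the following. `QueryKind` and `format_results` are hypothetical names used for illustration; the actual search types and function names in cognee may differ.

```python
import json
from enum import Enum


class QueryKind(Enum):  # hypothetical stand-in for the real search type enum
    CODE = "code"
    TEXT = "text"


def format_results(kind: QueryKind, results: list) -> str:
    if kind is QueryKind.CODE:
        # Code queries: structured JSON that clients can parse.
        return json.dumps(results, indent=2)
    # All other queries keep the plain string representation.
    return str(results)
```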
<!-- .github/pull_request_template.md -->
## Description
Add re-raising of errors in general exception handling
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Bug Fixes & Stability Improvements**
- Enhanced error handling throughout the system, ensuring issues during
operations like server startup, data processing, and graph management
are properly logged and reported.
- **Refactor**
- Standardized logging practices replace basic output statements,
improving traceability and providing better insights for
troubleshooting.
- **New Features**
- Updated search functionality now returns only unique results,
enhancing data consistency and the overall user experience.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
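A minimal sketch of the two ideas in this change, assuming nothing about the real module layout: log an exception and re-raise it rather than swallowing it, and deduplicate search results while preserving their order. `run_step` and `unique_results` are hypothetical helpers, and the deduplication assumes hashable results.

```python
import logging

logger = logging.getLogger(__name__)


def run_step(step):
    """Run a callable, logging any failure before re-raising it."""
    try:
        return step()
    except Exception:
        # logger.exception records the traceback; the bare `raise`
        # re-raises the same exception so callers still see it.
        logger.exception("Step failed")
        raise


def unique_results(results: list) -> list:
    # dict.fromkeys keeps the first occurrence of each item, in order.
    return list(dict.fromkeys(results))
```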
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Dependency Update**
- Downgraded `mcp` package version from 1.2.0 to 1.1.3
- Updated `cognee` dependency to include additional features with
`cognee[codegraph]`
- **New Features**
- Introduced a new tool, "codify", for transforming codebases into
knowledge graphs
- Enhanced the existing "search" tool to accept a new parameter for
search type
- **Improvements**
- Streamlined search functionality with a new modular approach
- Added new asynchronous function for retrieving and formatting code
parts
- **Documentation**
- Updated import paths for `SearchType` in various modules and tests to
reflect structural changes
- **Code Cleanup**
- Removed legacy search module and associated classes/functions
- Refined data transfer object classes for consistency and clarity
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enabled an option to retrieve more detailed metrics, providing
comprehensive analytics for graph and descriptive data.
- **Refactor**
- Standardized the way metrics are obtained across components for
consistent behavior and improved data accuracy.
- **Chore**
- Made internal enhancements to support optional detailed metric
calculations, streamlining system performance and ensuring future
scalability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Dummy implementation of graph metrics to demonstrate what the interface
will look like
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Introduced asynchronous functionality for retrieving comprehensive
graph metrics, including counts and connectivity details, across
different systems.
- **Refactor**
- Streamlined metrics processing and storage by shifting to direct
retrieval from the graph engine.
- Updated naming conventions for the `GraphMetrics` database table and
reorganized module imports to enhance internal consistency.
- **Chores**
- Removed dataset deletion functionalities while introducing the ability
to store descriptive metrics.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
<!-- .github/pull_request_template.md -->
## Description
<!-- Provide a clear description of the changes in this PR -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
- **Tests**
- Added comprehensive unit tests for graph model generation
- Introduced new test scenarios covering various data structures and
edge cases
- Implemented tests for document, chunk, and entity relationships
- **Chores**
- Updated continuous deployment workflow to trigger only on `dev` branch
The release focuses on improving test coverage and refining the
deployment process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- .github/pull_request_template.md -->
## Description
PR to test Gemini PR from holchan
1. Add Gemini LLM and Gemini Embedding support
2. Fix CodeGraph issue with chunks being bigger than maximum token value
3. Add Tokenizer adapters to CodeGraph
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Added support for the Gemini LLM provider.
- Expanded LLM configuration options.
- Introduced a new GitHub Actions workflow for multimetric QA
evaluation.
- Added new environment variables for LLM and embedding configurations
across various workflows.
- **Bug Fixes**
- Improved error handling in various components.
- Updated tokenization and embedding processes.
- Removed warning related to missing `dict` method in data items.
- **Refactor**
- Simplified token extraction and decoding methods.
- Updated tokenizer interfaces.
- Removed deprecated dependencies.
- Enhanced retry logic and error handling in embedding processes.
- **Documentation**
- Updated configuration comments and settings.
- **Chores**
- Updated GitHub Actions workflows to accommodate new secrets and
environment variables.
- Modified evaluation parameters.
- Adjusted dependency management for optional libraries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
* feat: adds some unit tests for get_graph_from_model
* feat: updates neo4j add_edges cypher and deletes shallow get_graph_from_model
* fix: fixing merge conflict false resolve
* chore: deletes old only_root unit test
* Count the number of tokens in documents
* save token count to relational db
* Add metrics to metric table
* Store list as json instead of array in relational db table
* Sum in sql instead of python
* Unify naming
* Return data_points in descriptive metric calculation task
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
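To make two of the commits above concrete ("Sum in sql instead of python" and "Store list as json instead of array"), here is a small self-contained sketch using `sqlite3`. The table and column names (`data`, `token_count`, `graph_metrics`, `component_sizes`) are hypothetical and chosen only for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, token_count INTEGER)")
conn.executemany(
    "INSERT INTO data (token_count) VALUES (?)", [(120,), (80,), (300,)]
)

# Aggregate in the database rather than fetching rows and summing in Python.
(total_tokens,) = conn.execute(
    "SELECT COALESCE(SUM(token_count), 0) FROM data"
).fetchone()

# Store a list as JSON text, which works on backends without array columns.
conn.execute(
    "CREATE TABLE graph_metrics (id INTEGER PRIMARY KEY, component_sizes TEXT)"
)
conn.execute(
    "INSERT INTO graph_metrics (component_sizes) VALUES (?)",
    (json.dumps([5, 3, 1]),),
)
(raw,) = conn.execute("SELECT component_sizes FROM graph_metrics").fetchone()
component_sizes = json.loads(raw)
```

The `COALESCE` guards against `SUM` returning `NULL` when the table is empty.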
* Count the number of tokens in documents
* save token count to relational db
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* Define pydantic models for descriptive graph metrics and input metrics
* remove to_json method
* Use just one MetricData class instead of two
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* Ugly hack for multi-metric eval bug
* some cleanup
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>