Commit graph

929 commits

Author SHA1 Message Date
hajdul88
e3f3d49a3b
Feature/cog 1312 integrating evaluation framework into dreamify (#562)
<!-- .github/pull_request_template.md -->

## Description
This PR contains eval framework changes due to the autooptimizer
integration

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Enhanced answer generation now returns structured answer details.
  - Search functionality accepts configurable prompt inputs.
  - Option to generate a metrics dashboard from evaluations.
- Corpus building tasks now support adjustable chunk settings for
greater flexibility.
- New task retrieval functionality allows for flexible task
configuration.
  - Introduced new methods for creating and managing metrics dashboards.

- **Refactor/Chore**
- Streamlined API signatures and reorganized module interfaces for
better consistency.
  - Updated import paths to reflect new module structure.

- **Tests**
- Updated test scenarios to align with new configurations and parameter
adjustments.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-03 19:55:47 +01:00
alekszievr
6d7a68dbba
Feat: Store descriptive metrics identified by pipeline run id [cog-1260] (#582)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new analytic capability that calculates descriptive graph
metrics for pipeline runs when enabled.
- Updated the execution flow to include an option for activating the
graph metrics step.

- **Chores**
- Removed the previous mechanism for storing descriptive metrics to
streamline the system.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-03-03 19:09:35 +01:00
Igor Ilic
9305f43d8e
Revert "feat: Change Cognee data models to work with Gemini [COG-1352]" (#596)
Reverts topoteretes/cognee#594

DCO Affirmation

I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced AI responses now deliver structured JSON output with clearly
defined sections, improving clarity and consistency.
- Standardized knowledge graph definitions provide a uniform
representation, simplifying integration and interpretation.



<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-03 17:52:51 +01:00
Igor Ilic
195685a44f
feat: Change Cognee data models to work with Gemini [COG-1352] (#594)
<!-- .github/pull_request_template.md -->

## Description
Change data models and Gemini adapter so it can run custom ontologies

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Improved AI response handling now provides more direct and reliable
output.
- Enhanced knowledge graph displays now include additional descriptive
details under advanced configurations.

- **Refactor**
- Streamlined processing logic reduces complexity and improves
consistency.
- Updated data structures now adapt automatically based on your AI
service configuration for a smoother experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-03 16:20:23 +01:00
lxobr
bee04cad86
Feat/cog 1331 modal run eval (#576)
<!-- .github/pull_request_template.md -->

## Description
- Split metrics dashboard into two modules: calculator (statistics) and
generator (visualization)
- Added aggregate metrics as a new phase in evaluation pipeline
- Created modal example to run multiple evaluations in parallel and
collect results into a single combined output
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced metrics reporting with improved visualizations, including
histogram and confidence interval plots.
- Introduced an asynchronous evaluation process that supports parallel
execution and streamlined result aggregation.
- Added new configuration options to control metrics calculation and
aggregated output storage.

- **Refactor**
- Restructured dashboard generation and evaluation workflows into a more
modular, maintainable design.
- Improved error handling and logging for better feedback during
evaluation processes.

- **Bug Fixes**
- Updated test cases to ensure accurate validation of the new dashboard
generation and metrics calculation functionalities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-03 14:22:32 +01:00
Hande
8874ddad2e
feat: cog-1320 Minimal LLM-Based Entity Extraction (#590)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced an expert entity extraction feature that extracts
significant named entities from text and provides structured output with
essential details.
- Rolled out customizable prompt templates for both system instructions
and user input to standardize the extraction process.
- Integrated a robust language model–based extractor with comprehensive
error handling to ensure reliable and consistent results.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-03-03 13:22:29 +01:00
lxobr
ca2cbfab91
feat: add direct llm eval adapter (#591)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
• Created DirectLLMEvalAdapter - a lightweight alternative to DeepEval
for answer evaluation
• Added evaluation prompt files defining scoring criteria and format
• Made adapter selectable via evaluation_engine = "DirectLLM" in config,
supports "correctness" metric only
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new evaluation method that compares model responses
against a reference answer using structured prompt templates. This
approach enables automated scoring (ranging from 0 to 1) along with
brief justifications.
  
- **Enhancements**
- Updated the configuration to clearly distinguish between evaluation
options, providing end-users with a more transparent and reliable
assessment process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-01 19:50:20 +01:00
Vasilije
c496bb485c
feat: Draft ollama test (#566)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Introduced new automated testing workflows for Ollama and Gemini,
triggered by pull requests and manual dispatch.
- The Ollama workflow sets up the service and executes a simple example
test to enhance continuous integration.
- Enhanced dependency update workflow with new triggers for push and
pull request events, and added an optional debug logging parameter.
- Added new capabilities for audio and image transcription within the
Ollama API adapter.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Daniel Molnar <soobrosa@gmail.com>
2025-02-28 20:15:12 +01:00
lxobr
3d4312577e
fix: Use DataPoint instead of ExtendableDataPoint in get_all_subclasses (#588)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Use DataPoint instead of ExtendableDataPoint when calling
get_all_subclasses in the get_triplets function of the
GraphCompletionRetriever
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Updated the internal data handling for retrieving information,
ensuring a more consistent and reliable output for end-users.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-27 19:05:09 +01:00
Daniel Molnar
d27f847753
Transition to new retrievers, update searches (#585)
<!-- .github/pull_request_template.md -->

## Description
Delete legacy search implementations after migrating to new retriever
classes

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced search and retrieval capabilities, providing improved context
resolution for code queries, completions, summaries, and graph
connections.
  
- **Refactor**
- Shifted to a modular, object-oriented approach that consolidates query
logic and streamlines error management for a more robust and scalable
experience.
  
- **Bug Fixes**
- Improved error handling for unsupported search types and retrieval
operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-27 15:25:24 +01:00
alekszievr
4c3c811c1e
test: eval_framework/evaluation unit tests [cog-1234] (#575)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Added a suite of tests to validate evaluation logic under various
scenarios, including handling of valid inputs and error conditions.
- Introduced comprehensive tests verifying the accuracy of evaluation
metrics, ensuring reliable scoring and error management.
- Created a new test suite for the `DeepEvalAdapter`, covering
correctness, unsupported metrics, and error handling.
- Added unit tests for `ExactMatchMetric` and `F1ScoreMetric`,
parameterized for various test cases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-27 13:24:47 +01:00
lxobr
9cc357ac1c
Feat/cog 1365 unify retrievers (#572)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Created the `BaseRetriever` class to unify all the retrievers and
searches.
- Implemented seven specialized retrievers (summaries, chunks,
completions, graph, graph-summary, insights, code) with consistent
get_context/get_completion interfaces.
- Added json context dumping feature in the current completion
implementations to enable context comparisons.
- Built a comparison framework to validate old vs new implementations.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced multiple retrieval classes for enhanced search
capabilities, including `BaseRetriever`, `ChunksRetriever`,
`CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`,
`GraphSummaryCompletionRetriever`, `InsightsRetriever`, and
`SummariesRetriever`.
- Enhanced query completions with optional context saving for improved
data persistence.
- Implemented advanced tools to compare retrieval outcomes across
different implementations.

- **Refactor**
- Streamlined internal module organization and updated references for
increased maintainability and consistency.
- Added comments indicating future maintenance tasks related to code
merging.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-02-27 12:13:21 +01:00
Boris
711ae8e675
feat: codegraph improvements and new CODE search [COG-1351] (#581)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced an automated deployment workflow to build and push
container images.
	- Updated dependency management to include additional database support.
- **Refactor**
- Enhanced asynchronous operations and logging in the server for
improved performance.
	- Optimized extraction and retrieval processes for code-related data.
- **Chores**
- Streamlined build configurations and startup scripts for greater
reliability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
2025-02-26 20:15:02 +01:00
alekszievr
f6ced4122a
Test: test eval dashboard generation [COG-1234] (#570)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Introduced a new test suite for validating the metrics dashboard
generation.
- Added tests for the `bootstrap_ci` function to ensure accurate
calculations and handling of various input scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-26 12:45:34 +01:00
Vasilije
4b777cf214
feat: add validation to llm env variables (#558)
… needed

<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Implemented enhanced configuration validation for environment-based
settings. Now, if any configuration parameter is provided via the
environment, all required settings must be present. This improvement
helps catch misconfigurations early, reducing potential errors and
ensuring a smoother, more reliable user experience. These proactive
measures significantly enhance overall system stability and performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-26 06:44:45 +01:00
lxobr
1cb83312fe
feat: add experimental cognify pipeline [COG-1293] (#541)
<!-- .github/pull_request_template.md -->

## Description
- Integrate experimental tasks into the evaluation framework
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced interactive prompt templates for extracting graph nodes,
edge triplets, and relationship names, resulting in more comprehensive
and accurate knowledge graphs.
- Added asynchronous processes to efficiently handle document data and
integrate graph components.
- Launched cascade graph task options to offer enhanced flexibility in
task management workflows.
- Added new functionality for extracting content nodes and relationship
names from text.

- **Refactor**
- Streamlined configurations for prompt processing and task
initialization, improving overall modularity and system stability.
- Updated task getter mechanisms to utilize function-based approaches
for improved flexibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-25 16:14:27 +01:00
lxobr
55411ff44b
feat: entity completion skeleton [COG-1318] (#552)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Modular implementation of entity completion search
- Added base classes that define entity extractors and context providers
- Created dummy implementations that return test data
- Set up adapters that let us switch between different entity extractors
and context providers using strings
- Added configuration class to control which implementations to use
- Entity completion: query → find entities → get context → interact with
LLM → return answer
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced the query completion experience with integrated language
model response generation, improved validation, and robust error
handling.
- Introduced sample modules for context retrieval and entity extraction
that simulate key processing steps.
- Established foundational abstractions to support flexible context and
entity handling strategies.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-25 16:07:48 +01:00
alekszievr
a788875117
test: answer generation [COG-1234] (#569)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Introduced a new asynchronous test to validate the answer generation
functionality, ensuring that generated responses align with the provided
question-answer pairs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-25 12:21:36 +01:00
Igor Ilic
4f354ba534
fix: reuse PostgreSQL database connections (#574)
<!-- .github/pull_request_template.md -->

## Description
Fix PostgreSQL database connection problems

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved the system’s database connection process to enhance
compatibility across multiple relational databases. The application now
dynamically selects the optimal connection method—reusing established
connections when possible—to ensure improved stability and performance
without affecting the public interface.
- Streamlined the creation of the embedding engine by removing it as a
parameter and generating it internally.
- Removed dependency on the embedding engine in the vector engine
retrieval process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-24 20:35:40 +01:00
alekszievr
a61df966c6
feat: use external chunker [cog-1354] (#551)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a modular content chunking interface that offers flexible
text segmentation with configurable chunk size and overlap.
- Added new chunkers for enhanced text processing, including
`LangchainChunker` and improved `TextChunker`.

- **Refactor**
- Unified the chunk extraction mechanism across various document types
for improved consistency and type safety.
- Updated method signatures to enhance clarity and type safety regarding
chunker usage.
- Enhanced error handling and logging during text segmentation to guide
adjustments when content exceeds limits.

- **Bug Fixes**
- Adjusted expected output in tests to reflect changes in chunking logic
and configurations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-21 14:10:59 +01:00
hajdul88
eba1515127
feat: quick fix dynamic collection handling in search (#567) [COG-1369]
<!-- .github/pull_request_template.md -->

## Description
Fixes search dynamic collection mapping in graph completion search

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Adjusted graph processing to remove extraneous notifications when
expected data elements are absent.
- Updated query processing to ensure a more consistent selection of
related data types.
- Streamlined database error handling by aligning exception management
with standard practices.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-21 13:45:42 +01:00
alekszievr
28f92f661e
Test: Mock file download and open in musique adapter (#571)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
- Enhanced test coverage to improve adapter instantiation and data
loading reliability.
  - Updated mock testing logic to ensure robust content handling.
  - Removed an outdated test focused on data limit validation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 16:11:19 +01:00
alekszievr
97db017708
Test: test corpus builder [cog-1234] (#564)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Enhanced the continuous integration workflows with updated dependency
management and environment configurations for improved test stability.
  
- **Tests**
- Added parameterized unit tests to verify corpus loading and structure,
ensuring more robust handling of test data.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 15:16:58 +01:00
alekszievr
17231de5d0
Test: Parse context pieces separately in MusiqueQAAdapter and adjust tests [cog-1234] (#561)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Updated evaluation checks by removing assertions related to the
relationship between `corpus_list` and `qa_pairs`, now focusing solely
on `qa_pairs` limits.

- **Refactor**
- Improved content processing to append each paragraph individually to
`corpus_list`, enhancing clarity in data structure.
- Simplified type annotations in the `load_corpus` method across
multiple adapters, ensuring consistency in return types.

- **Chores**
- Updated dependency installation commands in GitHub Actions workflows
for Python 3.10, 3.11, and 3.12 to include additional evaluation-related
dependencies.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-02-20 14:23:53 +01:00
lxobr
e25c7c93fe
fix: correctly add nodes to chunks [COG-1370] (#568)
<!-- .github/pull_request_template.md -->

## Description
- Fix expand_with_nodes_and_edges to correctly add nodes to chunks
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Enhanced the internal processing for data associations to ensure more
reliable and consistent handling of connections.
- Streamlined the logic to better manage edge cases, improving overall
stability and error handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 12:52:34 +01:00
Igor Ilic
f2e0f47565
fix: test llm connection with gemini (#557)
<!-- .github/pull_request_template.md -->

## Description
Temporary fix for Gemini LLM until they allow empty dictionaries in
model schema definition

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- AI responses now adjust their format dynamically based on the type of
output, providing a streamlined text display when appropriate.
- Extended processing time improves the handling of longer operations
for a more reliable interaction.

- **Bug Fixes**
- Enhanced error management during connectivity tests ensures a more
robust and stable user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-20 11:41:29 +01:00
Boris
45f7c63322
fix: notebooks errors (#565)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Automatically creates a blank graph when a file isn’t found, ensuring
smoother operations.
- Updated demonstration notebooks with dynamic configurations, including
refined search operations and input prompts.
- Introduced optional support for additional graph functionalities via
an integrated dependency.

- **Refactor**
- Streamlined processing by eliminating duplicate steps and simplifying
graph rendering workflows.

- **Chores**
- Updated environment configurations and upgraded the Python runtime for
improved performance and consistency.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 14:07:11 -08:00
Boris
ada466879e
fix: add default params to run_tasks (#563)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced the task execution process by enabling default values for
certain parameters, allowing users to trigger task processing without
supplying every input explicitly.
  
- **Bug Fixes**
- Adjusted asynchronous handling for the `retrieved_edges_to_string`
function to ensure proper execution flow in various components.

- **Documentation**
- Updated markdown formatting in the Jupyter notebook for improved
readability and structure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-19 20:18:51 +01:00
alekszievr
e56d86b410
feat: Implement optional neo4j metrics and improve tests [cog-1262] (#556)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced graph analytics now offer detailed metrics—including shortest
path lengths, diameter, and clustering coefficients—to provide deeper
insights.
- Added new functions for creating connected test graphs and validating
metrics against predefined ground truth values.
- Introduced a new JSON file containing metrics for connected and
disconnected graph structures.

- **Improvements**
- Updated how graphs are projected to consistently use undirected
representations, ensuring more accurate and reliable metric
calculations.
- Streamlined metric consistency checks across different graph
processing methods for robust, reliable results.
- Simplified testing logic by consolidating metric assertions into a
single function call.

- **Chores**
- Removed unnecessary secret variables from the workflow configuration,
potentially affecting access to certain resources.
	- Updated secret management to include the new `OPENAI_API_KEY`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 16:24:59 +01:00
alekszievr
2a167fa1ab
feat: externalize chunkers [cog-1354] (#547)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced document chunk extraction for improved processing consistency
across multiple formats.

- **Refactor**
- Streamlined the configuration for text chunking by replacing indirect
mappings with a direct instantiation approach across document types.
- Updated method signatures across various document classes to accept
chunker class references instead of string identifiers.

- **Chores**
- Removed legacy configuration utilities related to document chunking to
simplify processing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-19 13:26:11 +01:00
hajdul88
0bcaf5c477
Feature/cog 1358 local ollama model support for cognee (#555)
<!-- .github/pull_request_template.md -->

This PR contains the ollama specific llm adapter together with the
embedding engine.

Tested with the following models:

`LLM_API_KEY="ollama"
llm_model = "llama3.1:8b"
LLM_PROVIDER = "ollama"
llm_endpoint = "http://localhost:11434/v1"
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="avr/sfr-embedding-mistral:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embeddings"
EMBEDDING_DIMENSIONS=4096
HUGGINGFACE_TOKENIZER="Salesforce/SFR-Embedding-Mistral"`

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new embedding option that leverages an external provider
for asynchronous text processing.
- Added enhanced language model integration using a dedicated adapter to
improve interaction quality.

- **Enhancements**
  - Expanded configuration settings to include a new tokenizer option.
- Updated provider selection logic to incorporate the additional
embedding and language model features.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-19 02:54:04 +01:00
alekszievr
4efdb29187
Summarize retrieved edges to compact string [COG-1181] (#522)
<!-- .github/pull_request_template.md -->

## Description
Summarize retrieved edges to compact string with no redundancies.
Example:
**Before summarization:**


CV example:

visual innovations -- employs -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- creativeworks agency
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- rhode island school of design
---
Experienced Graphic Designer with over 8 years in visual design and
branding, specializing in Adobe Creative Suite and enthusiastic about
producing engaging visuals. -- made_from --
CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography

**After summarization:**

David Thompson is a Creative Graphic Designer with over 8 years of
experience in visual design and branding, proficient in Adobe Creative
Suite and passionate about creating compelling visuals. He holds a
B.F.A. in Graphic Design from the Rhode Island School of Design (2012).
His experience includes working as a Senior Graphic Designer at
CreativeWorks Agency (2015 – Present), where he led design projects and
created branding materials that increased client engagement by 30%, and
as a Graphic Designer at Visual Innovations (2012 – 2015), where he
designed marketing collateral and collaborated with the marketing team
to develop cohesive brand strategies. His skills include design software
such as Adobe Photoshop, Illustrator, and InDesign, as well as web
design in HTML and CSS, with specialties in Branding and Identity and
Typography.

1. David Thompson employs his skills in visual design and branding.
2. David Thompson contains experience from CreativeWorks Agency.
3. David Thompson contains experience from Visual Innovations.
4. David Thompson made his qualifications from the Rhode Island School
of Design.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a summarization engine that converts relationship-based
inputs into concise, natural sentences.
- Expanded search capabilities with a new query option that generates
graph summaries, providing insightful and aggregated results from graph
data.
- Enhanced asynchronous processing for improved performance in handling
graph data queries and summarization.
- Added flexibility in specifying string conversion methods for graph
edge retrieval.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-18 17:29:55 +01:00
SJ
d05b49863c
Creation of default user to have is_superuser=True by default (#539)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

God mode turned on by default for the default user creation.
is_superuser=True in create_default_user.py

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- The default user is now created with elevated (superuser) privileges,
which may affect access control and permissions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-15 03:09:40 +01:00
Igor Ilic
ea88beb687
feat: Force .env file settings over real environment variable values [COG-1333] (#537)
<!-- .github/pull_request_template.md -->

## Description
Force .env file settings over real environment variable values

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Refined configuration handling so that settings from your
configuration file reliably take precedence over existing values,
ensuring a smoother and more predictable experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-14 20:40:21 +01:00
Igor Ilic
46e026f77f
Cognee gui [COG-1307] (#530)
<!-- .github/pull_request_template.md -->

## Description
Add a simple GUI to add documents to Cognee and use GRAPH_COMPLETION
search to get answers

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced an interactive file search interface with intuitive
controls. Users can easily upload files, enter search terms, and view
results in a unified display with clear notifications during processing.
  
- **Chores**
- Updated project dependencies to include `pyside6` and `qasync` for
enhanced GUI functionality.
- Refined background query processing to improve the accuracy and
relevance of search outcomes.
- Improved code readability with formatting enhancements in the search
function.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-14 15:51:33 +01:00
SJ
a602094598
feat: Update parameters in search API route to match search function parameters order (#528)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
**Updated handling of SearchType through the chain:**
Router receives JSON with searchType Enum, example: "searchType":
"CHUNKS"
FastAPI converts to SearchType enum via SearchPayloadDTO
search_v2.py expects SearchType enum
search.py takes SearchType enum and extracts value
log_query.py takes string value
Query model stores string in database

**get_search_router.py**

Matched the exact field name from JSON payload searchType instead of
search_type in the SearchPayloadDTO class.
Changed cognee_search() params to use payload.query and
payload.searchType

**search.py**
Changed query_type to SearchType
log_query to accept query_type.value parameter instead of
str(query_type)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Updated the search functionality to improve consistency and
reliability.
- Enhanced validation by switching to stricter search type checks,
ensuring only valid search types are processed.
- Maintained robust error handling for uninterrupted search operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-13 21:04:31 +01:00
Boris Arzentar
20cbcbf52b fix: ruff error 2025-02-13 14:41:11 +01:00
Boris Arzentar
d0d8559453 fix: consolidate api/sdk/mcp search 2025-02-13 13:15:39 +01:00
Boris
f9e6dcf837
fix: simplify code pipeline (#529)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit


- **New Features**
  - Enhanced code search and dependency analysis for improved accuracy.
  - Introduced a new high-performance text embedding option.
  - Added an additional execution entry point for code graph processing.
- New optional parameters for flexible property selection in retrieval
functions.
- Introduced new classes for handling import statements, function
definitions, and class definitions.
  - Updated embedding engine selection based on configuration options.

- **Bug Fixes**
- Improved error handling in search operations and database queries for
a more stable user experience.
  - Enhanced error logging for source code parsing.

- **Refactor**
- Streamlined asynchronous processing and refactored internal dependency
extraction.
- Updated configuration and integration settings to enhance overall
reliability.
  - Restructured functions for simplified dependency handling.

- **Chores**
- Upgraded and reorganized dependency management with optional libraries
for extended functionality.
- Added new secret parameters for embedding configuration in workflow
settings.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-12 23:58:48 +01:00
Vasilije
9ba2e0d6c1
chore: Fix and update visualization (#518)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced enhanced visualization capabilities that let users launch a
dedicated server for visual displays.
  
- **Documentation**
- Updated several interactive notebooks to include execution outputs and
expanded explanatory content for better user guidance.
  
- **Style**
- Refined formatting and layout across notebooks to ensure consistent
presentation and improved readability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-02-11 19:25:01 +01:00
hajdul88
1b630366c9
Adds types property to pydantic Datapoint inherited classes (#523)
<!-- .github/pull_request_template.md -->

## Description
This PR adds types to DataPoint pydantic class + fixes visualization
colors

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added a `type` field to the `DataPoint` model for clearer data
classification.
- Enhanced color mapping in visualizations by assigning a distinct color
to "TextSummary" nodes.

- **Refactor**
- Improved default settings for version control and ordering to ensure
consistent data behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-11 19:23:19 +01:00
Igor Ilic
ebd1d2adbf
fix: Resolve issue with UUID in telemetry (#524)
<!-- .github/pull_request_template.md -->

## Description
Fixes sending of UUID through telemetry

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
- Enhanced telemetry logging by ensuring identifiers are consistently
formatted. This improvement helps prevent type-related issues during
logging and boosts overall reliability without affecting task execution
or user-facing functionality.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-11 18:31:53 +01:00
alekszievr
05ba29af01
Feat: log pipeline status and pass it through pipeline [COG-1214] (#501)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced pipeline execution now provides consolidated status feedback
with improved telemetry for start, completion, and error events.
- Automatic generation of unique dataset identifiers offers clearer task
and pipeline run associations.

- **Refactor**
- Task execution has been streamlined with explicit parameter handling
for more structured pipeline processing.
- Interactive examples and demos now return results directly, making
integration and monitoring more accessible.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-02-11 16:41:40 +01:00
hajdul88
6a0c0e3ef8
feat: Cognee evaluation framework development (#498)
<!-- .github/pull_request_template.md -->

This PR contains the evaluation framework development for cognee

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Expanded evaluation framework now integrates asynchronous corpus
building, question answering, and performance evaluation with adaptive
benchmarks for improved metrics (correctness, exact match, and F1
score).

- **Infrastructure**
- Added database integration for persistent storage of questions,
answers, and metrics.
- Launched an interactive metrics dashboard featuring advanced
visualizations.
- Introduced an automated testing workflow for continuous quality
assurance.

- **Documentation**
  - Updated guidelines for generating concise, clear answers.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-11 16:31:54 +01:00
Boris
8f84713b54
fix: support structured data conversion to data points (#512)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- New Features
- Introduced version tracking and enhanced metadata in core data models
for improved data consistency.
  
- Bug Fixes
- Improved error handling during graph data loading to prevent
disruptions from unexpected identifier formats.
  
- Refactor
- Centralized identifier parsing and streamlined model definitions,
ensuring smoother and more consistent operations across search,
retrieval, and indexing workflows.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-10 17:16:13 +01:00
Igor Ilic
f3e296b171
chore: Ruff format fix (#516)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Style**
- Made minor internal formatting adjustments to improve consistency.
These changes do not affect any visible functionality for end-users.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-10 06:40:16 -05:00
Igor Ilic
591576b424
fix: resolve python 3.10 issue with mocking [COG-1274] (#517)
<!-- .github/pull_request_template.md -->

## Description
Resolve issue with patch failing in python3.10

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
- Refined internal test setups to enhance clarity and streamline
dependency injection.
- Improved organization of test cases to ensure robust verification of
behaviors and error handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-10 12:34:25 +01:00
Boris Arzentar
95ee62ad79 fix: extract index and field name from indexable data points 2025-02-08 15:21:59 +01:00
Boris
f75e35c337
fix: custom model pipeline (#508)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit


- **New Features**
• Graph visualizations now allow exporting to a user-specified file path
for more flexible output management.
• The text embedding process has been enhanced with an additional
tokenizer option for improved performance.
• A new `ExtendableDataPoint` class has been introduced for future
extensions.
• New JSON files for companies and individuals have been added to
facilitate testing and data processing.

- **Improvements**
• Search functionality now uses updated identifiers for more reliable
content retrieval.
• Metadata handling has been streamlined across various classes by
removing unnecessary type specifications.
• Enhanced serialization of properties in the Neo4j adapter for improved
handling of complex structures.
• The setup process for databases has been improved with a new
asynchronous setup function.

- **Chores**
• Dependency and configuration updates improve overall stability and
performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-08 02:00:15 +01:00
Igor Ilic
6be6b3d222
fix: resolve key error for visualization, handle ValueError for jedi … (#509)
…and move duplicate edge information to debug log

<!-- .github/pull_request_template.md -->

## Description
Fix visualization bug 
Handle ValueError for CodeGraph
Move debug information from print to debug logs

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Enhanced error handling across several modules to ensure smoother
operation when unexpected conditions occur.
- Updated diagnostic and logging mechanisms to provide more robust
system feedback and reduce potential disruptions.
- Improved robustness in the deletion of properties to prevent runtime
errors related to missing keys.
- Added additional exception handling for better analysis of code
entities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-08 00:46:46 +01:00