cognee

Author	SHA1	Message	Date
alekszievr	433264d4e4	feat: Add context evaluation to eval framework [COG-1366] (#586) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a class-based retrieval mechanism to enhance answer generation with improved context extraction and completion. - Added a new evaluation metric for contextual relevancy and an option to enable context evaluation during the evaluation process. - Refactor - Transitioned from a function-based answer resolver to a more modular retriever approach to improve extensibility. - Tests - Updated tests to align with the new answer generation and evaluation process. --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Daniel Molnar <soobrosa@gmail.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-03-05 16:40:24 +01:00
lxobr	f033f733b5	feat: entity brute force triplet search [COG-1325] (#589) ## Description - Refactored `brute_force_triplet_search`, extracting memory projection. - Built TripletSearchContextProvider (extends BaseContextProvider) to create a single memory projection and perform a triplet search for each entity. - Refactored `entity_completion` into EntityCompletionRetriever (extends BaseRetriever). - Added SummarizedTripletSearchContextProvider (extends TripletSearchContextProvider) for an alternative summarized output format. - Developed and tested an example showcasing both context providers, comparing raw triplets, summaries, and standard search results. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced text summarization now delivers clearer, more concise overviews of search results. - Improved search performance with optimized context retrieval and memory reuse for faster, more reliable results. - Introduced advanced entity-based completion for generating more relevant, context-aware responses. - Refactor - Streamlined internal workflows and error handling to ensure a smoother overall experience. --------- Co-authored-by: Boris <boris@topoteretes.com>	2025-03-05 11:17:58 +01:00
Daniel Molnar	7bac2303cc	chore: Be explicit on extras to install in Docker (#598) ## Description Be explicit on extras to install in Docker. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a configurable option to install only selected dependency extras, allowing for a more tailored build experience. - Chores - Improved clarity in the build instructions regarding environment configuration.	2025-03-04 17:18:57 +01:00
hajdul88	3e93dbe264	fix: add currying to question_answering_non_parallel (#602) …l to avoid additional params Introduces lambda currying in question answering non parallel function to avoid unnecessary params ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Streamlined the question-answering process for cleaner, more efficient query handling. - Updated the handling of parameters in the answer generation process, allowing for a more dynamic integration of context. - Simplified test setups by reducing the number of parameters involved in the mock answer resolver.	2025-03-04 16:09:53 +01:00
Igor Ilic	cade574bbf	Change data models for gemini (#600) ## Description Change Gemini adapter and data models so Gemini can use custom data models ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced provider-specific enhancements with updated data representations, including improved node labeling and enriched summary and description fields for graph displays. - Improved configuration management by automatically loading environment settings for better LLM operations. - Refactor - Streamlined response handling with a simplified approach for defining output formats. - Updated error handling by removing the try-except block for dotenv imports.	2025-03-04 14:09:28 +01:00
hajdul88	5eef212668	Allowing parallel edges in graph projection when using graph completion search (#599) ## Description Allows parallel edges in graph projection when using graph completion search ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Streamlined the process for updating connections within the application’s graph. The update now ensures that every connection is consistently recorded and propagated without performing duplicate checks.	2025-03-04 12:37:26 +01:00
hajdul88	e3f3d49a3b	Feature/cog 1312 integrating evaluation framework into dreamify (#562) ## Description This PR contains eval framework changes due to the autooptimizer integration ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced answer generation now returns structured answer details. - Search functionality accepts configurable prompt inputs. - Option to generate a metrics dashboard from evaluations. - Corpus building tasks now support adjustable chunk settings for greater flexibility. - New task retrieval functionality allows for flexible task configuration. - Introduced new methods for creating and managing metrics dashboards. - Refactor/Chore - Streamlined API signatures and reorganized module interfaces for better consistency. - Updated import paths to reflect new module structure. - Tests - Updated test scenarios to align with new configurations and parameter adjustments.	2025-03-03 19:55:47 +01:00
Boris Arzentar	933c7c86c2	version: v0.1.32	2025-03-03 19:17:55 +01:00
alekszievr	6d7a68dbba	Feat: Store descriptive metrics identified by pipeline run id [cog-1260] (#582) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a new analytic capability that calculates descriptive graph metrics for pipeline runs when enabled. - Updated the execution flow to include an option for activating the graph metrics step. - Chores - Removed the previous mechanism for storing descriptive metrics to streamline the system. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-03-03 19:09:35 +01:00
Boris	10e4bfb6ab	fix: cognee mcp docker [COG-1470] (#595) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Chores - Enhanced deployment and build processes to improve system reliability and simplify dependency management. - New Features - Added a new dependency (`uv>=0.6.3`) to support enhanced functionality. - Updated extra dependencies for `codegraph` to include the `transformers` library. - Improved logging on server startup for clearer operational feedback. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-03-03 19:04:41 +01:00
Igor Ilic	9305f43d8e	Revert "feat: Change Cognee data models to work with Gemini [COG-1352]" (#596) Reverts topoteretes/cognee#594 DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced AI responses now deliver structured JSON output with clearly defined sections, improving clarity and consistency. - Standardized knowledge graph definitions provide a uniform representation, simplifying integration and interpretation.	2025-03-03 17:52:51 +01:00
Igor Ilic	195685a44f	feat: Change Cognee data models to work with Gemini [COG-1352] (#594) ## Description Change data models and Gemini adapter so it can run custom ontologies ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Improved AI response handling now provides more direct and reliable output. - Enhanced knowledge graph displays now include additional descriptive details under advanced configurations. - Refactor - Streamlined processing logic reduces complexity and improves consistency. - Updated data structures now adapt automatically based on your AI service configuration for a smoother experience.	2025-03-03 16:20:23 +01:00
lxobr	bee04cad86	Feat/cog 1331 modal run eval (#576) ## Description - Split metrics dashboard into two modules: calculator (statistics) and generator (visualization) - Added aggregate metrics as a new phase in evaluation pipeline - Created modal example to run multiple evaluations in parallel and collect results into a single combined output ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced metrics reporting with improved visualizations, including histogram and confidence interval plots. - Introduced an asynchronous evaluation process that supports parallel execution and streamlined result aggregation. - Added new configuration options to control metrics calculation and aggregated output storage. - Refactor - Restructured dashboard generation and evaluation workflows into a more modular, maintainable design. - Improved error handling and logging for better feedback during evaluation processes. - Bug Fixes - Updated test cases to ensure accurate validation of the new dashboard generation and metrics calculation functionalities.	2025-03-03 14:22:32 +01:00
Hande	8874ddad2e	feat: cog-1320 Minimal LLM-Based Entity Extraction (#590) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced an expert entity extraction feature that extracts significant named entities from text and provides structured output with essential details. - Rolled out customizable prompt templates for both system instructions and user input to standardize the extraction process. - Integrated a robust language model–based extractor with comprehensive error handling to ensure reliable and consistent results. --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-03-03 13:22:29 +01:00
Igor Ilic	2323fd0c94	feat: Add gemini ollama support for cognee-mcp [COG-1408] (#583) ## Description Add gemini ollama support for cognee mcp ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Expanded the system’s capabilities by updating its underlying integrations, providing enhanced functionality and performance improvements for end-users.	2025-03-01 19:51:48 +01:00
lxobr	ca2cbfab91	feat: add direct llm eval adapter (#591) ## Description • Created DirectLLMEvalAdapter - a lightweight alternative to DeepEval for answer evaluation • Added evaluation prompt files defining scoring criteria and format • Made adapter selectable via evaluation_engine = "DirectLLM" in config, supports "correctness" metric only ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a new evaluation method that compares model responses against a reference answer using structured prompt templates. This approach enables automated scoring (ranging from 0 to 1) along with brief justifications. - Enhancements - Updated the configuration to clearly distinguish between evaluation options, providing end-users with a more transparent and reliable assessment process.	2025-03-01 19:50:20 +01:00
Vasilije	c496bb485c	feat: Draft ollama test (#566) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Introduced new automated testing workflows for Ollama and Gemini, triggered by pull requests and manual dispatch. - The Ollama workflow sets up the service and executes a simple example test to enhance continuous integration. - Enhanced dependency update workflow with new triggers for push and pull request events, and added an optional debug logging parameter. - Added new capabilities for audio and image transcription within the Ollama API adapter. --------- Co-authored-by: Daniel Molnar <soobrosa@gmail.com>	2025-02-28 20:15:12 +01:00
lxobr	3d4312577e	fix: Use DataPoint instead of ExtendableDataPoint in get_all_subclasses (#588) ## Description - Use DataPoint instead of ExtendableDataPoint when calling get_all_subclasses in the get_triplets function of the GraphCompletionRetriever ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Updated the internal data handling for retrieving information, ensuring a more consistent and reliable output for end-users.	2025-02-27 19:05:09 +01:00
Boris Arzentar	653f5e40dd	version: v0.1.31	2025-02-27 18:19:34 +01:00
Boris	e8ab5b4797	fix: tiktoken upgrade (#587) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Chores - Removed an outdated internal tracking reference to streamline maintenance. - Upgraded a key dependency to its latest stable release, delivering enhanced performance and reliability for a smoother user experience.	2025-02-27 18:16:11 +01:00
Daniel Molnar	d27f847753	Transition to new retrievers, update searches (#585) ## Description Delete legacy search implementations after migrating to new retriever classes ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced search and retrieval capabilities, providing improved context resolution for code queries, completions, summaries, and graph connections. - Refactor - Shifted to a modular, object-oriented approach that consolidates query logic and streamlines error management for a more robust and scalable experience. - Bug Fixes - Improved error handling for unsupported search types and retrieval operations.	2025-02-27 15:25:24 +01:00
Igor Ilic	f9b6630024	chore: Add ollama optional dependency (#584) ## Description Add ollama optional dependency ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Chores - Updated the project’s dependency configuration to include an additional optional package for enhanced transformation functionality.	2025-02-27 15:09:58 +01:00
lxobr	4b7c21d7d8	feat: retrieve golden contexts [COG-1364] (#579) ## Description • Added load_golden_context parameter to BaseBenchmarkAdapter's abstract load_corpus method, establishing a common interface for retrieving supporting evidence • Refactored HotpotQAAdapter with a modular design: introduced _get_metadata_field_name method to handle dataset-specific fields (making it extensible for child classes) and implemented golden context retrieval. • Refactored TwoWikiMultihopAdapter to inherit from HotpotQAAdapter, overriding only the necessary methods while reusing the parent's functionality • Added golden context support to MusiqueQAAdapter with its decomposition-based format ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced an option to include additional context during corpus loading, enhancing the quality and flexibility of generated QA pairs. - Refactor - Streamlined and modularized the processing workflow across different adapters for improved consistency and maintainability. - Updated metadata extraction to refine the display of contextual information. - Shifted focus in the `TwoWikiMultihopAdapter` from corpus loading to context extraction.	2025-02-27 13:25:47 +01:00
alekszievr	4c3c811c1e	test: eval_framework/evaluation unit tests [cog-1234] (#575) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Added a suite of tests to validate evaluation logic under various scenarios, including handling of valid inputs and error conditions. - Introduced comprehensive tests verifying the accuracy of evaluation metrics, ensuring reliable scoring and error management. - Created a new test suite for the `DeepEvalAdapter`, covering correctness, unsupported metrics, and error handling. - Added unit tests for `ExactMatchMetric` and `F1ScoreMetric`, parameterized for various test cases.	2025-02-27 13:24:47 +01:00
Igor Ilic	c9aee6fbf4	test: Add testing of cognee telemetry (#573) ## Description Add testing of cognee telemetry ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Introduced an automated testing process for telemetry components, running unit tests across multiple environments to ensure consistent performance. The workflow efficiently manages test execution and error reporting, speeding up development cycles. - Chores - Enhanced dependency management and cleanup procedures, significantly contributing to overall system stability, faster feedback cycles, and improved release quality.	2025-02-27 13:23:16 +01:00
lxobr	9cc357ac1c	Feat/cog 1365 unify retrievers (#572) ## Description - Created the `BaseRetriever` class to unify all the retrievers and searches. - Implemented seven specialized retrievers (summaries, chunks, completions, graph, graph-summary, insights, code) with consistent get_context/get_completion interfaces. - Added json context dumping feature in the current completion implementations to enable context comparisons. - Built a comparison framework to validate old vs new implementations. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced multiple retrieval classes for enhanced search capabilities, including `BaseRetriever`, `ChunksRetriever`, `CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`, `GraphSummaryCompletionRetriever`, `InsightsRetriever`, and `SummariesRetriever`. - Enhanced query completions with optional context saving for improved data persistence. - Implemented advanced tools to compare retrieval outcomes across different implementations. - Refactor - Streamlined internal module organization and updated references for increased maintainability and consistency. - Added comments indicating future maintenance tasks related to code merging. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-02-27 12:13:21 +01:00
Boris Arzentar	86b34657aa	version: v0.1.30	2025-02-26 21:48:59 +01:00
Boris Arzentar	c2c70a7d22	fix: remove postgres and neo4j from mcp setup	2025-02-26 20:30:16 +01:00
Boris Arzentar	8932a5868c	fix: add missing system dependencies	2025-02-26 20:25:26 +01:00
Boris Arzentar	915384a944	fix: change context of docker build	2025-02-26 20:22:07 +01:00
Boris	711ae8e675	feat: codegraph improvements and new CODE search [COG-1351] (#581) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced an automated deployment workflow to build and push container images. - Updated dependency management to include additional database support. - Refactor - Enhanced asynchronous operations and logging in the server for improved performance. - Optimized extraction and retrieval processes for code-related data. - Chores - Streamlined build configurations and startup scripts for greater reliability. --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com> Co-authored-by: Igor Ilic <igorilic03@gmail.com>	2025-02-26 20:15:02 +01:00
alekszievr	f6ced4122a	Test: test eval dashboard generation [COG-1234] (#570) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Introduced a new test suite for validating the metrics dashboard generation. - Added tests for the `bootstrap_ci` function to ensure accurate calculations and handling of various input scenarios.	2025-02-26 12:45:34 +01:00
Vasilije	4b777cf214	feat: add validation to llm env variables (#558) … needed ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Implemented enhanced configuration validation for environment-based settings. Now, if any configuration parameter is provided via the environment, all required settings must be present. This improvement helps catch misconfigurations early, reducing potential errors and ensuring a smoother, more reliable user experience. These proactive measures significantly enhance overall system stability and performance. --------- Co-authored-by: Boris <boris@topoteretes.com>	2025-02-26 06:44:45 +01:00
lxobr	1cb83312fe	feat: add experimental cognify pipeline [COG-1293] (#541) ## Description - Integrate experimental tasks into the evaluation framework ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced interactive prompt templates for extracting graph nodes, edge triplets, and relationship names, resulting in more comprehensive and accurate knowledge graphs. - Added asynchronous processes to efficiently handle document data and integrate graph components. - Launched cascade graph task options to offer enhanced flexibility in task management workflows. - Added new functionality for extracting content nodes and relationship names from text. - Refactor - Streamlined configurations for prompt processing and task initialization, improving overall modularity and system stability. - Updated task getter mechanisms to utilize function-based approaches for improved flexibility. --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-02-25 16:14:27 +01:00
lxobr	55411ff44b	feat: entity completion skeleton [COG-1318] (#552) ## Description - Modular implementation of entity completion search - Added base classes that define entity extractors and context providers - Created dummy implementations that return test data - Set up adapters that let us switch between different entity extractors and context providers using strings - Added configuration class to control which implementations to use - Entity completion: query → find entities → get context → interact with LLM → return answer ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced the query completion experience with integrated language model response generation, improved validation, and robust error handling. - Introduced sample modules for context retrieval and entity extraction that simulate key processing steps. - Established foundational abstractions to support flexible context and entity handling strategies. --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-02-25 16:07:48 +01:00
alekszievr	a788875117	test: answer generation [COG-1234] (#569) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Introduced a new asynchronous test to validate the answer generation functionality, ensuring that generated responses align with the provided question-answer pairs.	2025-02-25 12:21:36 +01:00
Vasilije	452eaf0735	Update README.md Update .env handling	2025-02-24 22:56:18 +01:00
Boris	9a1e03e403	fix: simplify installation in readme (#577 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Documentation - Enhanced overall clarity and layout of the guide. - Updated text alignment and visual elements, including an updated logo. - Revised header hierarchy for a more intuitive reading experience. - Added detailed installation instructions with specific database support. - Reorganized contributing guidelines and the code of conduct for improved structure. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-24 20:36:22 +01:00
Igor Ilic	4f354ba534	fix: reuse PostgreSQL database connections (#574 ) <!-- .github/pull_request_template.md --> ## Description Fix PostgreSQL database connection problems ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Refactor - Improved the system’s database connection process to enhance compatibility across multiple relational databases. The application now dynamically selects the optimal connection method—reusing established connections when possible—to ensure improved stability and performance without affecting the public interface. - Streamlined the creation of the embedding engine by removing it as a parameter and generating it internally. - Removed dependency on the embedding engine in the vector engine retrieval process. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-24 20:35:40 +01:00
Vasilije	6e567445b5	Update README.md	2025-02-21 18:51:24 +01:00
alekszievr	a61df966c6	feat: use external chunker [cog-1354] (#551 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Introduced a modular content chunking interface that offers flexible text segmentation with configurable chunk size and overlap. - Added new chunkers for enhanced text processing, including `LangchainChunker` and improved `TextChunker`. - Refactor - Unified the chunk extraction mechanism across various document types for improved consistency and type safety. - Updated method signatures to enhance clarity and type safety regarding chunker usage. - Enhanced error handling and logging during text segmentation to guide adjustments when content exceeds limits. - Bug Fixes - Adjusted expected output in tests to reflect changes in chunking logic and configurations. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-21 14:10:59 +01:00
hajdul88	eba1515127	feat: quick fix dynamic collection handling in search (#567 ) [COG-1369] <!-- .github/pull_request_template.md --> ## Description Fixes search dynamic collection mapping in graph completion search ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Refactor - Adjusted graph processing to remove extraneous notifications when expected data elements are absent. - Updated query processing to ensure a more consistent selection of related data types. - Streamlined database error handling by aligning exception management with standard practices. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-21 13:45:42 +01:00
SJ	fd3b15fb58	fix: entrypoint.sh to not fail on first docker up, improved handling of migrations, signals and errors. (#546 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> In its current form, the entrypoint.sh script will run but fail with exit code 3 on the first docker compose up. Technically, running docker compose up a second time will not throw the same error and the application works fine. The new changes will improve the first time user experience and improve on some other aspects. Summary of Changes: 1- entrypoint.sh to not fail with exit code 3 on first docker up. 2- Improved error and signal handling with set -e. 3- Improved database migration, verification and error handling. Avoids schema version mismatch and ensures db schema is always in sync with application code. 4- Added exec before Gunicorn commands to ensure proper signal handling. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Improved error handling for smoother database migrations and startup. - Updated process management to ensure reliable application launch. - Optimized worker configuration and introduced a startup delay to guarantee database readiness. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: soekja <soekja@users.noreply.github.com> Co-authored-by: soekja <soekja@users.noreply.github.com> Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>	2025-02-21 01:28:15 +01:00
alekszievr	28f92f661e	Test: Mock file download and open in musique adapter (#571 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Tests - Enhanced test coverage to improve adapter instantiation and data loading reliability. - Updated mock testing logic to ensure robust content handling. - Removed an outdated test focused on data limit validation. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-20 16:11:19 +01:00
alekszievr	97db017708	Test: test corpus builder [cog-1234] (#564 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Enhanced the continuous integration workflows with updated dependency management and environment configurations for improved test stability. - Tests - Added parameterized unit tests to verify corpus loading and structure, ensuring more robust handling of test data. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-20 15:16:58 +01:00
alekszievr	17231de5d0	Test: Parse context pieces separately in MusiqueQAAdapter and adjust tests [cog-1234] (#561 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Tests - Updated evaluation checks by removing assertions related to the relationship between `corpus_list` and `qa_pairs`, now focusing solely on `qa_pairs` limits. - Refactor - Improved content processing to append each paragraph individually to `corpus_list`, enhancing clarity in data structure. - Simplified type annotations in the `load_corpus` method across multiple adapters, ensuring consistency in return types. - Chores - Updated dependency installation commands in GitHub Actions workflows for Python 3.10, 3.11, and 3.12 to include additional evaluation-related dependencies. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>	2025-02-20 14:23:53 +01:00
lxobr	e25c7c93fe	fix: correctly add nodes to chunks [COG-1370] (#568 ) <!-- .github/pull_request_template.md --> ## Description - Fix expand_with_nodes_and_edges to correctly add nodes to chunks ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Refactor - Enhanced the internal processing for data associations to ensure more reliable and consistent handling of connections. - Streamlined the logic to better manage edge cases, improving overall stability and error handling. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-20 12:52:34 +01:00
Igor Ilic	f2e0f47565	fix: test llm connection with gemini (#557 ) <!-- .github/pull_request_template.md --> ## Description Temporary fix for Gemini LLM until they allow empty dictionaries in model schema definition ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - AI responses now adjust their format dynamically based on the type of output, providing a streamlined text display when appropriate. - Extended processing time improves the handling of longer operations for a more reliable interaction. - Bug Fixes - Enhanced error management during connectivity tests ensures a more robust and stable user experience. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Boris <boris@topoteretes.com>	2025-02-20 11:41:29 +01:00
Boris	45f7c63322	fix: notebooks errors (#565 ) <!-- .github/pull_request_template.md --> ## Description <!-- Provide a clear description of the changes in this PR --> ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Automatically creates a blank graph when a file isn’t found, ensuring smoother operations. - Updated demonstration notebooks with dynamic configurations, including refined search operations and input prompts. - Introduced optional support for additional graph functionalities via an integrated dependency. - Refactor - Streamlined processing by eliminating duplicate steps and simplifying graph rendering workflows. - Chores - Updated environment configurations and upgraded the Python runtime for improved performance and consistency. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-02-19 14:07:11 -08:00
Boris Arzentar	811e932cae	version: v0.1.29	2025-02-19 20:19:51 +01:00

2211 commits