cognee

Author	SHA1	Message	Date
Boris Arzentar	2e4aab9a9a	fix: example ruff errors	2025-03-11 16:44:00 +01:00
hibajamal	56427f287e	Demo for relational db with cognee (#620) ## Description This demo uses pydantic models and dlt to pull data from the Pokémon API and structure it into a relational format. By feeding this structured data into cognee, it makes searching across multiple tables easier and more intuitive, thanks to the relational model. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a comprehensive Pokémon data processing pipeline, available as both a Python script and an interactive Jupyter Notebook. - Enabled asynchronous operations for efficient data collection and querying, including an integrated search functionality. - Improved error handling and data validation during the data fetching and processing stages for a smoother user experience. Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>	2025-03-08 20:33:42 +01:00
lxobr	f033f733b5	feat: entity brute force triplet search [COG-1325] (#589) ## Description - Refactored `brute_force_triplet_search`, extracting memory projection. - Built TripletSearchContextProvider (extends BaseContextProvider) to create a single memory projection and perform a triplet search for each entity. - Refactored `entity_completion` into EntityCompletionRetriever (extends BaseRetriever). - Added SummarizedTripletSearchContextProvider (extends TripletSearchContextProvider) for an alternative summarized output format. - Developed and tested an example showcasing both context providers, comparing raw triplets, summaries, and standard search results. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced text summarization now delivers clearer, more concise overviews of search results. - Improved search performance with optimized context retrieval and memory reuse for faster, more reliable results. - Introduced advanced entity-based completion for generating more relevant, context-aware responses. - Refactor - Streamlined internal workflows and error handling to ensure a smoother overall experience. --------- Co-authored-by: Boris <boris@topoteretes.com>	2025-03-05 11:17:58 +01:00
alekszievr	6d7a68dbba	Feat: Store descriptive metrics identified by pipeline run id [cog-1260] (#582) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a new analytic capability that calculates descriptive graph metrics for pipeline runs when enabled. - Updated the execution flow to include an option for activating the graph metrics step. - Chores - Removed the previous mechanism for storing descriptive metrics to streamline the system. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-03-03 19:09:35 +01:00
Daniel Molnar	d27f847753	Transition to new retrievers, update searches (#585) ## Description Delete legacy search implementations after migrating to new retriever classes ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced search and retrieval capabilities, providing improved context resolution for code queries, completions, summaries, and graph connections. - Refactor - Shifted to a modular, object-oriented approach that consolidates query logic and streamlines error management for a more robust and scalable experience. - Bug Fixes - Improved error handling for unsupported search types and retrieval operations.	2025-02-27 15:25:24 +01:00
lxobr	9cc357ac1c	Feat/cog 1365 unify retrievers (#572) ## Description - Created the `BaseRetriever` class to unify all the retrievers and searches. - Implemented seven specialized retrievers (summaries, chunks, completions, graph, graph-summary, insights, code) with consistent get_context/get_completion interfaces. - Added json context dumping feature in the current completion implementations to enable context comparisons. - Built a comparison framework to validate old vs new implementations. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced multiple retrieval classes for enhanced search capabilities, including `BaseRetriever`, `ChunksRetriever`, `CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`, `GraphSummaryCompletionRetriever`, `InsightsRetriever`, and `SummariesRetriever`. - Enhanced query completions with optional context saving for improved data persistence. - Implemented advanced tools to compare retrieval outcomes across different implementations. - Refactor - Streamlined internal module organization and updated references for increased maintainability and consistency. - Added comments indicating future maintenance tasks related to code merging. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-02-27 12:13:21 +01:00
Boris	ada466879e	fix: add default params to run_tasks (#563) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced the task execution process by enabling default values for certain parameters, allowing users to trigger task processing without supplying every input explicitly. - Bug Fixes - Adjusted asynchronous handling for the `retrieved_edges_to_string` function to ensure proper execution flow in various components. - Documentation - Updated markdown formatting in the Jupyter notebook for improved readability and structure. --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-02-19 20:18:51 +01:00
alekszievr	05ba29af01	Feat: log pipeline status and pass it through pipeline [COG-1214] (#501) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced pipeline execution now provides consolidated status feedback with improved telemetry for start, completion, and error events. - Automatic generation of unique dataset identifiers offers clearer task and pipeline run associations. - Refactor - Task execution has been streamlined with explicit parameter handling for more structured pipeline processing. - Interactive examples and demos now return results directly, making integration and monitoring more accessible. --------- Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>	2025-02-11 16:41:40 +01:00
Igor Ilic	3850e9c7a1	Cognee simple document example (#521) ## Description Notebook and python example for cognee simple example ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced an interactive demo showcasing asynchronous document processing and querying for key insights from a sample text. - Documentation - Added an in-depth, step-by-step guide in a Jupyter Notebook that walks users through setup, configuration, querying, and visualizing processed data.	2025-02-11 13:58:35 +01:00
Igor Ilic	5fe7ff9883	refactor: Refactor search so graph completion is used by default (#505) ## Description Refactor search so query type doesn't need to be provided to make it simpler for new users ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Improved the search interface by standardizing parameter usage with explicit keyword arguments for specifying search types, enhancing clarity and consistency. - Tests - Updated test cases and example integrations to align with the revised search parameters, ensuring consistent behavior and reliable validation of search outcomes.	2025-02-07 17:16:34 +01:00
hajdul88	08c22a542a	fix: fixes typo in multimedia example	2025-01-17 09:31:48 +01:00
hajdul88	981f35c1e0	fix: fixes windows compatibility in examples	2025-01-17 09:28:10 +01:00
hajdul88	bd6aafe9b7	fix: fixes event loop handling on windows in dynamic steps example	2025-01-16 18:17:11 +01:00
hajdul88	1db44de7de	feat: adds graphiti demo notebook	2025-01-15 11:45:06 +01:00
hajdul88	d0646a1694	feat: Implements generation and retrieval and adjusts imports	2025-01-14 13:59:27 +01:00
hajdul88	c351047c36	feat: adds cognee node and edge embeddings for graphiti graph	2025-01-13 17:22:59 +01:00
hajdul88	ea8628c527	Fix: Fixes logging setup	2025-01-13 09:49:56 +01:00
Rita Aleksziev	a11b914f39	Merge branch 'dev' into COG-949	2025-01-10 10:02:56 +01:00
hajdul88	341f30fcdc	fix: Fixes ruff formatting	2025-01-09 12:00:49 +01:00
hajdul88	fe57eb69e7	Merge branch 'dev' into feature/cog-967-adding-graph-completion-feature-to-cognee	2025-01-09 11:07:19 +01:00
Rita Aleksziev	5635da6e38	Adjust unit tests	2025-01-09 10:53:03 +01:00
hajdul88	d39140f28b	feat: implements the first version of graph based completion in search	2025-01-08 16:10:29 +01:00
vasilije	41b1486cff	Fix visualization	2025-01-08 13:13:52 +01:00
Rita Aleksziev	f4397bf940	Remove setting envvars from arg	2025-01-08 12:33:14 +01:00
Rita Aleksziev	8ffef5034a	Add clean logging to code graph example	2025-01-08 12:25:31 +01:00
hajdul88	18c8bc3c33	Merge branch 'dev' into COG-adding_html_graph_render	2025-01-08 10:44:11 +01:00
alekszievr	0dec704445	Merge branch 'dev' into COG-949	2025-01-08 10:21:07 +01:00
hajdul88	58da2d9e57	fix: Fixes faulty logging format and sets up error logging in dynamic steps example	2025-01-07 11:01:37 +01:00
lxobr	5e79dc53c5	feat: time code graph run and add mock support	2025-01-06 11:25:04 +01:00
vasilije	60c8fd103b	ruff format	2025-01-05 19:09:08 +01:00
lxobr	262deee26e	Cog 813 source code chunks (#383) * fix: pass the list of all CodeFiles to enrichment task * feat: introduce SourceCodeChunk, update metadata * feat: get_source_code_chunks code graph pipeline task * feat: integrate get_source_code_chunks task, comment out summarize_code * Fix code summarization (#387) * feat: update data models * feat: naive parse long strings in source code * fix: get_non_py_files instead of get_non_code_files * fix: limit recursion, add comment * handle embedding empty input error (#398) * feat: robustly handle CodeFile source code * refactor: sort imports * todo: add support for other embedding models * feat: add custom logger * feat: add robustness to get_source_code_chunks Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * feat: improve embedding exceptions * refactor: format indents, rename module --------- Co-authored-by: alekszievr <44192193+alekszievr@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2024-12-26 13:53:38 +01:00
alekszievr	de2394c392	Ingest non-code files (#395) * Ingest non-code files * Fixing review findings	2024-12-20 14:06:40 +01:00
lxobr	da5e3ab24d	COG 870 Remove duplicate edges from the code graph (#293) * feat: turn summarize_code into generator * feat: extract run_code_graph_pipeline, update the pipeline * feat: minimal code graph example * refactor: update argument * refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline * refactor: indentation and whitespace nits * refactor: add deprecated use comments and warnings --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2024-12-17 12:02:25 +01:00
hajdul88	68c3f42ab8	Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases	2024-12-05 09:08:37 +01:00
Rita Aleksziev	dd94781033	Integrate graphiti's functionality as Tasks	2024-12-04 16:33:26 +01:00
hajdul88	46ee513f6c	chore: deletes comment from dynamic_steps_example	2024-12-04 14:59:01 +01:00
hajdul88	59f8ec665f	Merge remote-tracking branch 'origin/main' into feature/cog-537-implement-retrieval-algorithm-from-research-paper	2024-11-26 16:38:32 +01:00
hajdul88	db07179856	chore: Adds error handling to brute force triplet search	2024-11-26 16:17:57 +01:00
hajdul88	c66c43e717	chore: places retrievers under modules directory	2024-11-26 15:44:11 +01:00
hajdul88	a59517409c	chore: Fixes some of the issues based on PR review + restructures things	2024-11-26 14:45:48 +01:00
Vasilije	9d6081c7f7	feat: Add support for multiple audio and image formats (#12) Added support for multiple audio and image formats with example The formats added are the possible filetype library return values for extension for Audio and Images Feature COG-507	2024-11-23 16:31:55 +01:00
hande-k	157d7d217d	docs: added cognify steps in the print statement and commented example output	2024-11-21 13:57:42 +01:00
hajdul88	b5d9e7a6d2	chore: adds return value and sets tue entry point kg generation to true	2024-11-20 19:03:32 +01:00
hajdul88	a114d68aef	feat: Implements basic global triplet optimizing retrieval	2024-11-20 18:33:34 +01:00
Igor Ilic	57783a979a	feat: Add support for multiple audio and image formats Added support for multiple audio and image formats with example Feature COG-507	2024-11-20 14:03:14 +01:00
hande-k	c6e447f28c	docs: add print statements to the simple example, update README	2024-11-20 08:47:02 +01:00
hajdul88	c4850f64dc	feat: Implements pipeline structure for retrievers	2024-11-19 11:14:42 +01:00
Rita Aleksziev	07b1956b6e	Fix syntax in simple example	2024-11-19 09:55:21 +01:00
Boris	c045f737f7	feat: add vector and graph dbs state to README file (#235)	2024-11-18 17:51:41 +01:00
Igor Ilic	d30adb53f3	Cog 337 llama index support (#186) * feat: Add support for LlamaIndex Document type Added support for LlamaIndex Document type Feature #COG-337 * docs: Add Jupyer Notebook for cognee with llama index document type Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage Docs #COG-337 * feat: Add metadata migration from LlamaIndex document type Allow usage of metadata from LlamaIndex documents Feature #COG-337 * refactor: Change llama index migration function name Change name of llama index function Refactor #COG-337 * chore: Add llama index core dependency Downgrade needed on tenacity and instructor modules to support llama index Chore #COG-337 * Feature: Add ingest_data_with_metadata task Added task that will have access to metadata if data is provided from different data ingestion tools Feature #COG-337 * docs: Add description on why specific type checking is done Explained why specific type checking is used instead of isinstance, as isinstace returns True for child classes as well Docs #COG-337 * fix: Add missing parameter to function call Added missing parameter to function call Fix #COG-337 * refactor: Move storing of data from async to sync function Moved data storing from async to sync Refactor #COG-337 * refactor: Pretend ingest_data was changes instead of having two tasks Refactor so ingest_data file was modified instead of having two ingest tasks Refactor #COG-337 * refactor: Use old name for data ingestion with metadata Merged new and old data ingestion tasks into one Refactor #COG-337 * refactor: Return ingest_data and save_data_to_storage Tasks Returned ingest_data and save_data_to_storage tasks Refactor #COG-337 * refactor: Return previous ingestion Tasks to add function Returned previous ignestion tasks to add function Refactor #COG-337 * fix: Remove dict and use string for search query Remove dictionary and use string for query in notebook and simple example Fix COG-337 * refactor: Add changes request in pull request Added the following changes that were requested in pull request: Added synchronize label, Made uniform syntax in if statement in workflow, fixed instructor dependency, added llama-index to be optional Refactor COG-337 * fix: Resolve issue with llama-index being mandatory Resolve issue with llama-index being mandatory to run cognee Fix COG-337 * fix: Add install of llama-index to notebook Removed additional references to llama-index from core cognee lib. Added llama-index-core install from notebook Fix COG-337 ---------	2024-11-17 11:47:08 +01:00


53 commits