cognee

Author	SHA1	Message	Date
hajdul88	eba1515127	feat: quick fix dynamic collection handling in search (#567) [COG-1369] ## Description Fixes search dynamic collection mapping in graph completion search ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Adjusted graph processing to remove extraneous notifications when expected data elements are absent. - Updated query processing to ensure a more consistent selection of related data types. - Streamlined database error handling by aligning exception management with standard practices.	2025-02-21 13:45:42 +01:00
Boris	45f7c63322	fix: notebooks errors (#565) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Automatically creates a blank graph when a file isn’t found, ensuring smoother operations. - Updated demonstration notebooks with dynamic configurations, including refined search operations and input prompts. - Introduced optional support for additional graph functionalities via an integrated dependency. - Refactor - Streamlined processing by eliminating duplicate steps and simplifying graph rendering workflows. - Chores - Updated environment configurations and upgraded the Python runtime for improved performance and consistency.	2025-02-19 14:07:11 -08:00
alekszievr	e56d86b410	feat: Implement optional neo4j metrics and improve tests [cog-1262] (#556) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced graph analytics now offer detailed metrics—including shortest path lengths, diameter, and clustering coefficients—to provide deeper insights. - Added new functions for creating connected test graphs and validating metrics against predefined ground truth values. - Introduced a new JSON file containing metrics for connected and disconnected graph structures. - Improvements - Updated how graphs are projected to consistently use undirected representations, ensuring more accurate and reliable metric calculations. - Streamlined metric consistency checks across different graph processing methods for robust, reliable results. - Simplified testing logic by consolidating metric assertions into a single function call. - Chores - Removed unnecessary secret variables from the workflow configuration, potentially affecting access to certain resources. - Updated secret management to include the new `OPENAI_API_KEY`.	2025-02-19 16:24:59 +01:00
hajdul88	0bcaf5c477	Feature/cog 1358 local ollama model support for cognee (#555) This PR contains the ollama specific llm adapter together with the embedding engine. Tested with the following models: `LLM_API_KEY="ollama" llm_model = "llama3.1:8b" LLM_PROVIDER = "ollama" llm_endpoint = "http://localhost:11434/v1" EMBEDDING_PROVIDER="ollama" EMBEDDING_MODEL="avr/sfr-embedding-mistral:latest" EMBEDDING_ENDPOINT="http://localhost:11434/api/embeddings" EMBEDDING_DIMENSIONS=4096 HUGGINGFACE_TOKENIZER="Salesforce/SFR-Embedding-Mistral"` ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a new embedding option that leverages an external provider for asynchronous text processing. - Added enhanced language model integration using a dedicated adapter to improve interaction quality. - Enhancements - Expanded configuration settings to include a new tokenizer option. - Updated provider selection logic to incorporate the additional embedding and language model features. --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: vasilije <vas.markovic@gmail.com>	2025-02-19 02:54:04 +01:00
Boris	f9e6dcf837	fix: simplify code pipeline (#529) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced code search and dependency analysis for improved accuracy. - Introduced a new high-performance text embedding option. - Added an additional execution entry point for code graph processing. - New optional parameters for flexible property selection in retrieval functions. - Introduced new classes for handling import statements, function definitions, and class definitions. - Updated embedding engine selection based on configuration options. - Bug Fixes - Improved error handling in search operations and database queries for a more stable user experience. - Enhanced error logging for source code parsing. - Refactor - Streamlined asynchronous processing and refactored internal dependency extraction. - Updated configuration and integration settings to enhance overall reliability. - Restructured functions for simplified dependency handling. - Chores - Upgraded and reorganized dependency management with optional libraries for extended functionality. - Added new secret parameters for embedding configuration in workflow settings. --------- Co-authored-by: vasilije <vas.markovic@gmail.com>	2025-02-12 23:58:48 +01:00
Boris	8f84713b54	fix: support structured data conversion to data points (#512) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced version tracking and enhanced metadata in core data models for improved data consistency. - Bug Fixes - Improved error handling during graph data loading to prevent disruptions from unexpected identifier formats. - Refactor - Centralized identifier parsing and streamlined model definitions, ensuring smoother and more consistent operations across search, retrieval, and indexing workflows.	2025-02-10 17:16:13 +01:00
Boris	f75e35c337	fix: custom model pipeline (#508) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features • Graph visualizations now allow exporting to a user-specified file path for more flexible output management. • The text embedding process has been enhanced with an additional tokenizer option for improved performance. • A new `ExtendableDataPoint` class has been introduced for future extensions. • New JSON files for companies and individuals have been added to facilitate testing and data processing. - Improvements • Search functionality now uses updated identifiers for more reliable content retrieval. • Metadata handling has been streamlined across various classes by removing unnecessary type specifications. • Enhanced serialization of properties in the Neo4j adapter for improved handling of complex structures. • The setup process for databases has been improved with a new asynchronous setup function. - Chores • Dependency and configuration updates improve overall stability and performance.	2025-02-08 02:00:15 +01:00
alekszievr	2e842652be	Fix diameter and shortest path calculation in networkx adapter [COG-1201] (#507) …nnected graph ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Bug Fixes - Enhanced reliability of graph metric calculations to gracefully handle unexpected inputs, ensuring smoother and uninterrupted graph analysis for end-users.	2025-02-08 00:15:26 +01:00
alekszievr	8396fed9a1	feat: metrics in neo4j adapter [COG-1082] (#487) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced graph management capabilities allow users to verify graph existence, project complete graphs, and remove graphs, delivering more comprehensive graph insights. - Refactor - Adjusted default task behavior for streamlined performance. - Updated timestamp handling to ensure accurate and consistent record tracking. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>	2025-02-07 15:58:43 +01:00
Igor Ilic	df163b0431	Add pydantic settings checker (#497) ## Description Add test of embedding and LLM model at beginning of cognee use Fix issue with relational database async use Refactor handling of cache mechanism for all databases so changes in config can be reflected in get functions ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced connection testing for language and embedding services at startup, ensuring improved reliability during data addition. - Refactor - Streamlined engine initialization across multiple database systems to enhance performance and clarity. - Improved parameter handling and caching strategies for faster, more consistent operations. - Updated record identifiers for more robust and unique data storage. --------- Co-authored-by: holchan <61059652+holchan@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-02-04 23:18:27 +01:00
Igor Ilic	1260fc7db0	fix: Add reraising of general exception handling in cognee [COG-1062] (#490) ## Description Add re-raising of errors in general exception handling ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Bug Fixes & Stability Improvements - Enhanced error handling throughout the system, ensuring issues during operations like server startup, data processing, and graph management are properly logged and reported. - Refactor - Standardized logging practices replace basic output statements, improving traceability and providing better insights for troubleshooting. - New Features - Updated search functionality now returns only unique results, enhancing data consistency and the overall user experience. --------- Co-authored-by: holchan <61059652+holchan@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-02-04 10:51:05 +01:00
alekszievr	2858a674f5	feat: Calculate graph metrics for networkx graph [COG-1082] (#484) ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enabled an option to retrieve more detailed metrics, providing comprehensive analytics for graph and descriptive data. - Refactor - Standardized the way metrics are obtained across components for consistent behavior and improved data accuracy. - Chore - Made internal enhancements to support optional detailed metric calculations, streamlining system performance and ensuring future scalability. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>	2025-02-03 18:05:53 +01:00
alekszievr	5119992fd8	feat: Add graph metrics getter in graph db interface and adapters [COG-1082] (#483) Dummy implementation of graph metrics to demonstrate how the interface will look like ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced asynchronous functionality for retrieving comprehensive graph metrics, including counts and connectivity details, across different systems. - Refactor - Streamlined metrics processing and storage by shifting to direct retrieval from the graph engine. - Updated naming conventions for the `GraphMetrics` database table and reorganized module imports to enhance internal consistency. - Chores - Removed dataset deletion functionalities while introducing the ability to store descriptive metrics. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>	2025-02-03 15:25:04 +01:00
Igor Ilic	8879f3fbbe	feat: Add gemini support [COG-1023] (#485) ## Description PR to test Gemini PR from holchan 1. Add Gemini LLM and Gemini Embedding support 2. Fix CodeGraph issue with chunks being bigger than maximum token value 3. Add Tokenizer adapters to CodeGraph ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Added support for the Gemini LLM provider. - Expanded LLM configuration options. - Introduced a new GitHub Actions workflow for multimetric QA evaluation. - Added new environment variables for LLM and embedding configurations across various workflows. - Bug Fixes - Improved error handling in various components. - Updated tokenization and embedding processes. - Removed warning related to missing `dict` method in data items. - Refactor - Simplified token extraction and decoding methods. - Updated tokenizer interfaces. - Removed deprecated dependencies. - Enhanced retry logic and error handling in embedding processes. - Documentation - Updated configuration comments and settings. - Chores - Updated GitHub Actions workflows to accommodate new secrets and environment variables. - Modified evaluation parameters. - Adjusted dependency management for optional libraries. --------- Co-authored-by: holchan <61059652+holchan@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-01-31 18:03:23 +01:00
hajdul88	f843c256e4	feat: Use unwind for batch edge save and add unit tests for get_graph_from_model * feat: adds some unit tests for get_graph_from_model * feat: updates neo4j add_edges cypher and deletes shallow get_graph_from_model * fix: fixing merge conflict false resolve * chore: deletes old only_root unit test	2025-01-31 13:14:04 +01:00
Igor Ilic	49f60971bb	Merge branch 'dev' into COG-970-refactor-tokenizing	2025-01-28 10:12:55 +01:00
Igor Ilic	0a9f1349f2	refactor: Change variable and function names based on PR comments Change variable and function names based on PR comments	2025-01-28 10:10:29 +01:00
Igor Ilic	89d4b7a5c4	Merge branch 'dev' into pgvector-add-normalization	2025-01-24 19:24:39 +01:00
Igor Ilic	23ecf245ed	fix: Return string conversion to resolve traceback	2025-01-24 19:20:55 +01:00
Igor Ilic	b0cec3fcaa	refactor: Remove conversion to string	2025-01-24 19:03:57 +01:00
Igor Ilic	ffbb387580	Merge branch 'dev' into fix-insert-data	2025-01-24 18:55:41 +01:00
Igor Ilic	7dea1d54d7	refactor: Add specific max token values to embedding models	2025-01-23 18:18:45 +01:00
Igor Ilic	6d5679f9d2	Merge branch 'dev' into COG-970-refactor-tokenizing	2025-01-23 18:14:49 +01:00
Igor Ilic	1319944dcd	docs: Update .env.template to include llm and embedding options	2025-01-23 18:05:45 +01:00
Igor Ilic	b686376c54	feat: Add gemini tokenizer to cognee	2025-01-23 17:55:04 +01:00
Igor Ilic	294ed1d960	feat: Add HuggingFace Tokenizer support	2025-01-23 16:52:35 +01:00
Igor Ilic	2e1a48e22c	docs: Add usage example of function	2025-01-23 15:13:46 +01:00
Igor Ilic	de19016494	fix: Add flag to allow SQLite to use foreign keys	2025-01-23 15:10:27 +01:00
Igor Ilic	d4453e4a1d	fix: Add support for SQLite and PostgreSQL for inserting data in SQLAlchemyAdapter	2025-01-23 14:59:02 +01:00
Boris Arzentar	e577276d91	Merge remote-tracking branch 'origin/dev' into feat/COG-1058-fastmcp	2025-01-23 11:46:25 +01:00
Boris Arzentar	00f302c37a	feat: use fastmcp for mcp server	2025-01-23 11:45:40 +01:00
Igor Ilic	9f6a0ba783	Merge branch 'dev' into pgvector-add-normalization	2025-01-23 11:11:43 +01:00
Igor Ilic	93249c72c5	fix: Initial commit to resolve issue with using tokenizer based on LLMs Currently TikToken is used for tokenizing by default which is only supported by OpenAI, this is an initial commit in an attempt to add Cognee tokenizing support for multiple LLMs	2025-01-21 19:53:22 +01:00
Igor Ilic	bd3a5a758c	Merge branch 'dev' into COG-793-metadata-rework	2025-01-20 18:06:21 +01:00
Igor Ilic	49ad292592	refactor: Reduce complexity of metadata handling Have foreign metadata be a table column in data instead of its own table to reduce complexity Refactor COG-793	2025-01-20 16:39:05 +01:00
hajdul88	813a03c6e2	Merge branch 'dev' into pgvector-add-normalization	2025-01-20 13:46:50 +01:00
Igor Ilic	2546844787	feat: Add normalization to PGVector search Add normalization to PGVector search results	2025-01-20 13:42:39 +01:00
hajdul88	bf70705ed0	Fix: fixes networkx failed to load graph from file error	2025-01-20 12:19:34 +01:00
hajdul88	6e691885e6	Merge branch 'dev' into feature/cog-186-run-cognee-on-windows	2025-01-17 09:06:00 +01:00
hajdul88	935763b08d	fix: fixing changed lancedb search + pruning	2025-01-16 17:32:44 +01:00
vasilije	0a02886d76	Update format	2025-01-16 13:28:35 +01:00
hajdul88	124a26335e	feat: changes model independent edge method	2025-01-14 13:58:56 +01:00
hajdul88	5e9471ebad	fix: removes get_model_independent_graph method from abstract class as graphiti does not support networkx	2025-01-14 09:16:46 +01:00
hajdul88	c351047c36	feat: adds cognee node and edge embeddings for graphiti graph	2025-01-13 17:22:59 +01:00
Rita Aleksziev	872bc89648	Format with Ruff 0.9.0	2025-01-10 15:11:00 +01:00
vasilije	76a0aa7e8b	Fix linter issues	2025-01-05 19:48:35 +01:00
vasilije	6dafe73a6b	Fix linter issues	2025-01-05 19:24:55 +01:00
vasilije	649fcf2ba8	Fix linter issues	2025-01-05 19:21:09 +01:00
vasilije	60c8fd103b	ruff format	2025-01-05 19:09:08 +01:00
Vasilije	e631161eb1	Merge branch 'dev' into COG-475-local-file-endpoint-deletion	2024-12-26 21:05:57 +01:00

248 commits