Commit graph

84 commits

Author SHA1 Message Date
Boris
f9e6dcf837
fix: simplify code pipeline (#529)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Enhanced code search and dependency analysis for improved accuracy.
  - Introduced a new high-performance text embedding option.
  - Added an additional execution entry point for code graph processing.
  - New optional parameters for flexible property selection in retrieval
    functions.
  - Introduced new classes for handling import statements, function
    definitions, and class definitions.
  - Updated embedding engine selection based on configuration options.

- **Bug Fixes**
  - Improved error handling in search operations and database queries for
    a more stable user experience.
  - Enhanced error logging for source code parsing.

- **Refactor**
  - Streamlined asynchronous processing and refactored internal dependency
    extraction.
  - Updated configuration and integration settings to enhance overall
    reliability.
  - Restructured functions for simplified dependency handling.

- **Chores**
  - Upgraded and reorganized dependency management with optional libraries
    for extended functionality.
  - Added new secret parameters for embedding configuration in workflow
    settings.

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-12 23:58:48 +01:00
Vasilije
9ba2e0d6c1
chore: Fix and update visualization (#518)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Introduced enhanced visualization capabilities that let users launch a
    dedicated server for visual displays.

- **Documentation**
  - Updated several interactive notebooks to include execution outputs and
    expanded explanatory content for better user guidance.

- **Style**
  - Refined formatting and layout across notebooks to ensure consistent
    presentation and improved readability.

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-02-11 19:25:01 +01:00
Igor Ilic
2a04fa3738
fix: Resolve llama-index integration issue with new cognee version [COG-1270] (#515)

## Description
Change version to latest llama index cognee integration version which
has a proper fix for the failing notebook

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **Chores**
  - Updated an AI integration dependency to version 0.1.3 in both the
    testing workflow and the Jupyter notebook, ensuring that the environment
    uses the latest version for improved consistency during tests.

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-02-11 18:28:31 +01:00
alekszievr
05ba29af01
Feat: log pipeline status and pass it through pipeline [COG-1214] (#501)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Enhanced pipeline execution now provides consolidated status feedback
    with improved telemetry for start, completion, and error events.
  - Automatic generation of unique dataset identifiers offers clearer task
    and pipeline run associations.

- **Refactor**
  - Task execution has been streamlined with explicit parameter handling
    for more structured pipeline processing.
  - Interactive examples and demos now return results directly, making
    integration and monitoring more accessible.

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-02-11 16:41:40 +01:00
Igor Ilic
3850e9c7a1
Cognee simple document example (#521)

## Description
Notebook and Python example for the cognee simple example

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Introduced an interactive demo showcasing asynchronous document
    processing and querying for key insights from a sample text.

- **Documentation**
  - Added an in-depth, step-by-step guide in a Jupyter Notebook that walks
    users through setup, configuration, querying, and visualizing processed
    data.
2025-02-11 13:58:35 +01:00
Boris
f75e35c337
fix: custom model pipeline (#508)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **New Features**
  - Graph visualizations now allow exporting to a user-specified file path
    for more flexible output management.
  - The text embedding process has been enhanced with an additional
    tokenizer option for improved performance.
  - A new `ExtendableDataPoint` class has been introduced for future
    extensions.
  - New JSON files for companies and individuals have been added to
    facilitate testing and data processing.

- **Improvements**
  - Search functionality now uses updated identifiers for more reliable
    content retrieval.
  - Metadata handling has been streamlined across various classes by
    removing unnecessary type specifications.
  - Enhanced serialization of properties in the Neo4j adapter for improved
    handling of complex structures.
  - The setup process for databases has been improved with a new
    asynchronous setup function.

- **Chores**
  - Dependency and configuration updates improve overall stability and
    performance.
2025-02-08 02:00:15 +01:00
Igor Ilic
5fe7ff9883
refactor: Refactor search so graph completion is used by default (#505)

## Description
Refactor search so query type doesn't need to be provided to make it
simpler for new users

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **Refactor**
  - Improved the search interface by standardizing parameter usage with
    explicit keyword arguments for specifying search types, enhancing
    clarity and consistency.

- **Tests**
  - Updated test cases and example integrations to align with the revised
    search parameters, ensuring consistent behavior and reliable validation
    of search outcomes.
2025-02-07 17:16:34 +01:00
Igor Ilic
3e29c3d8f2 docs: Update notebook to work with changes to max chunk tokens 2025-01-28 15:38:38 +01:00
Hande
0fb19ca21d
docs: update readme with "How Cognee Solves Real-World Pain Points" 2025-01-27 11:36:11 +01:00
hande-k
52e5b5c6f4 edit cognee_hotpot_eval.ipynb 2025-01-24 13:46:50 +01:00
alekszievr
bc1b05437e
Merge branch 'dev' into cog-1069-update-notebooks-evals 2025-01-23 20:32:30 +01:00
hande-k
cdecf5fb8f add short description in md 2025-01-23 11:17:48 +01:00
hande-k
343de01d5a update notebooks with latest eval 2025-01-23 11:11:51 +01:00
Igor Ilic
77f0b45a0d refactor: Resolve issue with notebook after metadata refactor
Resolve issue with LlamaIndex notebook after refactor
2025-01-20 18:02:57 +01:00
hajdul88
9e63bacaa7
Merge branch 'dev' into feature/cog-761-project-graphiti-graph-to-memory 2025-01-15 11:49:10 +01:00
hajdul88
1db44de7de feat: adds graphiti demo notebook 2025-01-15 11:45:06 +01:00
Igor Ilic
259414add0 docs: Update LlamaIndex integration notebook 2025-01-14 15:32:27 +01:00
Igor Ilic
adee79d7a5 fix: Change nbformat on llama index integration notebook 2025-01-13 11:54:05 +01:00
Igor Ilic
9a4613a9dd docs: Add LlamaIndex Cognee integration notebook
Added LlamaIndex Cognee integration notebook
2025-01-10 16:49:23 +01:00
vasilije
cbd15b98a5 Fix linter issues 2025-01-05 20:24:04 +01:00
vasilije
2675836149 Fix linter issues 2025-01-05 20:17:49 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
Vasilije
d3739d91c9
Merge branch 'dev' into test 2024-12-19 19:05:26 +01:00
vasilije
ae886ba585 Fix langfuse 2024-12-19 19:03:41 +01:00
vasilije
f9fea0a0c8 Fix langfuse 2024-12-19 19:03:26 +01:00
Rita Aleksziev
e20403d0f4 Rename eval notebook 2024-12-19 14:23:55 +01:00
Rita Aleksziev
b756927293 Add evaluation notebook 2024-12-19 14:22:33 +01:00
Boris Arzentar
aa46bb3d64 fix: enable checks for dev 2024-12-12 12:46:26 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector (#261)

* fix: cognee_demo notebook pipeline is not saving summaries

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Boris Arzentar
d49ab4c3b5 feat: update code-graph notebook 2024-12-03 23:48:12 +01:00
Vasilije
9d6081c7f7
feat: Add support for multiple audio and image formats (#12)
Added support for multiple audio and image formats with example

The added formats are the extension values that the filetype library
can return for audio and images

Feature COG-507
2024-11-23 16:31:55 +01:00
Igor Ilic
61ed516d12 docs: Add multimedia notebook
Added multimedia notebook for cognee

Docs COG-507
2024-11-20 16:21:29 +01:00
Igor Ilic
f9353d25fa fix: Update table name in notebook
Update table name in notebook

Fix COG-677
2024-11-20 15:14:38 +01:00
Igor Ilic
70fe6ac541 fix: Update table name in notebook
Update table name to use latest in notebook

Fix COG-677
2024-11-20 15:07:38 +01:00
Igor Ilic
4b55354dce
fix: Resolve issue with pgvector timeout (#3)
By creating PGVector as a singleton, all timeout issues are resolved,
as there are no more parallel instances trying to communicate
with the database
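
The singleton approach described in this commit message can be sketched as follows. This is a minimal illustration, not the project's actual implementation: `PGVectorAdapterSketch`, `get_adapter`, and the connection string are hypothetical names, and `functools.lru_cache` is only one of several ways to make every caller share a single instance.

```python
from functools import lru_cache


class PGVectorAdapterSketch:
    """Hypothetical stand-in for a pgvector adapter (not cognee's real class)."""

    def __init__(self, connection_string: str) -> None:
        # A real adapter would create its engine or connection pool here.
        self.connection_string = connection_string


@lru_cache(maxsize=None)
def get_adapter(connection_string: str) -> PGVectorAdapterSketch:
    # lru_cache returns the same object for repeated calls with the same
    # argument, so all callers share one adapter instead of creating
    # parallel instances that talk to the database independently.
    return PGVectorAdapterSketch(connection_string)


first = get_adapter("postgresql://localhost/example")
second = get_adapter("postgresql://localhost/example")
print(first is second)  # True
```

With a shared instance, code paths that previously each built their own adapter all go through the same object.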
2024-11-19 15:31:26 +01:00
Boris
5f144a0f92
fix: make all checks green (#1) 2024-11-19 15:30:09 +01:00
Boris
d8b6eeded5
feat: log search queries and results (#166)
* feat: log search queries and results

* fix: address coderabbit review comments

* fix: parse UUID when logging search results

* fix: remove custom UUID type and use DB agnostic UUID from sqlalchemy

* Add new cognee_db

---------

Co-authored-by: Leon Luithlen <leon@topoteretes.com>
2024-11-17 11:59:10 +01:00
Igor Ilic
d30adb53f3
Cog 337 llama index support (#186)
* feat: Add support for LlamaIndex Document type

Added support for LlamaIndex Document type

Feature #COG-337

* docs: Add Jupyter Notebook for cognee with llama index document type

Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage

Docs #COG-337

* feat: Add metadata migration from LlamaIndex document type

Allow usage of metadata from LlamaIndex documents

Feature #COG-337

* refactor: Change llama index migration function name

Change name of llama index function

Refactor #COG-337

* chore: Add llama index core dependency

Downgrade needed on tenacity and instructor modules to support llama index

Chore #COG-337

* Feature: Add ingest_data_with_metadata task

Added task that will have access to metadata if data is provided from different data ingestion tools

Feature #COG-337

* docs: Add description on why specific type checking is done

Explained why specific type checking is used instead of isinstance, as isinstance returns True for child classes as well

Docs #COG-337

* fix: Add missing parameter to function call

Added missing parameter to function call

Fix #COG-337

* refactor: Move storing of data from async to sync function

Moved data storing from async to sync

Refactor #COG-337

* refactor: Pretend ingest_data was changed instead of having two tasks

Refactor so ingest_data file was modified instead of having two ingest tasks

Refactor #COG-337

* refactor: Use old name for data ingestion with metadata

Merged new and old data ingestion tasks into one

Refactor #COG-337

* refactor: Return ingest_data and save_data_to_storage Tasks

Returned ingest_data and save_data_to_storage tasks

Refactor #COG-337

* refactor: Return previous ingestion Tasks to add function

Returned previous ingestion tasks to add function

Refactor #COG-337

* fix: Remove dict and use string for search query

Remove dictionary and use string for query in notebook and simple example

Fix COG-337

* refactor: Add changes requested in pull request

Added the following changes that were requested in pull request:

Added synchronize label,
Made uniform syntax in if statement in workflow,
fixed instructor dependency,
added llama-index to be optional

Refactor COG-337

* fix: Resolve issue with llama-index being mandatory

Resolve issue with llama-index being mandatory to run cognee

Fix COG-337

* fix: Add install of llama-index to notebook

Removed additional references to llama-index from core cognee lib.
Added llama-index-core install from notebook

Fix COG-337
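
The "specific type checking" item in the list above notes that `isinstance` also returns True for child classes. A minimal sketch of that difference, assuming two hypothetical classes (`Document` and `ImageDocument` are illustrative names, not the LlamaIndex or cognee types):

```python
class Document:
    """Hypothetical base type, standing in for a third-party document class."""


class ImageDocument(Document):
    """Hypothetical subclass that should be handled differently."""


def is_exact_document(value: object) -> bool:
    # `type(value) is Document` matches only the exact class, not subclasses.
    return type(value) is Document


def is_document_or_subclass(value: object) -> bool:
    # `isinstance` also accepts instances of any subclass of Document.
    return isinstance(value, Document)


print(is_exact_document(Document()))             # True
print(is_exact_document(ImageDocument()))        # False
print(is_document_or_subclass(ImageDocument()))  # True
```

An exact-type check is the narrower test, which is useful when subclasses must take a different code path.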

---------
2024-11-17 11:47:08 +01:00
Boris Arzentar
7c015e525d fix: cognee_demo notebook search 2024-11-12 09:01:03 +01:00
Boris Arzentar
da4d9c2c3b fix: change entity collection name 2024-11-12 09:01:03 +01:00
Boris Arzentar
c0d1aa1216 fix: update entities collection name in cognee_demo notebook 2024-11-12 09:01:03 +01:00
Leon Luithlen
9fe1b6c5fa Add code_graph_demo notebook 2024-11-12 09:01:03 +01:00
Boris Arzentar
a2b1087c84 feat: add FalkorDB integration 2024-11-12 09:01:01 +01:00
Boris
52180eb6b5
feat: COG-184 add falkordb (#192)
* feat: add falkordb adapter

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-11-11 18:20:52 +01:00
Igor Ilic
23ed38d615 test: Fix intentional typo in notebook
Removed typo used for testing notebook github action

Test #COG-462
2024-10-29 14:24:17 +01:00
Igor Ilic
2ba57220d8 test: Add typo in notebook to test github action
Added typo in notebook to test if github action will catch the issue

Test #COG-462
2024-10-29 14:20:47 +01:00
Igor Ilic
c183742ad5 test: Add test for Jupyter notebook
Added testing of Jupyter notebook through github actions

Test #COG-462
2024-10-29 13:47:23 +01:00
Igor Ilic
6555f4e88e fix: Resolve chunking issue for notebook
Add cleaning of local data to resolve chunking issue with repeated notebook use

Fix
2024-10-27 22:33:20 +01:00
Boris
2f832b190c
fix: various fixes for the deployment
* fix: remove groups from UserRead model

* fix: add missing system dependencies for postgres

* fix: change vector db provider environment variable name

* fix: WeaviateAdapter retrieve bug

* fix: correctly return data point objects from retrieve method

* fix: align graph object properties

* feat: add node example
2024-10-22 11:26:48 +02:00
Igor Ilic
658b6df4c6 refactor: Remove architecture overview
Removed architecture overview from notebook for now

Refactor #COG-387
2024-10-11 17:57:51 +02:00