Commit graph

44 commits

Author SHA1 Message Date
Igor Ilic
5fe7ff9883
refactor: Refactor search so graph completion is used by default (#505)

## Description
Refactor search so that the query type no longer needs to be provided,
which makes it simpler for new users

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit

- **Refactor**
- Improved the search interface by standardizing parameter usage with
explicit keyword arguments for specifying search types, enhancing
clarity and consistency.
- **Tests**
- Updated test cases and example integrations to align with the revised
search parameters, ensuring consistent behavior and reliable validation
of search outcomes.
2025-02-07 17:16:34 +01:00
hajdul88
08c22a542a fix: fixes typo in multimedia example 2025-01-17 09:31:48 +01:00
hajdul88
981f35c1e0 fix: fixes windows compatibility in examples 2025-01-17 09:28:10 +01:00
hajdul88
bd6aafe9b7 fix: fixes event loop handling on windows in dynamic steps example 2025-01-16 18:17:11 +01:00
hajdul88
1db44de7de feat: adds graphiti demo notebook 2025-01-15 11:45:06 +01:00
hajdul88
d0646a1694 feat: Implements generation and retrieval and adjusts imports 2025-01-14 13:59:27 +01:00
hajdul88
c351047c36 feat: adds cognee node and edge embeddings for graphiti graph 2025-01-13 17:22:59 +01:00
hajdul88
ea8628c527 Fix: Fixes logging setup 2025-01-13 09:49:56 +01:00
Rita Aleksziev
a11b914f39 Merge branch 'dev' into COG-949 2025-01-10 10:02:56 +01:00
hajdul88
341f30fcdc fix: Fixes ruff formatting 2025-01-09 12:00:49 +01:00
hajdul88
fe57eb69e7
Merge branch 'dev' into feature/cog-967-adding-graph-completion-feature-to-cognee 2025-01-09 11:07:19 +01:00
Rita Aleksziev
5635da6e38 Adjust unit tests 2025-01-09 10:53:03 +01:00
hajdul88
d39140f28b feat: implements the first version of graph based completion in search 2025-01-08 16:10:29 +01:00
vasilije
41b1486cff Fix visualization 2025-01-08 13:13:52 +01:00
Rita Aleksziev
f4397bf940 Remove setting envvars from arg 2025-01-08 12:33:14 +01:00
Rita Aleksziev
8ffef5034a Add clean logging to code graph example 2025-01-08 12:25:31 +01:00
hajdul88
18c8bc3c33
Merge branch 'dev' into COG-adding_html_graph_render 2025-01-08 10:44:11 +01:00
alekszievr
0dec704445
Merge branch 'dev' into COG-949 2025-01-08 10:21:07 +01:00
hajdul88
58da2d9e57 fix: Fixes faulty logging format and sets up error logging in dynamic steps example 2025-01-07 11:01:37 +01:00
lxobr
5e79dc53c5 feat: time code graph run and add mock support 2025-01-06 11:25:04 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
lxobr
262deee26e
Cog 813 source code chunks (#383)
* fix: pass the list of all CodeFiles to enrichment task

* feat: introduce SourceCodeChunk, update metadata

* feat: get_source_code_chunks code graph pipeline task

* feat: integrate get_source_code_chunks task, comment out summarize_code

* Fix code summarization (#387)

* feat: update data models

* feat: naive parse long strings in source code

* fix: get_non_py_files instead of get_non_code_files

* fix: limit recursion, add comment

* handle embedding empty input error (#398)

* feat: robustly handle CodeFile source code

* refactor: sort imports

* todo: add support for other embedding models

* feat: add custom logger

* feat: add robustness to get_source_code_chunks

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat: improve embedding exceptions

* refactor: format indents, rename module

---------

Co-authored-by: alekszievr <44192193+alekszievr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2024-12-26 13:53:38 +01:00
alekszievr
de2394c392
Ingest non-code files (#395)
* Ingest non-code files

* Fixing review findings
2024-12-20 14:06:40 +01:00
lxobr
da5e3ab24d
COG 870 Remove duplicate edges from the code graph (#293)
* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 12:02:25 +01:00
hajdul88
68c3f42ab8
Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases 2024-12-05 09:08:37 +01:00
Rita Aleksziev
dd94781033 Integrate graphiti's functionality as Tasks 2024-12-04 16:33:26 +01:00
hajdul88
46ee513f6c chore: deletes comment from dynamic_steps_example 2024-12-04 14:59:01 +01:00
hajdul88
59f8ec665f Merge remote-tracking branch 'origin/main' into feature/cog-537-implement-retrieval-algorithm-from-research-paper 2024-11-26 16:38:32 +01:00
hajdul88
db07179856 chore: Adds error handling to brute force triplet search 2024-11-26 16:17:57 +01:00
hajdul88
c66c43e717 chore: places retrievers under modules directory 2024-11-26 15:44:11 +01:00
hajdul88
a59517409c chore: Fixes some of the issues based on PR review + restructures things 2024-11-26 14:45:48 +01:00
Vasilije
9d6081c7f7
feat: Add support for multiple audio and image formats (#12)
Added support for multiple audio and image formats with example

The added formats are the extensions that the filetype library can
return for audio and image files

Feature COG-507
2024-11-23 16:31:55 +01:00
hande-k
157d7d217d docs: added cognify steps in the print statement and commented example output 2024-11-21 13:57:42 +01:00
hajdul88
b5d9e7a6d2 chore: adds return value and sets the entry point kg generation to true 2024-11-20 19:03:32 +01:00
hajdul88
a114d68aef feat: Implements basic global triplet optimizing retrieval 2024-11-20 18:33:34 +01:00
Igor Ilic
57783a979a feat: Add support for multiple audio and image formats
Added support for multiple audio and image formats with example

Feature COG-507
2024-11-20 14:03:14 +01:00
hande-k
c6e447f28c docs: add print statements to the simple example, update README 2024-11-20 08:47:02 +01:00
hajdul88
c4850f64dc feat: Implements pipeline structure for retrievers 2024-11-19 11:14:42 +01:00
Rita Aleksziev
07b1956b6e Fix syntax in simple example 2024-11-19 09:55:21 +01:00
Boris
c045f737f7
feat: add vector and graph dbs state to README file (#235) 2024-11-18 17:51:41 +01:00
Igor Ilic
d30adb53f3
Cog 337 llama index support (#186)
* feat: Add support for LlamaIndex Document type

Added support for LlamaIndex Document type

Feature #COG-337

* docs: Add Jupyter Notebook for cognee with llama index document type

Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage

Docs #COG-337

* feat: Add metadata migration from LlamaIndex document type

Allow usage of metadata from LlamaIndex documents

Feature #COG-337

* refactor: Change llama index migration function name

Change name of llama index function

Refactor #COG-337

* chore: Add llama index core dependency

Downgrade needed on tenacity and instructor modules to support llama index

Chore #COG-337

* Feature: Add ingest_data_with_metadata task

Added task that will have access to metadata if data is provided from different data ingestion tools

Feature #COG-337

* docs: Add description on why specific type checking is done

Explained why specific type checking is used instead of isinstance, as isinstance returns True for child classes as well

Docs #COG-337

* fix: Add missing parameter to function call

Added missing parameter to function call

Fix #COG-337

* refactor: Move storing of data from async to sync function

Moved data storing from async to sync

Refactor #COG-337

* refactor: Pretend ingest_data was changed instead of having two tasks

Refactor so ingest_data file was modified instead of having two ingest tasks

Refactor #COG-337

* refactor: Use old name for data ingestion with metadata

Merged new and old data ingestion tasks into one

Refactor #COG-337

* refactor: Return ingest_data and save_data_to_storage Tasks

Returned ingest_data and save_data_to_storage tasks

Refactor #COG-337

* refactor: Return previous ingestion Tasks to add function

Returned previous ingestion tasks to add function

Refactor #COG-337

* fix: Remove dict and use string for search query

Remove dictionary and use string for query in notebook and simple example

Fix COG-337

* refactor: Add changes requested in pull request

Added the following changes that were requested in pull request:

Added synchronize label,
Made uniform syntax in if statement in workflow,
fixed instructor dependency,
added llama-index to be optional

Refactor COG-337

* fix: Resolve issue with llama-index being mandatory

Resolve issue with llama-index being mandatory to run cognee

Fix COG-337

* fix: Add install of llama-index to notebook

Removed additional references to llama-index from core cognee lib.
Added llama-index-core install from notebook

Fix COG-337

---------
2024-11-17 11:47:08 +01:00
hajdul88
32504255ef feat: Adds unit tests to CogneeGraph class 2024-11-14 11:46:17 +01:00
hajdul88
38d29ee0c9 Adds an entrypoint to enable/disable individual steps 2024-11-11 18:35:18 +01:00
lxobr
17d4aca538 feat: add simple python example 2024-11-05 10:05:14 +01:00