Commit graph

152 commits

Author SHA1 Message Date
Boris Arzentar
27416afed0 fix: lancedb batch merge 2024-12-03 21:13:50 +01:00
Boris Arzentar
e07364fc25 Merge remote-tracking branch 'origin/main' into code-graph 2024-12-03 12:44:57 +01:00
Igor Ilic
04960eeb4e Merge branch 'main' of github.com:topoteretes/cognee-private into COG-502-backend-error-handling 2024-12-02 13:12:20 +01:00
Boris Arzentar
11acabdb6a fix: remove duplicate nodes and edges before saving; Fix FalkorDB vector index; 2024-12-02 10:10:18 +01:00
hajdul88
198f71b9be
feat: Implements multiprocessing for get_repo_file_dependencies task (#43) 2024-12-01 11:51:04 +01:00
Igor Ilic
6b97e95e14 refactor: Split entity related exceptions into graph and database exceptions
Move and split database entity related exceptions into graph and database exceptions

Refactor COG-502
2024-11-29 17:40:48 +01:00
Igor Ilic
eb09e5ad89 refactor: Moved ingestion exceptions to ingestion module
Moved custom ingestion exceptions to ingestion module

Refactor COG-502
2024-11-29 17:15:54 +01:00
Leon Luithlen
bc82430fb5 Merge latest COG-519 2024-11-29 14:36:03 +01:00
Igor Ilic
56367cb0c3 feat: Add Dlt support for Sqlite
Added support for using sqlite with dlt

Feature COG-678
2024-11-28 16:50:30 +01:00
Igor Ilic
9bd3011264 feat: Make relational databases work as singleton
Moved dlt pipeline to run in its own function so it doesn't use get_relational_database.
Dlt has its own async event loop, and objects can't be shared between event loops

Feature COG-678
2024-11-28 12:59:04 +01:00
Leon Luithlen
d4e77636b5 Revert spaces around args 2024-11-28 09:18:49 +01:00
Leon Luithlen
15802237e9 Get metadata from metadata table 2024-11-28 09:18:49 +01:00
Leon Luithlen
cd0e505ac0 WIP 2024-11-28 09:18:49 +01:00
Leon Luithlen
1679c746a3 Move class and functions to data.models 2024-11-28 09:18:49 +01:00
Leon Luithlen
9e93ea0794 Make save_data_item_with_metadata_to_storage async 2024-11-28 09:18:49 +01:00
Leon Luithlen
5b5c1ea5c6 Fix module import error 2024-11-28 09:18:49 +01:00
Leon Luithlen
7324564655 Add metadata_id attribute to Document and DocumentChunk, make ingest_with_metadata default 2024-11-28 09:18:49 +01:00
Leon Luithlen
fd987ed61e Add autoformatting 2024-11-28 09:18:49 +01:00
Leon Luithlen
c5f3314c85 Add Metadata table and read write delete functions 2024-11-28 09:18:49 +01:00
Boris Arzentar
2408fd7a01 fix: falkordb adapter errors 2024-11-28 09:12:37 +01:00
Boris
6403d15a76
fix: enable falkordb and add test for it (#31) 2024-11-27 22:55:30 +01:00
Igor Ilic
6eecc39db0 feat: Add custom exceptions to more cognee-lib modules
Added custom exceptions to more modules

Feature COG-502
2024-11-27 14:53:09 +01:00
Igor Ilic
ae568409a7 feat: Add custom exceptions to cognee lib
Added use of custom exceptions to cognee lib
2024-11-27 14:29:33 +01:00
Boris
64b8aac86f
feat: code graph swe integration
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: hande-k <handekafkas7@gmail.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2024-11-27 09:32:29 +01:00
0xideas
0fb47ba23d
feat: COG-548-create-code-graph-to-kg-task (#7)
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2024-11-24 20:50:32 +01:00
0xideas
80b06c3acb
test: Test for code graph enrichment task
Co-authored-by: lxobr <lazar@topoteretes.com>
2024-11-24 19:24:47 +01:00
Vasilije
9d6081c7f7
feat: Add support for multiple audio and image formats (#12)
Added support for multiple audio and image formats with example

The added formats are the extension values that the filetype library
can return for audio and images

Feature COG-507
2024-11-23 16:31:55 +01:00
lxobr
7ec5cffd8e
feat: Cog-693 expand dependency graph
Expand each file node into a subgraph containing high-level code parts

- Implemented `extract_code_parts` to parse and extract high-level
components (classes, functions, imports, and top-level code) from Python
source files using `parso`.
- Developed `expand_dependency_graph` to expand Python file nodes into
their components.
- Included a checker script
2024-11-23 14:02:21 +01:00
lxobr
a8aefd57ef
COG-546 get_local_script_dependencies (#6)
A utility function, `get_local_script_dependencies`:

- Extracts and resolves local dependencies of a Python script using
`jedi` and `parso`.
- Returns a sorted list of unique module paths
- Optionally filters out dependencies outside a specified repository
path
- Includes an example/checker in `cognee/tasks/code`.

Will be used for creating a graph from a repo.
2024-11-20 16:36:03 +01:00
Igor Ilic
15b7b8ef2b fix: Resolve issue with table names in SQL commands
Some SQL commands require lowercase characters in table names unless the table name is wrapped in quotes. Renamed all new tables to use lowercase.

Fix COG-677
2024-11-20 14:54:35 +01:00
Igor Ilic
57783a979a feat: Add support for multiple audio and image formats
Added support for multiple audio and image formats with example

Feature COG-507
2024-11-20 14:03:14 +01:00
lxobr
f27dc0c91a fix: Rename, extract checker into a separate script 2024-11-20 12:28:10 +01:00
lxobr
263ecb9149 fix: Add input validation and error handling for paths 2024-11-20 12:28:10 +01:00
lxobr
8bc26bba97 fix: Add error handling for path conversion 2024-11-20 12:28:10 +01:00
lxobr
ebb811af87 fix: Filter out None values in module paths 2024-11-20 12:28:10 +01:00
lxobr
2417d18607 fix: Add logging instead of print 2024-11-20 12:28:10 +01:00
lxobr
1a1452e177 fix: Add error handling for Jedi analysis, with debug mode 2024-11-20 12:28:10 +01:00
lxobr
3aadda9a89 feat: Add argparse for testing purposes 2024-11-20 12:28:10 +01:00
lxobr
4bf2281cd5 feat: Enable async processing 2024-11-20 12:28:10 +01:00
lxobr
742792b6c1 refactor: Remove a comment 2024-11-20 12:28:10 +01:00
lxobr
2be2b802c0 feat: Safely handle file read errors 2024-11-20 12:28:10 +01:00
lxobr
e148d32c14 refactor: Modify sys.path in context manager 2024-11-20 12:28:10 +01:00
lxobr
ba83d71269 feat: extract script dependencies 2024-11-20 12:28:10 +01:00
lxobr
26e2dc852d feat: new repo-to-graph task 2024-11-20 12:28:10 +01:00
Boris
22a0e43d4a
Merge branch 'main' into COG-417-chunking-unit-tests 2024-11-17 13:40:32 +01:00
Igor Ilic
d30adb53f3
Cog 337 llama index support (#186)
* feat: Add support for LlamaIndex Document type

Added support for LlamaIndex Document type

Feature #COG-337

* docs: Add Jupyter Notebook for cognee with llama index document type

Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage

Docs #COG-337

* feat: Add metadata migration from LlamaIndex document type

Allow usage of metadata from LlamaIndex documents

Feature #COG-337

* refactor: Change llama index migration function name

Change name of llama index function

Refactor #COG-337

* chore: Add llama index core dependency

Downgrade needed on tenacity and instructor modules to support llama index

Chore #COG-337

* Feature: Add ingest_data_with_metadata task

Added task that will have access to metadata if data is provided from different data ingestion tools

Feature #COG-337

* docs: Add description on why specific type checking is done

Explained why specific type checking is used instead of isinstance, as isinstance returns True for child classes as well

Docs #COG-337

* fix: Add missing parameter to function call

Added missing parameter to function call

Fix #COG-337

* refactor: Move storing of data from async to sync function

Moved data storing from async to sync

Refactor #COG-337

* refactor: Pretend ingest_data was changed instead of having two tasks

Refactor so ingest_data file was modified instead of having two ingest tasks

Refactor #COG-337

* refactor: Use old name for data ingestion with metadata

Merged new and old data ingestion tasks into one

Refactor #COG-337

* refactor: Return ingest_data and save_data_to_storage Tasks

Returned ingest_data and save_data_to_storage tasks

Refactor #COG-337

* refactor: Return previous ingestion Tasks to add function

Returned previous ingestion tasks to add function

Refactor #COG-337

* fix: Remove dict and use string for search query

Remove dictionary and use string for query in notebook and simple example

Fix COG-337

* refactor: Add changes requested in pull request

Added the following changes that were requested in pull request:

Added synchronize label,
Made uniform syntax in if statement in workflow,
fixed instructor dependency,
added llama-index to be optional

Refactor COG-337

* fix: Resolve issue with llama-index being mandatory

Resolve issue with llama-index being mandatory to run cognee

Fix COG-337

* fix: Add install of llama-index to notebook

Removed additional references to llama-index from core cognee lib.
Added llama-index-core install from notebook

Fix COG-337

2024-11-17 11:47:08 +01:00
Leon Luithlen
e40e7386a0 Refactor word_type yielding in chunk_by_sentence 2024-11-14 17:16:04 +01:00
Leon Luithlen
84c98f16bb Remove chunk_index attribute from chunk_by_sentence return value 2024-11-14 16:49:13 +01:00
Leon Luithlen
15420dd864 Fix paragraph_ids handling 2024-11-14 16:47:51 +01:00
Leon Luithlen
d6a6a9eaba Return sentence_cut instead of word in chunk_by_paragraph 2024-11-14 15:03:09 +01:00