Commit graph

665 commits

Author SHA1 Message Date
lxobr
8bc26bba97 fix: Add error handling for path conversion 2024-11-20 12:28:10 +01:00
lxobr
ebb811af87 fix: Filter out None values in module paths 2024-11-20 12:28:10 +01:00
lxobr
2417d18607 fix: Add logging instead of print 2024-11-20 12:28:10 +01:00
lxobr
1a1452e177 fix: Add error handling for Jedi analysis, with debug mode 2024-11-20 12:28:10 +01:00
lxobr
3aadda9a89 feat: Add argparse for testing purposes 2024-11-20 12:28:10 +01:00
lxobr
4bf2281cd5 feat: Enable async processing 2024-11-20 12:28:10 +01:00
lxobr
742792b6c1 refactor: Remove a comment 2024-11-20 12:28:10 +01:00
lxobr
2be2b802c0 feat: Safely handle file read errors 2024-11-20 12:28:10 +01:00
lxobr
e148d32c14 refactor: Modify sys.path in context manager 2024-11-20 12:28:10 +01:00
lxobr
ba83d71269 feat: extract script dependencies 2024-11-20 12:28:10 +01:00
lxobr
26e2dc852d feat: new repo-to-graph task 2024-11-20 12:28:10 +01:00
hajdul88
d9eec77f18 feat: Implements first step of the two step retrieval 2024-11-19 16:40:27 +01:00
hajdul88
44ac9b68b4 feat: adds get_distances from collection method to LanceDB and PgVector 2024-11-19 16:39:45 +01:00
Boris
ab1328d898 Merge branch 'main' into COG-533-pydantic-unit-tests 2024-11-19 15:39:31 +01:00
Igor Ilic
4b55354dce fix: Resolve issue with pgvector timeout (#3)
By creating PGVector as a singleton, all issues regarding timeout are
resolved, as there are no more parallel instances trying to communicate
with the database
2024-11-19 15:31:26 +01:00
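The singleton approach described in the commit above can be sketched roughly as follows. This is a minimal illustration only, not the code from the commit: `VectorAdapter`, `get_vector_adapter`, and the connection string are hypothetical names chosen for the example, and no real database connection is opened.

```python
import threading


class VectorAdapter:
    """Hypothetical stand-in for a pgvector adapter (not cognee's actual class)."""

    def __init__(self, connection_string):
        self.connection_string = connection_string


_adapter = None
_adapter_lock = threading.Lock()


def get_vector_adapter(connection_string):
    """Return the one shared adapter, creating it on the first call only."""
    global _adapter
    with _adapter_lock:
        if _adapter is None:
            # Assumption: one process-wide instance is enough for all callers.
            _adapter = VectorAdapter(connection_string)
    return _adapter


first = get_vector_adapter("postgresql://localhost/example")
second = get_vector_adapter("postgresql://localhost/other")
print(first is second)  # True
```

Every caller receives the same object, so later calls reuse the first configuration and no parallel instances compete for the database.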
Boris
5f144a0f92 fix: make all checks green (#1) 2024-11-19 15:30:09 +01:00
Rita Aleksziev
2948089806 Read patch generation instructions from file 2024-11-19 14:07:53 +01:00
hajdul88
c4850f64dc feat: Implements pipeline structure for retrievers 2024-11-19 11:14:42 +01:00
Leon Luithlen
b18f748c9e Merge dicts directly 2024-11-19 10:56:21 +01:00
Boris
64bc425330 Merge branch 'main' into fix/rename-remaining-query-to-query-text-kwargs 2024-11-18 17:39:23 +01:00
0xideas
34e140a41d Switch to gpt-4o-mini by default (#233)
* Switch to gpt-4o-mini by default

* Add option and make gpt-4o-mini default in frontend

* Run llama index notebook without extra arguments in poetry install

* Install extras for llama_index_notebook run
2024-11-18 17:38:54 +01:00
Leon Luithlen
7a2fc617a8 Rename remaining 'query' keyword args in cognee.search to 'query_text' 2024-11-18 14:00:14 +01:00
Leon Luithlen
fde56f0c3b Merge branch 'main' into COG-533-pydantic-unit-tests 2024-11-18 11:24:51 +01:00
Leon Luithlen
103eb13c77 Skip recursive pydantic tests for Python 3.9 and 3.10 2024-11-18 11:23:22 +01:00
Boris
22a0e43d4a Merge branch 'main' into COG-417-chunking-unit-tests 2024-11-17 13:40:32 +01:00
Boris
d8b6eeded5 feat: log search queries and results (#166)
* feat: log search queries and results

* fix: address coderabbit review comments

* fix: parse UUID when logging search results

* fix: remove custom UUID type and use DB agnostic UUID from sqlalchemy

* Add new cognee_db

---------

Co-authored-by: Leon Luithlen <leon@topoteretes.com>
2024-11-17 11:59:10 +01:00
Igor Ilic
d30adb53f3 Cog 337 llama index support (#186)
* feat: Add support for LlamaIndex Document type

Added support for LlamaIndex Document type

Feature #COG-337

* docs: Add Jupyter Notebook for cognee with llama index document type

Added Jupyter notebook which demonstrates cognee with LlamaIndex document type usage

Docs #COG-337

* feat: Add metadata migration from LlamaIndex document type

Allow usage of metadata from LlamaIndex documents

Feature #COG-337

* refactor: Change llama index migration function name

Change name of llama index function

Refactor #COG-337

* chore: Add llama index core dependency

Downgrade needed on tenacity and instructor modules to support llama index

Chore #COG-337

* Feature: Add ingest_data_with_metadata task

Added task that will have access to metadata if data is provided from different data ingestion tools

Feature #COG-337

* docs: Add description on why specific type checking is done

Explained why specific type checking is used instead of isinstance, as isinstance returns True for child classes as well

Docs #COG-337

* fix: Add missing parameter to function call

Added missing parameter to function call

Fix #COG-337

* refactor: Move storing of data from async to sync function

Moved data storing from async to sync

Refactor #COG-337

* refactor: Pretend ingest_data was changed instead of having two tasks

Refactor so ingest_data file was modified instead of having two ingest tasks

Refactor #COG-337

* refactor: Use old name for data ingestion with metadata

Merged new and old data ingestion tasks into one

Refactor #COG-337

* refactor: Return ingest_data and save_data_to_storage Tasks

Returned ingest_data and save_data_to_storage tasks

Refactor #COG-337

* refactor: Return previous ingestion Tasks to add function

Returned previous ingestion tasks to add function

Refactor #COG-337

* fix: Remove dict and use string for search query

Remove dictionary and use string for query in notebook and simple example

Fix COG-337

* refactor: Add changes requested in pull request

Added the following changes that were requested in pull request:

Added synchronize label,
Made uniform syntax in if statement in workflow,
fixed instructor dependency,
added llama-index to be optional

Refactor COG-337

* fix: Resolve issue with llama-index being mandatory

Resolve issue with llama-index being mandatory to run cognee

Fix COG-337

* fix: Add install of llama-index to notebook

Removed additional references to llama-index from core cognee lib.
Added llama-index-core install from notebook

Fix COG-337

---------
2024-11-17 11:47:08 +01:00
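One note in the commit above says an exact type check is used instead of isinstance, because isinstance also returns True for child classes. A minimal sketch of that difference, using hypothetical classes (`Document`, `ChildDocument`) that are not taken from cognee or LlamaIndex:

```python
class Document:
    """Hypothetical base type used only for this illustration."""


class ChildDocument(Document):
    """Hypothetical subclass of Document."""


def is_exact_document(item):
    # type(...) is compares the concrete class, so subclasses do not match.
    return type(item) is Document


parent = Document()
child = ChildDocument()

print(isinstance(child, Document))  # True: isinstance accepts subclasses
print(is_exact_document(child))     # False: exact check rejects subclasses
print(is_exact_document(parent))    # True
```

The exact check is the stricter of the two, which matters when a subclass needs different handling from its parent.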
Vasilije
d1e9870972 Merge branch 'main' into COG-597-refactor-analytics 2024-11-16 13:49:30 +01:00
Leon Luithlen
8a2cf2075a Add model_rebuild 2024-11-15 17:57:03 +01:00
Leon Luithlen
a3342918d9 Apply cosmetic changes and autoformat 2024-11-15 16:53:32 +01:00
Leon Luithlen
a1f72727bc Revert model_rebuild order 2024-11-15 16:17:33 +01:00
Leon Luithlen
f3f0bca9bd Revert making Person attributes optional 2024-11-15 16:03:53 +01:00
Leon Luithlen
370b59b39a Add get_graph_from_model_generative_test 2024-11-15 15:58:03 +01:00
Leon Luithlen
5a464bfca7 Refactor get_model_instance_from_graph 2024-11-15 15:57:50 +01:00
Igor Ilic
2703215dec refactor: Add user_id to event properties
Adding user_id to event properties allows tracking of which user started the event

Refactor COG-597
2024-11-15 15:20:41 +01:00
Leon Luithlen
afae70f3b5 Add get_graph_from_model_generative_test 2024-11-15 15:10:42 +01:00
Igor Ilic
d90f5fe7c1 feat: Add proxy for analytics
Added proxy usage with vercel hosting for telemetry and analytics

Feature COG-597
2024-11-15 15:05:46 +01:00
Leon Luithlen
3c8a52f4b0 Fix inconsistent state between nodes and added_nodes and edges and added_edges 2024-11-15 14:47:36 +01:00
hajdul88
1df12c1259 fix: Fixes processing false Class keyword issue 2024-11-15 14:47:13 +01:00
Leon Luithlen
a5860700a7 Remove include_root parameter 2024-11-15 14:00:59 +01:00
Leon Luithlen
05ea357520 Refactor get_graph_from_model 2024-11-15 13:43:13 +01:00
Leon Luithlen
2c0fce32d3 WIP get_graph_from_model 2024-11-15 13:38:33 +01:00
Leon Luithlen
7be613e2fc WIP nested pydantic structures 2024-11-15 11:57:26 +01:00
Leon Luithlen
0ea011ccd7 Adapt graph interfaces tests to debugged get_graph_from_model 2024-11-15 10:27:27 +01:00
Leon Luithlen
628f192b8d Remove added_nodes and added_edges default dicts 2024-11-15 10:27:06 +01:00
Leon Luithlen
f51a44fd76 Remove unneeded document.read in AudioDocument_test 2024-11-14 17:18:36 +01:00
Leon Luithlen
e40e7386a0 Refactor word_type yielding in chunk_by_sentence 2024-11-14 17:16:04 +01:00
Leon Luithlen
14dd60576e Fix indexing in tests in chunk_by_sentence_test 2024-11-14 17:06:16 +01:00
Leon Luithlen
928e1075c6 Test chunk_by_paragraph chunk numbering 2024-11-14 16:55:24 +01:00
Leon Luithlen
84c98f16bb Remove chunk_index attribute from chunk_by_sentence return value 2024-11-14 16:49:13 +01:00