Commit graph

665 commits

Author SHA1 Message Date
lxobr
da5e3ab24d
COG 870 Remove duplicate edges from the code graph (#293)
* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 12:02:25 +01:00
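As an illustration of what removing duplicate edges can look like, here is a minimal sketch. It is not taken from the commit itself: the `Edge` tuple shape and the `deduplicate_edges` helper are hypothetical names chosen for this example.

```python
from typing import Hashable, Iterable, List, Tuple

# Hypothetical edge shape for illustration: (source_id, target_id, relationship_name).
Edge = Tuple[Hashable, Hashable, str]


def deduplicate_edges(edges: Iterable[Edge]) -> List[Edge]:
    """Return edges with exact duplicates removed, keeping first-seen order."""
    seen = set()
    unique: List[Edge] = []
    for edge in edges:
        if edge not in seen:
            seen.add(edge)
            unique.append(edge)
    return unique
```

Keeping the first occurrence and preserving order makes the result deterministic, which helps when the edge list is compared in tests.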
hajdul88
9e7ab6492a
feat: outsources chunking parameters to extract chunk from documents … (#289)
* feat: outsources chunking parameters to extract chunk from documents task
2024-12-17 11:31:31 +01:00
alekszievr
bfa0f06fb4
Add type to DataPoint metadata (#364)
* Add type to DataPoint metadata

* Add missing index_fields

* Use DataPoint UUID type in pgvector create_data_points

* Make _metadata mandatory everywhere
2024-12-16 16:27:03 +01:00
lxobr
5360093097
COG-810 Implement a top-down dependency graph builder tool (#268)
* feat: parse repo to call graph

* Update/repo_processor/top_down_repo_parse.py task

* fix: minor improvements

* feat: file parsing jedi script optimisation

---------
2024-12-16 16:02:39 +01:00
Igor Ilic
34b139af26 Revert "fix: Add metadata reflection fix to sqlite as well"
This reverts commit 394a0b2dfb.
2024-12-16 13:19:21 +01:00
Igor Ilic
394a0b2dfb fix: Add metadata reflection fix to sqlite as well
Added fix when reflecting metadata to sqlite as well

Fix
2024-12-16 11:26:33 +01:00
Igor Ilic
d9e558e885 fix: Resolve reflection issue when running cognee a second time after pruning data
When running cognee a second time after pruning data, some metadata doesn't get pruned.
This makes cognee believe that some deleted tables still exist.

Fix
2024-12-16 11:02:50 +01:00
Vasilije
fb1c223982
Merge pull request #369 from topoteretes/feat/pass_pydantic_model_to_cognify
pass pydantic model to cognify
2024-12-13 19:46:05 +01:00
Igor Ilic
35b1f7d26a chore: Update typo in code
Update typo in string in code

Chore COG-656
2024-12-13 17:08:05 +01:00
Igor Ilic
924759a599 refactor: Rename query compute to query completion
Rename searching type from compute to completion

Refactor COG-656
2024-12-13 17:03:38 +01:00
Boris
d437135684
Merge branch 'dev' into feat/pass_pydantic_model_to_cognify 2024-12-13 16:57:59 +01:00
Igor Ilic
11634cb58d feat: Add unauth access error to getting data
Raise unauth access error when trying to read data without access

Feature COG-656
2024-12-13 16:54:53 +01:00
Rita Aleksziev
1c9fe01f64 pass pydantic model to cognify 2024-12-13 16:34:48 +01:00
Igor Ilic
67585d0ab1 feat: Add simple instruction for system prompt
Add simple instruction for system prompt

Feature COG-656
2024-12-13 15:30:24 +01:00
Igor Ilic
9c3e2422f3 feat: Add compute search to cognee
Add compute search to cognee which makes searches human readable

Feature COG-656
2024-12-13 15:18:33 +01:00
Igor Ilic
43187e4d63 feat: Add user verification for accessing data
Verify user has access to data before returning it

Feature COG-656
2024-12-13 13:54:45 +01:00
Igor Ilic
1180839469 feat: Add error handling in case user is already part of database and permission already given to group
Added error handling in case permission is already given to group and user is already part of group

Feature COG-656
2024-12-13 12:49:57 +01:00
Igor Ilic
b8ba436dba fix: Resolve issue with adding permissions to groups
Resolve issue with adding permissions to groups

Fix COG-656
2024-12-13 12:37:01 +01:00
Igor Ilic
eddfc17861 fix: Rewrite endpoint to add users to groups
Rewrote endpoint which adds users to groups

Fix COG-656
2024-12-13 12:13:42 +01:00
Igor Ilic
92d0122b46 fix: Remove data handling based on type in resolving directory function
There is no need to handle different data types when resolving directories; focus only on the case where the input is a directory

Fix COG-656
2024-12-13 09:55:47 +01:00
Igor Ilic
7100a4994a feat: Add resolving of directories as task for the add pipeline
Add resolving of directories as task for the add pipeline

Feature COG-656
2024-12-12 17:04:49 +01:00
Igor Ilic
3a1229c357 Merge branch 'fix-pgvector-search' of github.com:topoteretes/cognee into COG-656-deployment-state 2024-12-12 13:56:47 +01:00
Igor Ilic
599e1d478b fix: Resolve issue regarding not having Vector column type defined when using vector search
The issue occurs when search is called in a session without first adding data or creating tables, because the import of the Vector column type was missing

Fix
2024-12-12 13:37:18 +01:00
Igor Ilic
9b4af85474 fix: Resolve issue with text being submitted as data
Add support for text data to resolving data directory task

Fix COG-656
2024-12-12 13:31:20 +01:00
Igor Ilic
92ecd8a024
fix: Resolve issue with UUID being concatenated instead of string (#358)
Resolve issue regarding UUID being concatenated instead of string
2024-12-12 11:02:03 +01:00
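A minimal sketch of the class of bug this commit describes, not the project's actual code: in Python, concatenating a `uuid.UUID` with a `str` raises `TypeError`, so the UUID has to be cast with `str()` first. The helper names `make_edge_key` and `concat_without_cast`, and the key format, are hypothetical.

```python
import uuid


def make_edge_key(source_id: uuid.UUID, target_id: uuid.UUID, relationship: str) -> str:
    """Build a string key for an edge (hypothetical helper, for illustration only)."""
    # Cast each UUID to str before joining; UUID objects do not support "+".
    return str(source_id) + "_" + relationship + "_" + str(target_id)


def concat_without_cast(source_id: uuid.UUID, relationship: str) -> str:
    """Reproduce the failing pattern: a UUID used directly in string concatenation."""
    return source_id + "_" + relationship  # raises TypeError
```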
Igor Ilic
ff9fd90cf1 feat: Add directory resolution as step in cognee add function
Added directory resolution as step in cognee add function

Feature COG-656
2024-12-11 17:33:51 +01:00
Igor Ilic
d9d90d91ae chore: Remove comments from code
Remove code comments that are not needed

Chore COG-656
2024-12-11 16:49:34 +01:00
Igor Ilic
d4e2eb717a fix: fix existing edge check
Resolve issue with UUID concat by casting to string

Fix COG-656
2024-12-11 16:04:31 +01:00
Igor Ilic
f3ce7be885 feat: Add ability to send directories with data to cognee
Add ability to send data directories to cognee

Feature COG-656
2024-12-11 14:31:54 +01:00
hajdul88
6d85165189
Feature/cog 539 implementing additional retriever approaches (#262)
* fix: refactor get_graph_from_model to return nodes and edges correctly

* fix: add missing params

* fix: remove complex zip usage

* fix: add edges to data_point properties

* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector

* fix: fixes database pruning issue in pgvector (#261)

* feat: adds code summary embeddings to vector DB

* fix: cognee_demo notebook pipeline is not saving summaries

* feat: implements first version of codegraph retriever

* chore: implements minor changes mostly to make the code production ready

* fix: turns off raising duplicated edges unit test as we have these in our current codegraph generation

* feat: implements unit tests for description to codepart search

* fix: fixes edge property inconsistent access in codepart retriever

* chore: implements more precise typing for get_attribute method for cogneegraph

* chore: adds spacing to tests and changes the cogneegraph getter names

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2024-12-10 11:07:06 +01:00
Igor Ilic
acf5952b31 test: Update typo in unstructured test
Update typo for file name in test

Test COG-685
2024-12-09 16:37:46 +01:00
Igor Ilic
e0a3563249 Merge branch 'COG-685-more-document-types' of github.com:topoteretes/cognee into COG-685-more-document-types 2024-12-09 15:21:26 +01:00
Igor Ilic
d7d559f4f7 test: Add tests for different document types
Add tests for unstructured reading for different document types

Test COG-685
2024-12-09 15:20:50 +01:00
Igor Ilic
344865f1a4
Merge branch 'main' into COG-685-more-document-types 2024-12-09 10:22:26 +01:00
Igor Ilic
df289deb18 chore: Update dependencies to handle different document types
Update unstructured so it would install support for different document types

Chore COG-685
2024-12-09 09:49:26 +01:00
Igor Ilic
596b3edf72 test: Add test for Unstructured pptx document type
Added pptx example file and tested Unstructured pptx document type handling

Test COG-685
2024-12-08 15:18:42 +01:00
Igor Ilic
07d9330e4a feat: Add UnstructuredLibraryImportError
Added exception when unstructured library is called but not installed

Feature COG-685
2024-12-08 14:53:19 +01:00
Igor Ilic
62db3f8598 feat: Remove the need for libmagic for unstructured documents
Remove the need for libmagic for unstructured documents by providing mime_type information

Feature COG-685
2024-12-08 14:37:50 +01:00
Igor Ilic
78214456a6 feat: Add unstructured document handler
Added unstructured library and handling of certain document types through their library

Feature COG-685
2024-12-06 17:50:22 +01:00
alekszievr
f30bf35f92
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 16:11:56 +01:00
alekszievr
e6def6423c
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 13:58:38 +01:00
Igor Ilic
d7fa9f3cfd Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes 2024-12-06 13:49:07 +01:00
Igor Ilic
cc6fbe2a5f refactor: Add space to ingest function
Add space and newline to ingest function

Refactor COG-505
2024-12-06 13:48:39 +01:00
Rita Aleksziev
462fcef240 move config getter into cognee/modules/pipelines/operations/run_tasks.py and make the indentation a bit more readable 2024-12-06 13:38:54 +01:00
Rita Aleksziev
dbfa91b635 Add cognee config to telemetry 2024-12-06 12:55:25 +01:00
Boris
9429e5e1f5
Merge branch 'main' into COG-505-data-dataset-model-changes 2024-12-06 12:53:32 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector (#261)

* fix: cognee_demo notebook pipeline is not saving summaries

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Igor Ilic
1e098ae70d refactor: Add error handling to hash util
Added error handling to reading of file in hash util

Refactor COG-505
2024-12-05 20:54:55 +01:00
Igor Ilic
e80377b729 refactor: Move hash calculation of file to util
Moved hash calculation of file to shared utils, added better typing

Refactor COG-505
2024-12-05 20:33:30 +01:00
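The two hash-util commits above (e80377b729 and 1e098ae70d) describe a shared, typed file-hash helper with error handling around the file read. The sketch below shows one way such a helper could look. The function name `get_file_content_hash`, the choice of SHA-256, and the chunk size are all assumptions for illustration, not the repository's implementation.

```python
import hashlib
from typing import BinaryIO, Union

CHUNK_SIZE = 8192  # bytes read per iteration (assumed value)


def get_file_content_hash(file_obj: Union[str, BinaryIO]) -> str:
    """Return the hex digest of a file given as a path or an open binary stream."""
    hasher = hashlib.sha256()  # algorithm is an assumption for this sketch
    try:
        if isinstance(file_obj, str):
            # A path was passed: open the file ourselves and read it in chunks.
            with open(file_obj, "rb") as stream:
                for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
                    hasher.update(chunk)
        else:
            # An already-open binary stream was passed: read it in chunks.
            for chunk in iter(lambda: file_obj.read(CHUNK_SIZE), b""):
                hasher.update(chunk)
    except OSError as error:
        # Re-raise with context so callers can tell which input failed.
        raise OSError(f"Failed to read file for hashing: {file_obj!r}") from error
    return hasher.hexdigest()
```

Reading in chunks keeps memory use flat for large files, and accepting either a path or a stream lets one helper serve both call styles.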
Igor Ilic
9ba5d49e69 test: Fix test for multimedia deduplication
Add missing function to get data from database to multimedia deduplication test

Test COG-505
2024-12-05 20:09:29 +01:00