vasilije
60c8fd103b
ruff format
2025-01-05 19:09:08 +01:00
Igor Ilic
a4fe33ce92
Merge branch 'dev' into COG-475-local-file-endpoint-deletion
2024-12-20 15:25:10 +01:00
alekszievr
291f1c5a55
Handle retryerrors in code summary (#396)
...
* Handle retryerrors in code summary
* Log instead of print
2024-12-20 15:21:10 +01:00
Igor Ilic
6cb7fef411
Merge branch 'dev' into COG-475-local-file-endpoint-deletion
2024-12-19 17:34:42 +01:00
Igor Ilic
c139d52938
feat: Add deletion of local files made by cognee through data endpoint
...
Delete local files made by cognee when deleting data from database through endpoint
Feature COG-475
2024-12-19 16:35:35 +01:00
hajdul88
4689e55e68
feat: Adds mock summary for codegraph pipeline
2024-12-18 16:42:48 +01:00
Igor Ilic
f6800b979e
feat: Add deletion of local files when deleting data
...
Delete local files when deleting data from cognee
Feature COG-475
2024-12-18 15:26:13 +01:00
Igor Ilic
48825d0d84
chore: Resolve typo in getting documents code
...
Resolve typo in code
chore COG-912
2024-12-17 14:22:51 +01:00
Igor Ilic
8b09358552
Merge branch 'dev' into COG-912-search-by-dataset
2024-12-17 13:22:13 +01:00
alekszievr
9afd0ece63
Structured code summarization (#375)
...
* feat: turn summarize_code into generator
* feat: extract run_code_graph_pipeline, update the pipeline
* feat: minimal code graph example
* refactor: update argument
* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline
* refactor: indentation and whitespace nits
* refactor: add deprecated use comments and warnings
* Structured code summarization
* add missing prompt file
* Remove summarization_model argument from summarize_code and fix typehinting
* minor refactors
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 13:05:47 +01:00
Igor Ilic
af335fafe3
test: Added test for getting documents for search
...
Added test to verify getting documents related to datasets intended for search
Test COG-912
2024-12-17 12:11:24 +01:00
hajdul88
9e7ab6492a
feat: outsources chunking parameters to extract chunk from documents … (#289)
...
* feat: outsources chunking parameters to extract chunk from documents task
2024-12-17 11:31:31 +01:00
Igor Ilic
630ab556db
feat: Add search by dataset for cognee
...
Added ability to search by datasets for cognee users
Feature COG-912
2024-12-17 11:20:22 +01:00
alekszievr
bfa0f06fb4
Add type to DataPoint metadata (#364)
...
* Add type to DataPoint metadata
* Add missing index_fields
* Use DataPoint UUID type in pgvector create_data_points
* Make _metadata mandatory everywhere
2024-12-16 16:27:03 +01:00
Igor Ilic
35b1f7d26a
chore: Update typo in code
...
Update typo in string in code
Chore COG-656
2024-12-13 17:08:05 +01:00
Igor Ilic
11634cb58d
feat: Add unauth access error to getting data
...
Raise unauthorized access error when trying to read data without access
Feature COG-656
2024-12-13 16:54:53 +01:00
Igor Ilic
43187e4d63
feat: Add user verification for accessing data
...
Verify user has access to data before returning it
Feature COG-656
2024-12-13 13:54:45 +01:00
Igor Ilic
b8ba436dba
fix: Resolve issue with adding permissions to groups
...
Resolve issue with adding permissions to groups
Fix COG-656
2024-12-13 12:37:01 +01:00
Igor Ilic
eddfc17861
fix: Rewrite endpoint to add users to groups
...
Rewrote endpoint which adds users to groups
Fix COG-656
2024-12-13 12:13:42 +01:00
Igor Ilic
d4e2eb717a
fix: fix existing edge check
...
Resolve issue with UUID concat by casting to string
Fix COG-656
2024-12-11 16:04:31 +01:00
hajdul88
6d85165189
Feature/cog 539 implementing additional retriever approaches (#262)
...
* fix: refactor get_graph_from_model to return nodes and edges correctly
* fix: add missing params
* fix: remove complex zip usage
* fix: add edges to data_point properties
* fix: handle rate limit error coming from llm model
* fix: fixes lost edges and nodes in get_graph_from_model
* fix: fixes database pruning issue in pgvector
* fix: fixes database pruning issue in pgvector (#261)
* feat: adds code summary embeddings to vector DB
* fix: cognee_demo notebook pipeline is not saving summaries
* feat: implements first version of codegraph retriever
* chore: implements minor changes mostly to make the code production ready
* fix: turns off raising duplicated edges unit test as we have these in our current codegraph generation
* feat: implements unit tests for description to codepart search
* fix: fixes edge property inconsistent access in codepart retriever
* chore: implements more precise typing for get_attribute method for cogneegraph
* chore: adds spacing to tests and changes the cogneegraph getter names
---------
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2024-12-10 11:07:06 +01:00
Igor Ilic
344865f1a4
Merge branch 'main' into COG-685-more-document-types
2024-12-09 10:22:26 +01:00
Igor Ilic
07d9330e4a
feat: Add UnstructuredLibraryImportError
...
Added exception when unstructured library is called but not installed
Feature COG-685
2024-12-08 14:53:19 +01:00
Igor Ilic
62db3f8598
feat: Remove the need for libmagic for unstructured documents
...
Remove the need for libmagic for unstructured documents by providing mime_type information
Feature COG-685
2024-12-08 14:37:50 +01:00
Igor Ilic
78214456a6
feat: Add unstructured document handler
...
Added unstructured library and handling of certain document types through that library
Feature COG-685
2024-12-06 17:50:22 +01:00
alekszievr
f30bf35f92
Merge branch 'main' into feat/COG-418-log-config-to-telemetry
2024-12-06 16:11:56 +01:00
alekszievr
e6def6423c
Merge branch 'main' into feat/COG-418-log-config-to-telemetry
2024-12-06 13:58:38 +01:00
Igor Ilic
d7fa9f3cfd
Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes
2024-12-06 13:49:07 +01:00
Igor Ilic
cc6fbe2a5f
refactor: Add space to ingest function
...
Add space and newline to ingest function
Refactor COG-505
2024-12-06 13:48:39 +01:00
Rita Aleksziev
462fcef240
move config getter into cognee/modules/pipelines/operations/run_tasks.py and make the indentation a bit more readable
2024-12-06 13:38:54 +01:00
Rita Aleksziev
dbfa91b635
Add cognee config to telemetry
2024-12-06 12:55:25 +01:00
Boris
9429e5e1f5
Merge branch 'main' into COG-505-data-dataset-model-changes
2024-12-06 12:53:32 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
...
* fix: handle rate limit error coming from llm model
* fix: fixes lost edges and nodes in get_graph_from_model
* fix: fixes database pruning issue in pgvector (#261)
* fix: cognee_demo notebook pipeline is not saving summaries
---------
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Igor Ilic
349ddfe794
Merge branch 'main' into COG-505-data-dataset-model-changes
2024-12-05 17:10:43 +01:00
Igor Ilic
f5b5e56cc1
feat: Add deduplication of data
...
Data is deduplicated per user: if a user tries to add data that already exists, the request is redirected to the existing data in the database
Feature COG-505
2024-12-05 16:38:44 +01:00
Igor Ilic
0ce254b262
feat: Add text deduplication
...
If text is added to cognee, it is saved by hash, so the same text can't be stored multiple times
Feature COG-505
2024-12-04 17:19:29 +01:00
hajdul88
c20ee11e80
feat: implements graph edge indexing
2024-12-04 15:37:48 +01:00
Igor Ilic
0a0b030df5
fix: Resolve issue when metadata is updated
...
Resolve issue when attempting to update metadata related to data
Fix
2024-12-04 14:03:01 +01:00
Boris Arzentar
4678aaef52
Merge remote-tracking branch 'origin/main'
2024-12-04 11:16:16 +01:00
Boris Arzentar
27416afed0
fix: lancedb batch merge
2024-12-03 21:13:50 +01:00
Boris Arzentar
e07364fc25
Merge remote-tracking branch 'origin/main' into code-graph
2024-12-03 12:44:57 +01:00
hajdul88
6841c83566
fix: fixes cognify duplicated edges and resets the methods to an older version
2024-12-02 20:18:55 +01:00
Igor Ilic
04960eeb4e
Merge branch 'main' of github.com:topoteretes/cognee-private into COG-502-backend-error-handling
2024-12-02 13:12:20 +01:00
Boris Arzentar
76e2b6a639
Merge remote-tracking branch 'origin/main'
2024-12-02 10:15:30 +01:00
Boris Arzentar
11acabdb6a
fix: remove duplicate nodes and edges before saving; Fix FalkorDB vector index;
2024-12-02 10:10:18 +01:00
Boris Arzentar
d6f0d65b63
Merge remote-tracking branch 'origin/code-graph'
2024-12-01 11:51:54 +01:00
Boris Arzentar
e8a1ce531a
Merge remote-tracking branch 'origin/main'
2024-12-01 11:44:07 +01:00
Vasilije
bbaf78f54e
Cog 669 implement dummy llm adapter (#37)
...
Adds the `DummyLLMAdapter(LLMInterface)` class for profiling of
large datasets without actual LLM calls, in the top-level
`profiling/util` location.
I also moved the `show_prompt` implementation from the child classes to
`LLMInterface`, since the implementations were identical.
I expanded the scope to also include a DummyEmbeddingEngine.
2024-11-30 17:02:49 +01:00
Vasilije
4d02560f1c
Cog 519 develop metadata storage integration (#35)
...
@borisarzentar this PR is ready; all checks pass in the
"sister" MR targeting main:
https://github.com/topoteretes/cognee-private/pull/26
2024-11-30 17:02:18 +01:00
Igor Ilic
6b97e95e14
refactor: Split entity related exceptions into graph and database exceptions
...
Move and split database entity related exceptions into graph and database exceptions
Refactor COG-502
2024-11-29 17:40:48 +01:00