Igor Ilic
ab8d95cc30
refactor: As neo4j can't support dictionaries, add foreign metadata as string
2025-01-20 17:28:14 +01:00
Igor Ilic
49ad292592
refactor: Reduce complexity of metadata handling
...
Have foreign metadata be a table column in data instead of its own table to reduce complexity
Refactor COG-793
2025-01-20 16:39:05 +01:00
Rita Aleksziev
abb3ea6d21
Adjust integration tests
2025-01-09 11:31:16 +01:00
Rita Aleksziev
5635da6e38
Adjust unit tests
2025-01-09 10:53:03 +01:00
Rita Aleksziev
34a9267f41
Get embedding engine instead of passing it. Get it from vector engine instead of direct getter.
2025-01-08 13:23:17 +01:00
Rita Aleksziev
a774191ed3
Adjust AudioDocument and handle None token limit
2025-01-07 13:38:23 +01:00
alekszievr
fbf8fc93bf
Merge branch 'dev' into COG-949
2025-01-07 13:01:16 +01:00
alekszievr
4802567871
Overcome ContextWindowExceededError by checking token count while chunking ( #413 )
2025-01-07 11:46:46 +01:00
lxobr
dbc33a6478
fix: adhere UnstructuredDocument.read() to Document
2025-01-06 11:23:55 +01:00
vasilije
76a0aa7e8b
Fix linter issues
2025-01-05 19:48:35 +01:00
vasilije
649fcf2ba8
Fix linter issues
2025-01-05 19:21:09 +01:00
vasilije
60c8fd103b
ruff format
2025-01-05 19:09:08 +01:00
Igor Ilic
a4fe33ce92
Merge branch 'dev' into COG-475-local-file-endpoint-deletion
2024-12-20 15:25:10 +01:00
alekszievr
291f1c5a55
Handle retryerrors in code summary ( #396 )
...
* Handle retryerrors in code summary
* Log instead of print
2024-12-20 15:21:10 +01:00
Igor Ilic
6cb7fef411
Merge branch 'dev' into COG-475-local-file-endpoint-deletion
2024-12-19 17:34:42 +01:00
Igor Ilic
c139d52938
feat: Add deletion of local files made by cognee through data endpoint
...
Delete local files made by cognee when deleting data from database through endpoint
Feature COG-475
2024-12-19 16:35:35 +01:00
hajdul88
4689e55e68
feat: Adds mock summary for codegraph pipeline
2024-12-18 16:42:48 +01:00
Igor Ilic
f6800b979e
feat: Add deletion of local files when deleting data
...
Delete local files when deleting data from cognee
Feature COG-475
2024-12-18 15:26:13 +01:00
Igor Ilic
8b09358552
Merge branch 'dev' into COG-912-search-by-dataset
2024-12-17 13:22:13 +01:00
alekszievr
9afd0ece63
Structured code summarization ( #375 )
...
* feat: turn summarize_code into generator
* feat: extract run_code_graph_pipeline, update the pipeline
* feat: minimal code graph example
* refactor: update argument
* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline
* refactor: indentation and whitespace nits
* refactor: add deprecated use comments and warnings
* Structured code summarization
* add missing prompt file
* Remove summarization_model argument from summarize_code and fix typehinting
* minor refactors
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 13:05:47 +01:00
hajdul88
9e7ab6492a
feat: outsources chunking parameters to extract chunk from documents … ( #289 )
...
* feat: outsources chunking parameters to extract chunk from documents task
2024-12-17 11:31:31 +01:00
Igor Ilic
630ab556db
feat: Add search by dataset for cognee
...
Added ability to search by datasets for cognee users
Feature COG-912
2024-12-17 11:20:22 +01:00
alekszievr
bfa0f06fb4
Add type to DataPoint metadata ( #364 )
...
* Add type to DataPoint metadata
* Add missing index_fields
* Use DataPoint UUID type in pgvector create_data_points
* Make _metadata mandatory everywhere
2024-12-16 16:27:03 +01:00
Igor Ilic
35b1f7d26a
chore: Update typo in code
...
Update typo in string in code
Chore COG-656
2024-12-13 17:08:05 +01:00
Igor Ilic
11634cb58d
feat: Add unauth access error to getting data
...
Raise an unauthorized access error when trying to read data without access
Feature COG-656
2024-12-13 16:54:53 +01:00
Igor Ilic
43187e4d63
feat: Add user verification for accessing data
...
Verify user has access to data before returning it
Feature COG-656
2024-12-13 13:54:45 +01:00
Igor Ilic
07d9330e4a
feat: Add UnstructuredLibraryImportError
...
Added exception for when the unstructured library is called but not installed
Feature COG-685
2024-12-08 14:53:19 +01:00
Igor Ilic
62db3f8598
feat: Remove the need for libmagic for unstructured documents
...
Remove the need for libmagic for unstructured documents by providing mime_type information
Feature COG-685
2024-12-08 14:37:50 +01:00
Igor Ilic
78214456a6
feat: Add unstructured document handler
...
Added the unstructured library and handling of certain document types through that library
Feature COG-685
2024-12-06 17:50:22 +01:00
Igor Ilic
f5b5e56cc1
feat: Add deduplication of data
...
Data is deduplicated per user: if a user tries to add data that already exists, they are redirected to the existing data in the database
Feature COG-505
2024-12-05 16:38:44 +01:00
Igor Ilic
0a0b030df5
fix: Resolve issue when metadata is updated
...
Resolve issue when attempting to update metadata related to data
Fix
2024-12-04 14:03:01 +01:00
Boris Arzentar
e07364fc25
Merge remote-tracking branch 'origin/main' into code-graph
2024-12-03 12:44:57 +01:00
Vasilije
bbaf78f54e
Cog 669 implement dummy llm adapter ( #37 )
...
Adds the `class DummyLLMAdapter(LLMInterface)` class for profiling of
large datasets without actual LLM calls in the top level
`profiling/util` location.
I also moved the `show_prompt` implementation from the child classes to
`LLMInterface`, since the implementations were identical.
I expanded the scope to also include a DummyEmbeddingEngine.
2024-11-30 17:02:49 +01:00
Leon Luithlen
bc82430fb5
Merge latest COG-519
2024-11-29 14:36:03 +01:00
Leon Luithlen
a2ff42332e
DummyLLMAdapter WIP
2024-11-28 11:49:28 +01:00
Leon Luithlen
15802237e9
Get metadata from metadata table
2024-11-28 09:18:49 +01:00
Leon Luithlen
cd0e505ac0
WIP
2024-11-28 09:18:49 +01:00
Leon Luithlen
1679c746a3
Move class and functions to data.models
2024-11-28 09:18:49 +01:00
Leon Luithlen
7324564655
Add metadata_id attribute to Document and DocumentChunk, make ingest_with_metadata default
2024-11-28 09:18:49 +01:00
Leon Luithlen
fd987ed61e
Add autoformatting
2024-11-28 09:18:49 +01:00
Leon Luithlen
c5f3314c85
Add Metadata table and read write delete functions
2024-11-28 09:18:49 +01:00
Igor Ilic
ae568409a7
feat: Add custom exceptions to cognee lib
...
Added use of custom exceptions to cognee lib
2024-11-27 14:29:33 +01:00
Boris
22a0e43d4a
Merge branch 'main' into COG-417-chunking-unit-tests
2024-11-17 13:40:32 +01:00
Boris
d8b6eeded5
feat: log search queries and results ( #166 )
...
* feat: log search queries and results
* fix: address coderabbit review comments
* fix: parse UUID when logging search results
* fix: remove custom UUID type and use DB agnostic UUID from sqlalchemy
* Add new cognee_db
---------
Co-authored-by: Leon Luithlen <leon@topoteretes.com>
2024-11-17 11:59:10 +01:00
Leon Luithlen
f87fd12e9b
Fix lambda bug in AudioDocument and ImageDocument
2024-11-14 12:41:47 +01:00
Leon Luithlen
8b3b2f8156
Add transcribe_image and create_transcript methods
2024-11-14 11:59:46 +01:00
Leon Luithlen
fbd011560a
Rebase onto main
2024-11-12 16:47:28 +01:00
Leon Luithlen
d7ffef1979
Remove old __tests__ folders
2024-11-12 16:47:28 +01:00
Leon Luithlen
dce894bfd3
Add first three unit tests
2024-11-12 16:47:28 +01:00
Boris
52180eb6b5
feat: COG-184 add falkordb ( #192 )
...
* feat: add falkordb adapter
---------
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-11-11 18:20:52 +01:00