Commit graph

2375 commits

Author SHA1 Message Date
Igor Ilic
5567370214 chore: Update gh actions to install docs extra
Update library gh actions to install the docs extra so the unstructured integration tests can run

Chore COG-685
2024-12-09 09:32:28 +01:00
Igor Ilic
596b3edf72 test: Add test for Unstructured pptx document type
Added pptx example file and tested Unstructured pptx document type handling

Test COG-685
2024-12-08 15:18:42 +01:00
Igor Ilic
07d9330e4a feat: Add UnstructuredLibraryImportError
Added exception when unstructured library is called but not installed

Feature COG-685
2024-12-08 14:53:19 +01:00
Igor Ilic
53b7806ccb chore: Update pyproject file with unstructured library
Add unstructured library as docs optional extension to pyproject.toml

Chore COG-685
2024-12-08 14:42:08 +01:00
Igor Ilic
62db3f8598 feat: Remove the need for libmagic for unstructured documents
Remove the need for libmagic for unstructured documents by providing mime_type information

Feature COG-685
2024-12-08 14:37:50 +01:00
Boris
ea879b2882
Merge branch 'main' into COG-698 2024-12-08 14:23:03 +01:00
Vasilije
ce96431055
Merge pull request #265 from topoteretes/feat/COG-418-log-config-to-telemetry
Add cognee config to telemetry
2024-12-07 09:45:49 +01:00
Vasilije
86a63043ed
Merge pull request #266 from RaphaelS1/patch-1
Add code of conduct, NOTICE, and licenses/
2024-12-06 18:56:44 +01:00
Igor Ilic
78214456a6 feat: Add unstructured document handler
Added unstructured library and handling of certain document types through that library

Feature COG-685
2024-12-06 17:50:22 +01:00
alekszievr
f30bf35f92
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 16:11:56 +01:00
Raphael Sonabend
f4583ebd3a
Merge branch 'main' into patch-1 2024-12-06 13:45:13 +00:00
Igor Ilic
8415279cb2
Merge pull request #260 from topoteretes/COG-505-data-dataset-model-changes
Cog 505 data dataset model changes
2024-12-06 14:42:35 +01:00
Raphael Sonabend
4ebbf53b10 add NOTICE file, reference CoC in contribution guidelines, add licenses folder for external licenses
Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>
2024-12-06 13:27:55 +00:00
Raphael Sonabend
4daee66717
Create CODE_OF_CONDUCT.md 2024-12-06 13:21:06 +00:00
alekszievr
e6def6423c
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 13:58:38 +01:00
Igor Ilic
d7fa9f3cfd Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes 2024-12-06 13:49:07 +01:00
Igor Ilic
cc6fbe2a5f refactor: Add space to ingest function
Add space and newline to ingest function

Refactor COG-505
2024-12-06 13:48:39 +01:00
Rita Aleksziev
462fcef240 move config getter into cognee/modules/pipelines/operations/run_tasks.py and make the indentation a bit more readable 2024-12-06 13:38:54 +01:00
Rita Aleksziev
dbfa91b635 Add cognee config to telemetry 2024-12-06 12:55:25 +01:00
Boris
9429e5e1f5
Merge branch 'main' into COG-505-data-dataset-model-changes 2024-12-06 12:53:32 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector (#261)

* fix: cognee_demo notebook pipeline is not saving summaries

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Boris
0268df298f
Merge branch 'main' into COG-698 2024-12-06 12:14:15 +01:00
Igor Ilic
351ce92001
Merge pull request #263 from topoteretes/gh-actions-all-branches
test: Update gh actions so they can run outside of PR to main
2024-12-06 12:04:47 +01:00
Igor Ilic
d254471023 test: Update gh actions so they can run outside of PR to main
Allow github actions to run on PRs that aren't targeting main

Test
2024-12-06 11:09:26 +01:00
Igor Ilic
1e098ae70d refactor: Add error handling to hash util
Added error handling to reading of file in hash util

Refactor COG-505
2024-12-05 20:54:55 +01:00
Igor Ilic
e80377b729 refactor: Move hash calculation of file to util
Moved hash calculation of file to shared utils, added better typing

Refactor COG-505
2024-12-05 20:33:30 +01:00
Igor Ilic
9ba5d49e69 test: Fix test for multimedia deduplication
Add missing function to get data from database to multimedia deduplication test

Test COG-505
2024-12-05 20:09:29 +01:00
Igor Ilic
add6730b9e test: Add testing of dataset data table content
Add testing of dataset data table content

Test COG-505
2024-12-05 19:37:12 +01:00
Igor Ilic
387002d8ca Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes 2024-12-05 19:26:17 +01:00
Igor Ilic
813b76c9c2 test: Add test for text deduplication
Added end to end test for text deduplication

Test COG-505
2024-12-05 19:25:50 +01:00
Igor Ilic
349ddfe794
Merge branch 'main' into COG-505-data-dataset-model-changes 2024-12-05 17:10:43 +01:00
Igor Ilic
378e7b81a5 fix: Fix merge of data for dlt
Resolve issue with dlt data not being merged for data_id

Fix COG-505
2024-12-05 17:03:36 +01:00
Igor Ilic
f5b5e56cc1 feat: Add deduplication of data
Data is deduplicated per user: if a user tries to add data that already exists, they are redirected to the existing data in the database

Feature COG-505
2024-12-05 16:38:44 +01:00
Vasilije
316f2f3661
Merge branch 'main' into COG-698 2024-12-05 14:29:24 +01:00
hajdul88
acf036818e
Merge pull request #251 from topoteretes/feature/cog-717-create-edge-embeddings-in-vector-databases
Creates edge embeddings collection
2024-12-05 09:13:11 +01:00
hajdul88
68c3f42ab8
Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases 2024-12-05 09:08:37 +01:00
Vasilije
c4ad473861
Merge pull request #253 from topoteretes/feat/COG-711-temporal-awareness-task
Integrate graphiti's temporal awareness functionality as Tasks
2024-12-04 20:50:03 +01:00
Vasilije
b571fb5626
Merge branch 'main' into feat/COG-711-temporal-awareness-task 2024-12-04 20:49:36 +01:00
hajdul88
7f192e1c2b
Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases 2024-12-04 20:49:30 +01:00
Vasilije
7223b2c83b
Merge pull request #256 from topoteretes/fix-notebook-gh-actions
chore: Fix issue with notebook github actions
2024-12-04 20:42:18 +01:00
Igor Ilic
6be025e3d4 chore: Attempt to fix issue with notebook github actions
Attempt to resolve issue with running notebooks in github actions

Chore
2024-12-04 20:36:23 +01:00
Vasilije
a96daef469 Bump release version 2024-12-04 19:56:09 +01:00
Vasilije
d2fccc1315 Bump release version 2024-12-04 19:53:33 +01:00
Vasilije
cf51555943 Bump release version 2024-12-04 19:50:25 +01:00
Vasilije
cc43a8c865 Bump release version 2024-12-04 19:48:01 +01:00
Vasilije
f37d96df6e Bump release version 2024-12-04 19:44:34 +01:00
Vasilije
8d1936f022 Bump release version 2024-12-04 19:35:06 +01:00
hajdul88
59035c3f45 fix: puts index_graph_edges unit tests under unit test directory 2024-12-04 19:32:15 +01:00
Vasilije
2df1eb6098 Bump release version 2024-12-04 19:31:14 +01:00
Vasilije
692770c197 Bump release version 2024-12-04 19:28:21 +01:00