Commit graph

1511 commits

Author SHA1 Message Date
Vasilije
759dbfb575
Merge pull request #298 from topoteretes/dependabot/github_actions/actions/cache-4
⬆️ Bump actions/cache from 3 to 4
2024-12-11 11:12:50 +01:00
dependabot[bot]
12b35826cc
⬆️ Bump actions/cache from 3 to 4
Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-12-11 09:14:23 +00:00
Vasilije
0913a5d79c
Merge pull request #291 from topoteretes/COG-868
Create autoupdate.yaml
2024-12-11 10:13:27 +01:00
alekszievr
4f2745504c
Calculate official hotpot EM and F1 scores (#292) 2024-12-10 19:16:12 +01:00
Vasilije
38e11f93da
Create community_greetings.yml
Added community greetings
2024-12-10 14:12:08 +01:00
Vasilije
b70fb7c3fe
Update dependabot.yaml 2024-12-10 14:04:33 +01:00
Vasilije
e7793b6389
Create dependabot.yaml 2024-12-10 14:03:26 +01:00
Vasilije
bc5b6d4632
Delete .github/workflows/autoupdate.yaml 2024-12-10 14:02:00 +01:00
Vasilije
55a0374705
Create autoupdate.yaml 2024-12-10 13:41:59 +01:00
hajdul88
6d85165189
Feature/cog 539 implementing additional retriever approaches (#262)
* fix: refactor get_graph_from_model to return nodes and edges correctly

* fix: add missing params

* fix: remove complex zip usage

* fix: add edges to data_point properties

* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector

* fix: fixes database pruning issue in pgvector (#261)

* feat: adds code summary embeddings to vector DB

* fix: cognee_demo notebook pipeline is not saving summaries

* feat: implements first version of codegraph retriever

* chore: implements minor changes mostly to make the code production ready

* fix: turns off raising duplicated edges unit test as we have these in our current codegraph generation

* feat: implements unit tests for description to codepart search

* fix: fixes edge property inconsistent access in codepart retriever

* chore: implements more precise typing for get_attribute method for cogneegraph

* chore: adds spacing to tests and changes the cogneegraph getter names

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2024-12-10 11:07:06 +01:00
Vasilije
5ffbebdd01
Merge pull request #269 from topoteretes/COG-685-more-document-types
Cog 685 more document types
2024-12-09 18:03:25 +01:00
Igor Ilic
acf5952b31 test: Update typo in unstructured test
Update typo for file name in test

Test COG-685
2024-12-09 16:37:46 +01:00
Igor Ilic
e0a3563249 Merge branch 'COG-685-more-document-types' of github.com:topoteretes/cognee into COG-685-more-document-types 2024-12-09 15:21:26 +01:00
Igor Ilic
d7d559f4f7 test: Add tests for different document types
Add tests for unstructured reading for different document types

Test COG-685
2024-12-09 15:20:50 +01:00
Igor Ilic
344865f1a4
Merge branch 'main' into COG-685-more-document-types 2024-12-09 10:22:26 +01:00
Igor Ilic
df289deb18 chore: Update dependencies to handle different document types
Update unstructured so it would install support for different document types

Chore COG-685
2024-12-09 09:49:26 +01:00
Igor Ilic
5567370214 chore: Update gh actions to install docs extra
Update library gh actions to install docs extra to test unstructured integration tests

Chore COG-685
2024-12-09 09:32:28 +01:00
Igor Ilic
596b3edf72 test: Add test for Unstructured pptx document type
Added pptx example file and tested Unstructured pptx document type handling

Test COG-685
2024-12-08 15:18:42 +01:00
Igor Ilic
07d9330e4a feat: Add UnstructuredLibraryImportError
Added exception when unstructured library is called but not installed

Feature COG-685
2024-12-08 14:53:19 +01:00
Igor Ilic
53b7806ccb chore: Update pyproject file with unstructured library
Add unstructured library as docs optional extension to pyproject.toml

Chore COG-685
2024-12-08 14:42:08 +01:00
Igor Ilic
62db3f8598 feat: Remove the need for libmagic for unstructured documents
Remove the need for libmagic for unstructured documents by providing mime_type information

Feature COG-685
2024-12-08 14:37:50 +01:00
Vasilije
ce96431055
Merge pull request #265 from topoteretes/feat/COG-418-log-config-to-telemetry
Add cognee config to telemetry
2024-12-07 09:45:49 +01:00
Vasilije
86a63043ed
Merge pull request #266 from RaphaelS1/patch-1
Add code of conduct, NOTICE, and licenses/
2024-12-06 18:56:44 +01:00
Igor Ilic
78214456a6 feat: Add unstructured document handler
Added unstructured library and handling of certain document types through their library

Feature COG-685
2024-12-06 17:50:22 +01:00
alekszievr
f30bf35f92
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 16:11:56 +01:00
Raphael Sonabend
f4583ebd3a
Merge branch 'main' into patch-1 2024-12-06 13:45:13 +00:00
Igor Ilic
8415279cb2
Merge pull request #260 from topoteretes/COG-505-data-dataset-model-changes
Cog 505 data dataset model changes
2024-12-06 14:42:35 +01:00
Raphael Sonabend
4ebbf53b10 add NOTICE file, reference CoC in contribution guidelines, add licenses folder for external licenses
Signed-off-by: Raphael Sonabend <raphaelsonabend@gmail.com>
2024-12-06 13:27:55 +00:00
Raphael Sonabend
4daee66717
Create CODE_OF_CONDUCT.md 2024-12-06 13:21:06 +00:00
alekszievr
e6def6423c
Merge branch 'main' into feat/COG-418-log-config-to-telemetry 2024-12-06 13:58:38 +01:00
Igor Ilic
d7fa9f3cfd Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes 2024-12-06 13:49:07 +01:00
Igor Ilic
cc6fbe2a5f refactor: Add space to ingest function
Add space and newline to ingest function

Refactor COG-505
2024-12-06 13:48:39 +01:00
Rita Aleksziev
462fcef240 move config getter into cognee/modules/pipelines/operations/run_tasks.py and make the indentation a bit more readable 2024-12-06 13:38:54 +01:00
Rita Aleksziev
dbfa91b635 Add cognee config to telemetry 2024-12-06 12:55:25 +01:00
Boris
9429e5e1f5
Merge branch 'main' into COG-505-data-dataset-model-changes 2024-12-06 12:53:32 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector (#261)

* fix: cognee_demo notebook pipeline is not saving summaries

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Igor Ilic
351ce92001
Merge pull request #263 from topoteretes/gh-actions-all-branches
test: Update gh actions so they can run outside of PR to main
2024-12-06 12:04:47 +01:00
Igor Ilic
d254471023 test: Update gh actions so they can run outside of PR to main
Allow github actions to run on PRs that aren't targeting main

Test
2024-12-06 11:09:26 +01:00
Igor Ilic
1e098ae70d refactor: Add error handling to hash util
Added error handling to reading of file in hash util

Refactor COG-505
2024-12-05 20:54:55 +01:00
Igor Ilic
e80377b729 refactor: Move hash calculation of file to util
Moved hash calculation of file to shared utils, added better typing

Refactor COG-505
2024-12-05 20:33:30 +01:00
Igor Ilic
9ba5d49e69 test: Fix test for multimedia deduplication
Add missing function to get data from database to multimedia deduplication test

Test COG-505
2024-12-05 20:09:29 +01:00
Igor Ilic
add6730b9e test: Add testing of dataset data table content
Add testing of dataset data table content

Test COG-505
2024-12-05 19:37:12 +01:00
Igor Ilic
387002d8ca Merge branch 'COG-505-data-dataset-model-changes' of github.com:topoteretes/cognee into COG-505-data-dataset-model-changes 2024-12-05 19:26:17 +01:00
Igor Ilic
813b76c9c2 test: Add test for text deduplication
Added end to end test for text deduplication

Test COG-505
2024-12-05 19:25:50 +01:00
Igor Ilic
349ddfe794
Merge branch 'main' into COG-505-data-dataset-model-changes 2024-12-05 17:10:43 +01:00
Igor Ilic
378e7b81a5 fix: Fix merge of data for dlt
Resolve issue with dlt data not being merged for data_id

Fix COG-505
2024-12-05 17:03:36 +01:00
Igor Ilic
f5b5e56cc1 feat: Add deduplication of data
Data is deduplicated per user: if a user tries to add data that already exists, it is redirected to the existing data in the database

Feature COG-505
2024-12-05 16:38:44 +01:00
hajdul88
acf036818e
Merge pull request #251 from topoteretes/feature/cog-717-create-edge-embeddings-in-vector-databases
Creates edge embeddings collection
2024-12-05 09:13:11 +01:00
hajdul88
68c3f42ab8
Merge branch 'main' into feature/cog-717-create-edge-embeddings-in-vector-databases 2024-12-05 09:08:37 +01:00
Vasilije
c4ad473861
Merge pull request #253 from topoteretes/feat/COG-711-temporal-awareness-task
Integrate graphiti's temporal awareness functionality as Tasks
2024-12-04 20:50:03 +01:00
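
Commits e80377b729 and f5b5e56cc1 above describe moving file hash calculation into a shared util and deduplicating data per user. The idea can be sketched roughly as below. This is only an illustration under stated assumptions: the names `content_hash`, `InMemoryStore`, and `add_data` are hypothetical and not taken from the cognee codebase, SHA-256 is an assumed choice of hash, and a dict stands in for the database.

```python
import hashlib
import io
from typing import BinaryIO, Dict, Tuple


def content_hash(stream: BinaryIO, chunk_size: int = 8192) -> str:
    """Return the SHA-256 hex digest of a binary stream, read in chunks.

    SHA-256 is an assumption; the commit log does not name the algorithm.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


class InMemoryStore:
    """Hypothetical stand-in for the database.

    Maps (user_id, content hash) to a data id, so identical content is
    stored once per user but separately for different users.
    """

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], int] = {}

    def add_data(self, user_id: str, stream: BinaryIO) -> Tuple[int, bool]:
        """Return (data_id, created).

        If this user already added the same content, the existing id is
        returned and created is False.
        """
        key = (user_id, content_hash(stream))
        if key in self._ids:
            return self._ids[key], False
        data_id = len(self._ids) + 1
        self._ids[key] = data_id
        return data_id, True


if __name__ == "__main__":
    store = InMemoryStore()
    first = store.add_data("user-1", io.BytesIO(b"same text"))
    again = store.add_data("user-1", io.BytesIO(b"same text"))
    other = store.add_data("user-2", io.BytesIO(b"same text"))
    print(first, again, other)
```

Keying on the user id together with the content hash is what makes the deduplication per user: the second call for the same user reuses the existing record, while a different user adding the same content gets a record of their own.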