Commit graph

4555 commits

Author SHA1 Message Date
Igor Ilic
40c0279ec5 Merge branch 'COG-793-metadata-rework' of github.com:topoteretes/cognee into COG-793-metadata-rework 2025-01-22 16:13:11 +01:00
Igor Ilic
80e67b0619 refactor: Rename foreign to external metadata
Rename foreign metadata to external metadata for metadata coming from outside of Cognee
2025-01-22 16:07:35 +01:00
Rita Aleksziev
b2f7f733d9 create output dir if it doesn't exist 2025-01-22 10:58:44 +01:00
alekszievr
5b6fe00576
Merge branch 'dev' into feat/save_and_load_answers 2025-01-22 10:52:05 +01:00
Rita Aleksziev
e0980361a1 include insights search in cognee option 2025-01-22 10:51:32 +01:00
Igor Ilic
93249c72c5 fix: Initial commit to resolve issue with using tokenizer based on LLMs
Currently TikToken is used for tokenizing by default, which is only supported by OpenAI.
This is an initial commit in an attempt to add Cognee tokenizing support for multiple LLMs
2025-01-21 19:53:22 +01:00
Igor Ilic
655ab0b8cc
Merge branch 'dev' into COG-793-metadata-rework 2025-01-21 18:20:49 +01:00
Rita Aleksziev
9fec8fd322 Fix random seed usage and handle empty descriptions 2025-01-21 17:04:00 +01:00
Vasilije
c9536f97a5
Merge pull request #451 from topoteretes/add_docstrings
chore: add docstrings and typing to cognee tasks
2025-01-21 14:07:19 +01:00
Rita Aleksziev
1c16a1744c Save and load contexts and answers 2025-01-20 18:42:09 +01:00
Igor Ilic
bd3a5a758c
Merge branch 'dev' into COG-793-metadata-rework 2025-01-20 18:06:21 +01:00
Igor Ilic
77f0b45a0d refactor: Resolve issue with notebook after metadata refactor
Resolve issue with LlamaIndex notebook after refactor
2025-01-20 18:02:57 +01:00
Igor Ilic
4196a4ce89 refactor: Update test to be up to date with current metadata refactor effort 2025-01-20 17:53:54 +01:00
Rita Aleksziev
015f0084c8 eval on random samples instead of first couple 2025-01-20 17:47:52 +01:00
lxobr
1a73779353 fix: add data points task 2025-01-20 17:43:21 +01:00
lxobr
c747a05717 feat: make tasks a configurable argument in the cognify function 2025-01-20 17:43:21 +01:00
Igor Ilic
5c17501bb8 refactor: add missing foreign_metadata attr to tests 2025-01-20 17:38:28 +01:00
Igor Ilic
ab8d95cc30 refactor: As neo4j can't support dictionaries, add foreign metadata as string 2025-01-20 17:28:14 +01:00
Igor Ilic
49ad292592 refactor: Reduce complexity of metadata handling
Have foreign metadata be a table column in data instead of its own table to reduce complexity

Refactor COG-793
2025-01-20 16:39:05 +01:00
Igor Ilic
0c7c1d7503 refactor: Refactor ingestion to only have one ingestion task 2025-01-20 14:33:47 +01:00
hajdul88
813a03c6e2
Merge branch 'dev' into pgvector-add-normalization 2025-01-20 13:46:50 +01:00
Igor Ilic
2546844787 feat: Add normalization to PGVector search
Add normalization to PGVector search results
2025-01-20 13:42:39 +01:00
Vasilije
bbb8e8951c
Merge pull request #458 from topoteretes/feature/cog-1034-implementing-windows-test-into-cicd-pipeline
Adds windows test + fixes networkx file loading issue
2025-01-20 13:16:07 +01:00
hajdul88
957ab81879 chore: renaming test 2025-01-20 12:27:22 +01:00
hajdul88
bf70705ed0 Fix: fixes networkx failed to load graph from file error 2025-01-20 12:19:34 +01:00
hajdul88
7932cf7159 fix: sets graphdb back to networkx 2025-01-20 10:23:38 +01:00
hajdul88
b949b29fa7 fix: changes graph DB to neo4j in windows test 2025-01-20 09:48:20 +01:00
hajdul88
1e4f71dacb feat: adds windows test for dynamic_steps_example 2025-01-20 09:32:09 +01:00
alekszievr
75bc7f67eb
feat: Add incremental eval option to paramset (#446)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Support 4 different rag options in eval

* Minor refactor and logger usage

* feat: make tasks a configurable argument in the cognify function

* Run eval on a set of parameters and save results as json and png

* fix: add data points task

* script for running all param combinations

* enable context provider to get tasks as param

* bugfix in simple rag

* Incremental eval of cognee pipeline

* potential fix: single asyncio run

* temp fix: exclude insights

* Remove insights, have single asyncio run, refactor

* Include incremental eval in accepted paramsets

* minor fixes

* handle pipeline slices in utils

* Handle insights and customize search types

* Handle retrieved edges more safely

* bugfix

* fix simple rag

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-01-17 18:04:31 +01:00
Igor Ilic
e7f24548dd
Merge branch 'dev' into add_docstrings 2025-01-17 17:00:23 +01:00
alekszievr
2e010f8dd1
Incremental eval of cognee pipeline (#445)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Support 4 different rag options in eval

* Minor refactor and logger usage

* feat: make tasks a configurable argument in the cognify function

* Run eval on a set of parameters and save results as json and png

* fix: add data points task

* script for running all param combinations

* enable context provider to get tasks as param

* bugfix in simple rag

* Incremental eval of cognee pipeline

* potential fix: single asyncio run

* temp fix: exclude insights

* Remove insights, have single asyncio run, refactor

* minor fixes

* handle pipeline slices in utils

* include all options in params json

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-01-17 14:16:48 +01:00
Vasilije
ffa3c2daa0
Merge pull request #449 from topoteretes/feature/cog-186-run-cognee-on-windows
Feature/cog 186 run cognee on windows
2025-01-17 14:16:37 +01:00
Igor Ilic
c0b79b4cff
Merge branch 'dev' into add_docstrings 2025-01-17 12:14:26 +01:00
hajdul88
b0634da43e fix: fixes typo in README 2025-01-17 11:30:45 +01:00
hajdul88
6f5d2bad47 Fix: Updates README 2025-01-17 11:29:51 +01:00
hajdul88
0b56e4b688 feat: Adds OS information to README 2025-01-17 11:22:34 +01:00
hajdul88
22ea4f0675
Merge branch 'dev' into feature/cog-186-run-cognee-on-windows 2025-01-17 10:49:53 +01:00
Vasilije
70e68fe8ff
Merge pull request #450 from topoteretes/ruff-version
fix: Update ruff version for cognee
2025-01-17 10:47:29 +01:00
Igor Ilic
be2aa9901f
Merge branch 'dev' into ruff-version 2025-01-17 10:40:50 +01:00
Igor Ilic
89b23b8728 refactor: Run ruff format 0.9.2 2025-01-17 10:40:24 +01:00
Igor Ilic
964fca72c6 fix: Update ruff version for cognee 2025-01-17 10:36:04 +01:00
hande-k
2c351c499d add docstrings and typing to cognee tasks 2025-01-17 10:30:34 +01:00
lxobr
65a0c98455
COG-989 feat: make tasks a configurable argument in the cognify function (#442)
* feat: make tasks a configurable argument in the cognify function

* fix: add data points task

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-01-17 10:20:57 +01:00
hajdul88
4ea01b9d30 fix: fixes cognee backend on windows 2025-01-17 09:52:05 +01:00
hajdul88
08c22a542a fix: fixes typo in multimedia example 2025-01-17 09:31:48 +01:00
hajdul88
981f35c1e0 fix: fixes windows compatibility in examples 2025-01-17 09:28:10 +01:00
hajdul88
704f2c68e2 fix: fixes old 0.8.6 ruff format to 0.9.2 2025-01-17 09:25:05 +01:00
hajdul88
6e691885e6
Merge branch 'dev' into feature/cog-186-run-cognee-on-windows 2025-01-17 09:06:00 +01:00
Vasilije
7c3e46f14e
Update README.md 2025-01-17 08:15:13 +01:00
alekszievr
8ec1e48ff6
Run eval on a set of parameters and save them as png and json (#443)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Support 4 different rag options in eval

* Minor refactor and logger usage

* Run eval on a set of parameters and save results as json and png

* script for running all param combinations

* bugfix in simple rag

* potential fix: single asyncio run

* temp fix: exclude insights

* Remove insights, have single asyncio run, refactor

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-01-17 00:18:51 +01:00