Commit graph

2375 commits

Author SHA1 Message Date
Igor Ilic
844d99cb72 docs: Remove commented code 2025-01-23 18:24:26 +01:00
Igor Ilic
7dea1d54d7 refactor: Add specific max token values to embedding models 2025-01-23 18:18:45 +01:00
Igor Ilic
6d5679f9d2 Merge branch 'dev' into COG-970-refactor-tokenizing 2025-01-23 18:14:49 +01:00
hajdul88
2410feea4f feat: implements modal wrapper + dockerfile for modal containers 2025-01-23 18:06:09 +01:00
Igor Ilic
1319944dcd docs: Update .env.template to include llm and embedding options 2025-01-23 18:05:45 +01:00
Igor Ilic
b25a82e206 chore: Add google-generativeai as gemini optional dependency to Cognee 2025-01-23 17:56:56 +01:00
Igor Ilic
b686376c54 feat: Add gemini tokenizer to cognee 2025-01-23 17:55:04 +01:00
Vasilije
d50af60b59 Merge pull request #465 from topoteretes/feat/COG-1058-fastmcp
feat: use fastmcp for mcp server
2025-01-23 17:38:55 +01:00
Igor Ilic
294ed1d960 feat: Add HuggingFace Tokenizer support 2025-01-23 16:52:35 +01:00
Igor Ilic
2e1a48e22c docs: Add usage example of function 2025-01-23 15:13:46 +01:00
Igor Ilic
de19016494 fix: Add flag to allow SQLite to use foreign keys 2025-01-23 15:10:27 +01:00
Igor Ilic
d4453e4a1d fix: Add support for SQLite and PostgreSQL for inserting data in SQLAlchemyAdapter 2025-01-23 14:59:02 +01:00
Boris Arzentar
eb22dc9889 fix: remove unnecessary dot 2025-01-23 11:47:20 +01:00
Boris Arzentar
e577276d91 Merge remote-tracking branch 'origin/dev' into feat/COG-1058-fastmcp 2025-01-23 11:46:25 +01:00
Boris Arzentar
00f302c37a feat: use fastmcp for mcp server 2025-01-23 11:45:40 +01:00
hande-k
cdecf5fb8f add short description in md 2025-01-23 11:17:48 +01:00
hande-k
343de01d5a update notebooks with latest eval 2025-01-23 11:11:51 +01:00
Igor Ilic
9f6a0ba783 Merge branch 'dev' into pgvector-add-normalization 2025-01-23 11:11:43 +01:00
Vasilije
90657a262c Merge pull request #460 from topoteretes/COG-793-metadata-rework
Cog 793 metadata rework
2025-01-23 11:00:07 +01:00
alekszievr
4e3a666b33 Feat: Save and load contexts and answers for eval (#462)
* feat: make tasks a configurable argument in the cognify function

* fix: add data points task

* eval on random samples instead of first couple

* Save and load contexts and answers

* Fix random seed usage and handle empty descriptions

* include insights search in cognee option

* create output dir if doesnt exist

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-01-22 16:17:01 +01:00
Igor Ilic
40c0279ec5 Merge branch 'COG-793-metadata-rework' of github.com:topoteretes/cognee into COG-793-metadata-rework 2025-01-22 16:13:11 +01:00
Igor Ilic
80e67b0619 refactor: Rename foreign to external metadata
Rename foreign metadata to external metadata for metadata coming outside of Cognee
2025-01-22 16:07:35 +01:00
Rita Aleksziev
b2f7f733d9 create output dir if doesnt exist 2025-01-22 10:58:44 +01:00
alekszievr
5b6fe00576 Merge branch 'dev' into feat/save_and_load_answers 2025-01-22 10:52:05 +01:00
Rita Aleksziev
e0980361a1 include insights search in cognee option 2025-01-22 10:51:32 +01:00
Igor Ilic
93249c72c5 fix: Initial commit to resolve issue with using tokenizer based on LLMs
Currently TikToken is used for tokenizing by default which is only supported by OpenAI,
this is an initial commit in an attempt to add Cognee tokenizing support for multiple LLMs
2025-01-21 19:53:22 +01:00
Igor Ilic
655ab0b8cc Merge branch 'dev' into COG-793-metadata-rework 2025-01-21 18:20:49 +01:00
Rita Aleksziev
9fec8fd322 Fix random seed usage and handle empty descriptions 2025-01-21 17:04:00 +01:00
Vasilije
c9536f97a5 Merge pull request #451 from topoteretes/add_docstrings
chore: add docstrings any typing to cognee tasks
2025-01-21 14:07:19 +01:00
Rita Aleksziev
1c16a1744c Save and load contexts and answers 2025-01-20 18:42:09 +01:00
Igor Ilic
bd3a5a758c Merge branch 'dev' into COG-793-metadata-rework 2025-01-20 18:06:21 +01:00
Igor Ilic
77f0b45a0d refactor: Resolve issue with notebook after metadata refactor
Resolve issue with LlamaIndex notebook after refactor
2025-01-20 18:02:57 +01:00
Igor Ilic
4196a4ce89 refactor: Update test to be up to date with current metadata refactor effort 2025-01-20 17:53:54 +01:00
Rita Aleksziev
015f0084c8 eval on random samples instead of first couple 2025-01-20 17:47:52 +01:00
lxobr
1a73779353 fix: add data points task 2025-01-20 17:43:21 +01:00
lxobr
c747a05717 feat: make tasks a configurable argument in the cognify function 2025-01-20 17:43:21 +01:00
Igor Ilic
5c17501bb8 refactor: add missing foreing_metadata attr to tests 2025-01-20 17:38:28 +01:00
Igor Ilic
ab8d95cc30 refactor: As neo4j can't support dictionaries, add foreign metadata as string 2025-01-20 17:28:14 +01:00
Igor Ilic
49ad292592 refactor: Reduce complexity of metadata handling
Have foreign metadata be a table column in data instead of its own table to reduce complexity

Refactor COG-793
2025-01-20 16:39:05 +01:00
Igor Ilic
0c7c1d7503 refactor: Refactor ingestion to only have one ingestion task 2025-01-20 14:33:47 +01:00
hajdul88
813a03c6e2 Merge branch 'dev' into pgvector-add-normalization 2025-01-20 13:46:50 +01:00
Igor Ilic
2546844787 feat: Add normalization to PGVector search
Add normalization to PGVector search results
2025-01-20 13:42:39 +01:00
Vasilije
bbb8e8951c Merge pull request #458 from topoteretes/feature/cog-1034-implementing-windows-test-into-cicd-pipeline
Adds windows test + fixes networkx file loading issue
2025-01-20 13:16:07 +01:00
hajdul88
957ab81879 chore: renaming test 2025-01-20 12:27:22 +01:00
hajdul88
bf70705ed0 Fix: fixes networkx failed to load graph from file error 2025-01-20 12:19:34 +01:00
hajdul88
7932cf7159 fix: sets graphdb back to networkx 2025-01-20 10:23:38 +01:00
hajdul88
b949b29fa7 fix: changes graph DB to neo4j in windows test 2025-01-20 09:48:20 +01:00
hajdul88
1e4f71dacb feat: adds windows test for dynamic_steps_example 2025-01-20 09:32:09 +01:00
alekszievr
75bc7f67eb feat: Add incremental eval option to paramset (#446)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Support 4 different rag options in eval

* Minor refactor and logger usage

* feat: make tasks a configurable argument in the cognify function

* Run eval on a set of parameters and save results as json and png

* fix: add data points task

* script for running all param combinations

* enable context provider to get tasks as param

* bugfix in simple rag

* Incremental eval of cognee pipeline

* potential fix: single asyncio run

* temp fix: exclude insights

* Remove insights, have single asyncio run, refactor

* Include incremental eval in accepted paramsets

* minor fixes

* handle pipeline slices in utils

* Handle insights and customize search types

* Handle retrieved edges more safely

* bugfix

* fix simple rag

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-01-17 18:04:31 +01:00
Igor Ilic
e7f24548dd Merge branch 'dev' into add_docstrings 2025-01-17 17:00:23 +01:00