Commit graph

80 commits

Author SHA1 Message Date
Igor Ilic
860218632f refactor: add suggestions from PR
Add suggestions made by CodeRabbit on pull request
2025-01-28 17:15:25 +01:00
Igor Ilic
a8644e0bd7 feat: Use litellm max token size as default for model, if model exists in litellm 2025-01-28 17:00:47 +01:00
Igor Ilic
4e56cd64a1 refactor: Add max chunk tokens to code graph pipeline 2025-01-28 15:33:34 +01:00
Igor Ilic
3db7f85c9c feat: Add max_chunk_tokens value to chunkers
Add formula and forwarding of max_chunk_tokens value through Cognee
2025-01-28 14:32:00 +01:00
Igor Ilic
0a9f1349f2 refactor: Change variable and function names based on PR comments
Change variable and function names based on PR comments
2025-01-28 10:10:29 +01:00
Igor Ilic
7dea1d54d7 refactor: Add specific max token values to embedding models 2025-01-23 18:18:45 +01:00
Igor Ilic
b686376c54 feat: Add gemini tokenizer to cognee 2025-01-23 17:55:04 +01:00
Igor Ilic
294ed1d960 feat: Add HuggingFace Tokenizer support 2025-01-23 16:52:35 +01:00
Igor Ilic
93249c72c5 fix: Initial commit to resolve issue with using tokenizer based on LLMs
TikToken is currently the default tokenizer, but it is only supported by OpenAI.
This is an initial commit in an attempt to add Cognee tokenizing support for multiple LLMs
2025-01-21 19:53:22 +01:00
alekszievr
6653d73556
Feat/cog 950 improve metric selection (#435)
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets.

* Load dataset file by filename, outsource utilities

* restructure metric selection

* Add comprehensiveness, diversity and empowerment metrics

* add promptfoo as an option

* refactor RAG solution in eval

* LLM as a judge metrics implemented in a uniform way

* Use requests.get instead of wget

* clean up promptfoo config template

* minor fixes

* get promptfoo path instead of hardcoding

* minor fixes

* Add LLM as a judge prompts

* Minor refactor and logger usage
2025-01-15 10:45:55 +01:00
hajdul88
16155f084f
Merge branch 'dev' into feature/cog-971-preparing-swe-bench-run 2025-01-10 13:42:40 +01:00
hajdul88
6177d04b44 feat: implements code retriever 2025-01-10 13:03:34 +01:00
hajdul88
fe57eb69e7
Merge branch 'dev' into feature/cog-967-adding-graph-completion-feature-to-cognee 2025-01-09 11:07:19 +01:00
hajdul88
d39140f28b feat: implements the first version of graph based completion in search 2025-01-08 16:10:29 +01:00
vasilije
41b1486cff Fix visualization 2025-01-08 13:13:52 +01:00
vasilije
1b96a71d5a Fix ollama, work on visualization 2025-01-06 19:09:58 +01:00
vasilije
76a0aa7e8b Fix linter issues 2025-01-05 19:48:35 +01:00
vasilije
649fcf2ba8 Fix linter issues 2025-01-05 19:21:09 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
hajdul88
c8a1f04b4c fix: updates the acreate_structured_output 2024-12-20 16:19:50 +01:00
Vasilije
ffb44529cc
Merge branch 'dev' into LANGFUSE_FIX 2024-12-18 19:07:13 +01:00
vasilije
c448dfb96d Fix langfuse 2024-12-18 19:01:29 +01:00
alekszievr
9afd0ece63
Structured code summarization (#375)
* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

* Structured code summarization

* add missing prompt file

* Remove summarization_model argument from summarize_code and fix typehinting

* minor refactors

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 13:05:47 +01:00
Igor Ilic
67585d0ab1 feat: Add simple instruction for system prompt
Add simple instruction for system prompt

Feature COG-656
2024-12-13 15:30:24 +01:00
Boris Arzentar
b89a4b8054 Merge remote-tracking branch 'origin/main' into code-graph 2024-12-03 21:14:19 +01:00
alekszievr
706101113a
feat/add correctness score calculation with LLM as a judge (#30) 2024-12-03 17:47:18 +01:00
Boris Arzentar
e07364fc25 Merge remote-tracking branch 'origin/main' into code-graph 2024-12-03 12:44:57 +01:00
Rita Aleksziev
f966f099fc Prompt renaming to more specific names. Minor code changes. 2024-12-02 12:18:00 +01:00
Vasilije
bbaf78f54e
Cog 669 implement dummy llm adapter (#37)
Adds the `DummyLLMAdapter(LLMInterface)` class for profiling of
large datasets without actual LLM calls in the top level
`profiling/util` location.

I also moved the `show_prompt` implementation from the child classes to
`LLMInterface`, since the implementations were identical.

I expanded the scope to also include a DummyEmbeddingEngine.
2024-11-30 17:02:49 +01:00
Rita Aleksziev
a4c56f118d Connect code graph pipeline + retriever + benchmarking 2024-11-29 15:24:49 +01:00
Leon Luithlen
a5ae9185cd Replicate PR 33 2024-11-29 11:40:51 +01:00
Leon Luithlen
5c9fd44680 Fix DummyLLMAdapter 2024-11-28 12:26:01 +01:00
Leon Luithlen
a2ff42332e DummyLLMAdapter WIP 2024-11-28 11:49:28 +01:00
Igor Ilic
204b5e9fe1 Merge branch 'main' of github.com:topoteretes/cognee-private into COG-502-backend-error-handling 2024-11-27 14:30:53 +01:00
Igor Ilic
ae568409a7 feat: Add custom exceptions to cognee lib
Added use of custom exceptions to cognee lib
2024-11-27 14:29:33 +01:00
Rita Aleksziev
f47b185a9e feat/add correctness score calculation with LLM as a judge 2024-11-27 10:53:48 +01:00
Boris
64b8aac86f
feat: code graph swe integration
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: hande-k <handekafkas7@gmail.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2024-11-27 09:32:29 +01:00
Igor Ilic
66c321f206 fix: Add fix for getting transcription of audio and image from LLMs
Enable getting of text from audio and image files from LLMs

Fix
2024-11-25 17:32:11 +01:00
Boris
d1f8217320
feat: COG-585 enable custom llm and embedding models 2024-11-22 10:26:21 +01:00
Rita Aleksziev
2948089806 Read patch generation instructions from file 2024-11-19 14:07:53 +01:00
0xideas
34e140a41d
Switch to gpt-4o-mini by default (#233)
* Switch to gpt-4o-mini by default

* Add option and make gpt-4o-mini default in frontend

* Run llama index notebook without extra arguments in poetry install

* Install extras for llama_index_notebook run
2024-11-18 17:38:54 +01:00
Boris
2f832b190c
fix: various fixes for the deployment
* fix: remove groups from UserRead model

* fix: add missing system dependencies for postgres

* fix: change vector db provider environment variable name

* fix: WeaviateAdapter retrieve bug

* fix: correctly return data point objects from retrieve method

* fix: align graph object properties

* feat: add node example
2024-10-22 11:26:48 +02:00
Boris
1eb4429c5c
feat: improve API request and response models and docs (#154)
* feat: improve API request and response models and docs
2024-10-14 13:38:36 +02:00
Boris
94a674a088
feat: split document reader from chunker (#131)
* fix: abstract chunking into a separate class

* fix: yield merged text from text chunker

* fix: split python version tests

* fix: change postgres live check

* fix: remove unnecessary code

* fix: update checkout action

* fix: update setup-python action

* fix: add PG_USER env variable

* fix: make sure relationship_name is used everywhere

* fix: remove duplicate import
2024-08-19 14:36:10 +02:00
Boris
26bca0184f
feat: add entity and entity type nodes to vector db (#126)
* feat: add entity and entity type nodes to vector db

* fix: use uuid5 as entity ids

* fix: id -> uuid and LanceDB collection model
2024-08-01 14:21:39 +02:00
Boris
14555a25d0
feat: pipelines and tasks (#119)
* feat: simple graph pipeline

* feat: implement incremental graph generation

* fix: various bug fixes

* fix: upgrade weaviate-client

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2024-07-20 16:49:00 +02:00
Vasilije
d0939b9b3b added updates to topology 2024-06-12 13:42:25 +02:00
Boris Arzentar
c9d9672fed fix: cognify status table update 2024-06-03 21:49:10 +02:00
Boris Arzentar
8499b7f2fc Merge remote-tracking branch 'origin/add_collab_fixes' into fix/sdk-and-config 2024-06-03 14:59:22 +02:00
Boris Arzentar
4fb3dc31a4 fix: enable sdk and fix config 2024-06-03 14:03:24 +02:00