cognee

Author	SHA1	Message	Date
Boris Arzentar	3320bc8f2c	feat: add codegraph related API endpoints	2025-01-28 10:08:59 +01:00
alekszievr	4e3a666b33	Feat: Save and load contexts and answers for eval (#462) * feat: make tasks a configurable argument in the cognify function * fix: add data points task * eval on random samples instead of first couple * Save and load contexts and answers * Fix random seed usage and handle empty descriptions * include insights search in cognee option * create output dir if doesnt exist --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-01-22 16:17:01 +01:00
alekszievr	75bc7f67eb	feat: Add incremental eval option to paramset (#446 ) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval;2C * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * feat: make tasks a configurable argument in the cognify function * Run eval on a set of parameters and save results as json and png * fix: add data points task * script for running all param combinations * enable context provider to get tasks as param * bugfix in simple rag * Incremental eval of cognee pipeline * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor * Include incremental eval in accepted paramsets * minor fixes * handle pipeline slices in utils * Handle insights and customize search types * Handle retrieved edges more safely * bugfix * fix simple rag --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-01-17 18:04:31 +01:00
alekszievr	2e010f8dd1	Incremental eval of cognee pipeline (#445 ) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval;2C * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * feat: make tasks a configurable argument in the cognify function * Run eval on a set of parameters and save results as json and png * fix: add data points task * script for running all param combinations * enable context provider to get tasks as param * bugfix in simple rag * Incremental eval of cognee pipeline * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor * minor fixes * handle pipeline slices in utils * include all options in params json --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-01-17 14:16:48 +01:00
alekszievr	8ec1e48ff6	Run eval on a set of parameters and save them as png and json (#443 ) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval;2C * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * Run eval on a set of parameters and save results as json and png * script for running all param combinations * bugfix in simple rag * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-01-17 00:18:51 +01:00
alekszievr	3494521cae	Support 4 different rag options in eval (#439) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval;2C * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage	2025-01-15 15:34:13 +01:00
alekszievr	6653d73556	Feat/cog 950 improve metric selection (#435) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval;2C * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Minor refactor and logger usage	2025-01-15 10:45:55 +01:00
alekszievr	a4ad1702ed	Feat/cog 946 abstract eval dataset (#418) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * Use requests.get instead of wget	2025-01-14 11:33:55 +01:00
hajdul88	e2ad54d88e	Fix: deleting incorrect repo path	2025-01-10 15:54:45 +01:00
hajdul88	6177d04b44	feat: implements code retreiver	2025-01-10 13:03:34 +01:00
hajdul88	9604d95ba5	feat: adds basic retriever for swe bench	2025-01-09 19:54:58 +01:00
Rita Aleksziev	18bb282fbc	Adjust SWE-bench script to code graph pipeline call	2025-01-09 14:52:02 +01:00
vasilije	76a0aa7e8b	Fix linter issues	2025-01-05 19:48:35 +01:00
vasilije	6dafe73a6b	Fix linter issues	2025-01-05 19:24:55 +01:00
vasilije	649fcf2ba8	Fix linter issues	2025-01-05 19:21:09 +01:00
vasilije	60c8fd103b	ruff format	2025-01-05 19:09:08 +01:00
lxobr	da5e3ab24d	COG 870 Remove duplicate edges from the code graph (#293) * feat: turn summarize_code into generator * feat: extract run_code_graph_pipeline, update the pipeline * feat: minimal code graph example * refactor: update argument * refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline * refactor: indentation and whitespace nits * refactor: add deprecated use comments and warnings --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2024-12-17 12:02:25 +01:00
alekszievr	4f2745504c	Calculate official hotpot EM and F1 scores (#292)	2024-12-10 19:16:12 +01:00
Boris	348610e73c	fix: refactor get_graph_from_model to return nodes and edges correctly (#257) * fix: handle rate limit error coming from llm model * fix: fixes lost edges and nodes in get_graph_from_model * fix: fixes database pruning issue in pgvector (#261) * fix: cognee_demo notebook pipeline is not saving summaries --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2024-12-06 12:52:01 +01:00
Boris Arzentar	d49ab4c3b5	feat: update code-graph notebook	2024-12-03 23:48:12 +01:00
Boris Arzentar	b89a4b8054	Merge remote-tracking branch 'origin/main' into code-graph	2024-12-03 21:14:19 +01:00
Rita Aleksziev	a0d5102bd8	add some spaces for readability	2024-12-03 17:22:23 +01:00
Rita Aleksziev	0fbb50960b	prompt renaming	2024-12-03 15:59:03 +01:00
Rita Aleksziev	dc082de4c2	minor bugfix in folder creation	2024-12-02 14:54:40 +01:00
Rita Aleksziev	f966f099fc	Prompt renaming to more specific names. Minor code changes.	2024-12-02 12:18:00 +01:00
Boris Arzentar	11acabdb6a	fix: remove duplicate nodes and edges before saving; Fix FalkorDB vector index;	2024-12-02 10:10:18 +01:00
Rita Aleksziev	a4c56f118d	Connect code graph pipeline + retriever + benchmarking	2024-11-29 15:24:49 +01:00
Rita Aleksziev	4da1657140	merge changes from code-graph	2024-11-29 12:16:36 +01:00
Rita Aleksziev	8f241fa6c5	convert edge to string	2024-11-29 12:05:52 +01:00
Leon Luithlen	a5ae9185cd	Replicate PR 33	2024-11-29 11:40:51 +01:00
Leon Luithlen	d9fc740ec0	Fix merge conflicts	2024-11-29 11:33:05 +01:00
Leon Luithlen	b46af5a6f6	Update eval_swe_bench	2024-11-29 11:31:03 +01:00
Leon Luithlen	618d476c30	Add code formating to usermod command	2024-11-29 11:30:39 +01:00
Leon Luithlen	5036f3a85f	Add -y to setup_ubuntu_instance.sh commands and update EC2_README	2024-11-29 11:30:39 +01:00
Leon Luithlen	1bfa3a0ea3	Rebase onto code-graph	2024-11-29 11:30:30 +01:00
Rita Aleksziev	996b3a658b	add custom metric implementation	2024-11-28 16:53:33 +01:00
Rita Aleksziev	8edfe7c5a4	feat/connect code graph pipeline to benchmarking	2024-11-28 16:52:54 +01:00
Boris Arzentar	2408fd7a01	fix: falkordb adapter errors	2024-11-28 09:12:37 +01:00
Rita Aleksziev	4aa634d5e1	Eval function takes eval_metric as input. Works with deepeval metrics like AnswerRelevancyMetric	2024-11-27 16:14:05 +01:00
Rita Aleksziev	f47b185a9e	feat/add correctness score calculation with LLM as a judge	2024-11-27 10:53:48 +01:00
Boris	64b8aac86f	feat: code graph swe integration Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com> Co-authored-by: hande-k <handekafkas7@gmail.com> Co-authored-by: Igor Ilic <igorilic03@gmail.com> Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>	2024-11-27 09:32:29 +01:00
Rita Aleksziev	e1d8f3ea86	use acreate_structured_output instead of create_structured_output in eval script	2024-11-20 16:02:15 +01:00
Rita Aleksziev	2948089806	Read patch generation instructions from file	2024-11-19 14:07:53 +01:00
Rita Aleksziev	838d98238a	Code cleanup	2024-11-19 13:54:04 +01:00
Rita Aleksziev	d986e7c981	minor code cleanup	2024-11-18 15:59:18 +01:00
Rita Aleksziev	98e3445c2c	running swebench evaluation as subprocess	2024-11-18 15:12:36 +01:00
Rita Aleksziev	ed08cdb9f9	using the code graph pipeline instead of cognify	2024-11-15 17:56:19 +01:00
Rita Aleksziev	721fde3d60	generating testspecs for data	2024-11-15 17:14:43 +01:00
Rita Aleksziev	094ba7233e	Running inference with and without cognee	2024-11-14 16:28:03 +01:00
Rita Aleksziev	aa95aa21af	downloading example repo for eval	2024-11-12 17:40:42 +01:00

61 commits