cognee

1922 commits 174 branches 76 tags 146 MiB

Author	SHA1	Message	Date
alekszievr	3494521cae	Support 4 different rag options in eval (#439) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage	2025-01-15 15:34:13 +01:00
alekszievr	6653d73556	Feat/cog 950 improve metric selection (#435) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Minor refactor and logger usage	2025-01-15 10:45:55 +01:00
alekszievr	a4ad1702ed	Feat/cog 946 abstract eval dataset (#418) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * Use requests.get instead of wget	2025-01-14 11:33:55 +01:00
vasilije	60c8fd103b	ruff format	2025-01-05 19:09:08 +01:00
alekszievr	4f2745504c	Calculate official hotpot EM and F1 scores (#292)	2024-12-10 19:16:12 +01:00

Renamed from evals/llm_as_a_judge.py

5 commits