## Description
This PR tests and integrates the Gemini support contributed by holchan:
1. Add Gemini LLM and Gemini embedding support.
2. Fix a CodeGraph issue where chunks exceeded the maximum token limit (see the sketch below).
3. Add tokenizer adapters to CodeGraph.
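
For context on items 2 and 3, here is a minimal sketch of how a tokenizer adapter can be used to split chunks that exceed the token limit. The `Tokenizer` protocol, the `split_oversized_chunk` helper, and the `MAX_TOKENS` value are illustrative assumptions, not the actual CodeGraph API:

```python
# Illustrative sketch only: protocol and helper names are hypothetical,
# not cognee's actual interfaces.
from typing import List, Protocol


class Tokenizer(Protocol):
    def encode(self, text: str) -> List[int]: ...
    def decode(self, tokens: List[int]) -> str: ...


MAX_TOKENS = 8192  # assumed per-request token limit of the embedding model


def split_oversized_chunk(chunk: str, tokenizer: Tokenizer) -> List[str]:
    """Split a chunk into pieces that each fit within MAX_TOKENS."""
    tokens = tokenizer.encode(chunk)
    if len(tokens) <= MAX_TOKENS:
        return [chunk]
    return [
        tokenizer.decode(tokens[i : i + MAX_TOKENS])
        for i in range(0, len(tokens), MAX_TOKENS)
    ]
```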
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
  - Added support for the Gemini LLM provider.
  - Expanded LLM configuration options.
  - Introduced a new GitHub Actions workflow for multi-metric QA evaluation.
  - Added new environment variables for LLM and embedding configurations across various workflows (illustrated in the sketch below).
- **Bug Fixes**
  - Improved error handling in various components.
  - Updated tokenization and embedding processes.
  - Removed a warning related to the missing `dict` method in data items.
- **Refactor**
  - Simplified token extraction and decoding methods.
  - Updated tokenizer interfaces.
  - Removed deprecated dependencies.
  - Enhanced retry logic and error handling in embedding processes.
- **Documentation**
  - Updated configuration comments and settings.
- **Chores**
  - Updated GitHub Actions workflows to accommodate new secrets and environment variables.
  - Modified evaluation parameters.
  - Adjusted dependency management for optional libraries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
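
To make the configuration notes above concrete, here is a hedged sketch of pointing cognee at Gemini via environment variables. Every variable name below is an assumption modeled on cognee's provider-style settings, not confirmed by this PR's diff; consult the project's environment template for the authoritative keys.

```python
# Hypothetical configuration sketch: variable names and model ids are
# assumptions, not confirmed by this PR.
import os

os.environ["LLM_PROVIDER"] = "gemini"
os.environ["LLM_MODEL"] = "gemini/gemini-1.5-flash"  # assumed model id format
os.environ["LLM_API_KEY"] = "<your-gemini-api-key>"

os.environ["EMBEDDING_PROVIDER"] = "gemini"
os.environ["EMBEDDING_MODEL"] = "gemini/text-embedding-004"  # assumed
```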
---------
Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* Ugly hack for multi-metric eval bug
* some cleanup
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
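
The first two commits above concern the configurable `tasks` argument to `cognify`. A hedged sketch of how such an argument might be used follows; the `Task` import path and the custom step are assumptions, not this PR's exact API.

```python
# Hedged sketch of cognify's configurable `tasks` argument. The import
# path and signature are assumptions drawn from cognee's pipeline layout.
import asyncio

import cognee
from cognee.modules.pipelines.tasks.Task import Task  # assumed import path


async def annotate(data_points):
    # hypothetical custom step; a real task would transform the data points
    return data_points


async def main():
    await cognee.add("Example document text.")
    # Pass a custom task list; omitting `tasks` presumably keeps the default pipeline.
    await cognee.cognify(tasks=[Task(annotate)])


asyncio.run(main())
```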
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* eval on random samples instead of the first couple
* Save and load contexts and answers
* Fix random seed usage and handle empty descriptions
* include insights search in cognee option
* create output dir if it doesn't exist
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
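
Two of the commits above switch the eval to random samples with a fixed seed. A minimal sketch of that pattern follows; the function and data shapes are illustrative, not the eval script's actual code.

```python
# Minimal sketch of reproducible random sampling for QA eval.
import random


def sample_questions(corpus: list, num_samples: int, seed: int = 42) -> list:
    """Draw a seeded random sample instead of taking the first N items."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    return rng.sample(corpus, min(num_samples, len(corpus)))


corpus = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(100)]
subset = sample_questions(corpus, num_samples=10)  # same subset on every run
```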
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. JSON schema validation for datasets.
* Load dataset file by filename, outsource utilities
* restructure metric selection
* Add comprehensiveness, diversity and empowerment metrics
* add promptfoo as an option
* refactor RAG solution in eval
* LLM as a judge metrics implemented in a uniform way
* Use requests.get instead of wget
* clean up promptfoo config template
* minor fixes
* get promptfoo path instead of hardcoding
* minor fixes
* Add LLM as a judge prompts
* Support 4 different rag options in eval
* Minor refactor and logger usage
* feat: make tasks a configurable argument in the cognify function
* Run eval on a set of parameters and save results as json and png
* fix: add data points task
* script for running all param combinations
* enable context provider to get tasks as param
* bugfix in simple rag
* Incremental eval of cognee pipeline
* potential fix: single asyncio run
* temp fix: exclude insights
* Remove insights, have single asyncio run, refactor
* Include incremental eval in accepted paramsets
* minor fixes
* handle pipeline slices in utils
* Handle insights and customize search types
* Handle retrieved edges more safely
* bugfix
* fix simple rag
* include all options in params json
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
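
Two recurring changes in the commits above are the wget-to-`requests.get` swap and JSON schema validation of datasets. A sketch combining both follows; the schema, URL, and paths are placeholders, not the eval suite's actual code.

```python
# Sketch of the requests.get download plus JSON schema validation
# described in the commits above; schema and paths are placeholders.
import json
from pathlib import Path

import requests
from jsonschema import validate  # pip install jsonschema


QA_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "required": ["question", "answer"],
        "properties": {
            "question": {"type": "string"},
            "answer": {"type": "string"},
        },
    },
}


def download_dataset(url: str, out_path: Path) -> list:
    out_path.parent.mkdir(parents=True, exist_ok=True)  # create output dir if it doesn't exist
    response = requests.get(url, timeout=60)  # replaces the old wget call
    response.raise_for_status()
    out_path.write_bytes(response.content)
    data = json.loads(out_path.read_text())
    validate(instance=data, schema=QA_SCHEMA)  # reject malformed datasets early
    return data
```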