<!-- .github/pull_request_template.md -->
## Description
- Enable custom tasks in corpus building
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Introduced a configurable option to specify the task retrieval
strategy during corpus building.
- Enhanced the workflow with integrated task fetching, featuring a
default retrieval mechanism.
- Updated evaluation configuration to support customizable task
selection for more flexible operations.
- Added a new abstract base class for defining various task retrieval
strategies.
- Introduced a new enumeration to map task getter types to their
corresponding classes.
- **Dependencies**
- Added a new dependency for downloading files from Google Drive.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
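The abstract base class and the enumeration listed in the summary above could fit together roughly as follows. This is a minimal sketch, not the code from this PR: `BaseTaskGetter`, `DefaultTaskGetter`, `TaskGetters`, and `resolve_task_getter` are hypothetical names, and the placeholder tasks are plain string functions.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Callable, List


class BaseTaskGetter(ABC):
    """Hypothetical base class: each strategy returns the tasks to run."""

    @abstractmethod
    def get_tasks(self) -> List[Callable[..., Any]]:
        ...


class DefaultTaskGetter(BaseTaskGetter):
    """Hypothetical default strategy returning a fixed list of tasks."""

    def get_tasks(self) -> List[Callable[..., Any]]:
        # Placeholder tasks; a real pipeline would return its own steps.
        return [str.strip, str.lower]


class TaskGetters(Enum):
    """Maps a configuration string to the class implementing that strategy."""

    DEFAULT = ("Default", DefaultTaskGetter)

    def __init__(self, getter_name: str, getter_class: type):
        self.getter_name = getter_name
        self.getter_class = getter_class


def resolve_task_getter(name: str) -> BaseTaskGetter:
    """Look up a strategy by its configured name and instantiate it."""
    for member in TaskGetters:
        if member.getter_name == name:
            return member.getter_class()
    raise ValueError(f"Unknown task getter: {name}")
```

With this layout, adding a new retrieval strategy would mean one new subclass and one new enum member, while the configuration only stores a string.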
This PR contains the evaluation framework development for cognee.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Expanded evaluation framework now integrates asynchronous corpus
building, question answering, and performance evaluation with adaptive
benchmarks for improved metrics (correctness, exact match, and F1
score).
- **Infrastructure**
- Added database integration for persistent storage of questions,
answers, and metrics.
- Launched an interactive metrics dashboard featuring advanced
visualizations.
- Introduced an automated testing workflow for continuous quality
assurance.
- **Documentation**
- Updated guidelines for generating concise, clear answers.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
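The summary names exact match and F1 but does not say how they are computed. The sketch below uses the token-overlap definitions common in QA evaluation, which is an assumption here; the function names and the normalization steps (lowercasing, dropping punctuation and English articles) are illustrative, not taken from this PR.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Exact match is strict, so F1 is usually reported alongside it to give partial credit when an answer contains extra or missing tokens.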
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Graph visualizations now allow exporting to a user-specified file path
for more flexible output management.
- The text embedding process has been enhanced with an additional
tokenizer option for improved performance.
- A new `ExtendableDataPoint` class has been introduced for future
extensions.
- New JSON files for companies and individuals have been added to
facilitate testing and data processing.
- **Improvements**
- Search functionality now uses updated identifiers for more reliable
content retrieval.
- Metadata handling has been streamlined across various classes by
removing unnecessary type specifications.
- Enhanced serialization of properties in the Neo4j adapter for improved
handling of complex structures.
- The setup process for databases has been improved with a new
asynchronous setup function.
- **Chores**
- Dependency and configuration updates improve overall stability and
performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Description
Refactor search so query type doesn't need to be provided to make it
simpler for new users
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
- Improved the search interface by standardizing parameter usage with
explicit keyword arguments for specifying search types, enhancing
clarity and consistency.
- **Tests**
- Updated test cases and example integrations to align with the revised
search parameters, ensuring consistent behavior and reliable validation
of search outcomes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Dependency Update**
- Downgraded `mcp` package version from 1.2.0 to 1.1.3
- Updated `cognee` dependency to include additional features with
`cognee[codegraph]`
- **New Features**
- Introduced a new tool, "codify", for transforming codebases into
knowledge graphs
- Enhanced the existing "search" tool to accept a new parameter for
search type
- **Improvements**
- Streamlined search functionality with a new modular approach
- Added new asynchronous function for retrieving and formatting code
parts
- **Documentation**
- Updated import paths for `SearchType` in various modules and tests to
reflect structural changes
- **Code Cleanup**
- Removed legacy search module and associated classes/functions
- Refined data transfer object classes for consistency and clarity
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
## Description
PR to test Gemini PR from holchan
1. Add Gemini LLM and Gemini Embedding support
2. Fix CodeGraph issue with chunks being bigger than maximum token value
3. Add Tokenizer adapters to CodeGraph
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Added support for the Gemini LLM provider.
- Expanded LLM configuration options.
- Introduced a new GitHub Actions workflow for multimetric QA
evaluation.
- Added new environment variables for LLM and embedding configurations
across various workflows.
- **Bug Fixes**
- Improved error handling in various components.
- Updated tokenization and embedding processes.
- Removed warning related to missing `dict` method in data items.
- **Refactor**
- Simplified token extraction and decoding methods.
- Updated tokenizer interfaces.
- Removed deprecated dependencies.
- Enhanced retry logic and error handling in embedding processes.
- **Documentation**
- Updated configuration comments and settings.
- **Chores**
- Updated GitHub Actions workflows to accommodate new secrets and
environment variables.
- Modified evaluation parameters.
- Adjusted dependency management for optional libraries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: holchan <61059652+holchan@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* Ugly hack for multi-metric eval bug
* some cleanup
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* eval on random samples instead of first couple
* Save and load contexts and answers
* Fix random seed usage and handle empty descriptions
* include insights search in cognee option
* create output dir if it doesn't exist
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. JSON schema validation for datasets.
* Load dataset file by filename, outsource utilities
* restructure metric selection
* Add comprehensiveness, diversity and empowerment metrics
* add promptfoo as an option
* refactor RAG solution in eval
* LLM as a judge metrics implemented in a uniform way
* Use requests.get instead of wget
* clean up promptfoo config template
* minor fixes
* get promptfoo path instead of hardcoding
* minor fixes
* Add LLM as a judge prompts
* Support 4 different rag options in eval
* Minor refactor and logger usage
* feat: make tasks a configurable argument in the cognify function
* Run eval on a set of parameters and save results as json and png
* fix: add data points task
* script for running all param combinations
* enable context provider to get tasks as param
* bugfix in simple rag
* Incremental eval of cognee pipeline
* potential fix: single asyncio run
* temp fix: exclude insights
* Remove insights, have single asyncio run, refactor
* Include incremental eval in accepted paramsets
* minor fixes
* handle pipeline slices in utils
* Handle insights and customize search types
* Handle retrieved edges more safely
* bugfix
* fix simple rag
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. JSON schema validation for datasets.
* Load dataset file by filename, outsource utilities
* restructure metric selection
* Add comprehensiveness, diversity and empowerment metrics
* add promptfoo as an option
* refactor RAG solution in eval
* LLM as a judge metrics implemented in a uniform way
* Use requests.get instead of wget
* clean up promptfoo config template
* minor fixes
* get promptfoo path instead of hardcoding
* minor fixes
* Add LLM as a judge prompts
* Support 4 different rag options in eval
* Minor refactor and logger usage
* feat: make tasks a configurable argument in the cognify function
* Run eval on a set of parameters and save results as json and png
* fix: add data points task
* script for running all param combinations
* enable context provider to get tasks as param
* bugfix in simple rag
* Incremental eval of cognee pipeline
* potential fix: single asyncio run
* temp fix: exclude insights
* Remove insights, have single asyncio run, refactor
* minor fixes
* handle pipeline slices in utils
* include all options in params json
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
* QA eval dataset as argument, with hotpot and 2wikimultihop as options. JSON schema validation for datasets.
* Load dataset file by filename, outsource utilities
* restructure metric selection
* Add comprehensiveness, diversity and empowerment metrics
* add promptfoo as an option
* refactor RAG solution in eval
* LLM as a judge metrics implemented in a uniform way
* Use requests.get instead of wget
* clean up promptfoo config template
* minor fixes
* get promptfoo path instead of hardcoding
* minor fixes
* Add LLM as a judge prompts
* Support 4 different rag options in eval
* Minor refactor and logger usage
* Run eval on a set of parameters and save results as json and png
* script for running all param combinations
* bugfix in simple rag
* potential fix: single asyncio run
* temp fix: exclude insights
* Remove insights, have single asyncio run, refactor
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>