cognee/evals/old/comparative_eval/README.md
Vasilije f65605b575
fix: Feature/cog 2648 evals update (#1221)
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Hande <159312713+hande-k@users.noreply.github.com>
2025-08-08 20:23:09 +02:00

# Comparative QA Benchmarks
Independent benchmarks for different QA/RAG systems using the HotpotQA dataset.
## Dataset Files
- `hotpot_50_corpus.json` - 50 instances from HotpotQA
- `hotpot_50_qa_pairs.json` - Corresponding question-answer pairs
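
Before running a benchmark, it can help to confirm that both dataset files are present and parse as JSON. The following is a minimal sketch, not part of the benchmark scripts: the helper names `load_json` and `count_entries` are hypothetical, and it makes no assumption about the internal schema of either file beyond it being valid JSON.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file and return the parsed object."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def count_entries(data):
    """Return the number of top-level entries for a list or dict, else None."""
    if isinstance(data, (list, dict)):
        return len(data)
    return None


if __name__ == "__main__":
    # File names are taken from the list above; the schema is not assumed.
    for name in ("hotpot_50_corpus.json", "hotpot_50_qa_pairs.json"):
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found in the current directory")
            continue
        print(f"{name}: {count_entries(load_json(path))} top-level entries")
```

Run it from the directory that contains the two files. If the top-level count differs from what you expect, inspect the file structure before drawing conclusions, since the entries may be nested.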
## Benchmarks
Each benchmark runs independently once its own dependencies are installed:
### Mem0
```bash
pip install mem0ai openai
python qa_benchmark_mem0.py
```
### LightRAG
```bash
pip install "lightrag-hku[api]"
python qa_benchmark_lightrag.py
```
### Graphiti
```bash
pip install graphiti-core
python qa_benchmark_graphiti.py
```
## Environment
Create `.env` with required API keys:
- `OPENAI_API_KEY` (all benchmarks)
- `NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD` (Graphiti only)
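
A quick pre-flight check can catch a missing key before a long benchmark run starts. The sketch below is an illustration only: the dictionary labels (`mem0`, `lightrag`, `graphiti`) and the `missing_keys` helper are hypothetical names, and it reads the process environment rather than parsing `.env` itself, so it assumes the variables have already been exported or loaded.

```python
import os

# Variables required per benchmark, as listed above.
# The dictionary labels are illustrative, not names used by the scripts.
REQUIRED_KEYS = {
    "mem0": ["OPENAI_API_KEY"],
    "lightrag": ["OPENAI_API_KEY"],
    "graphiti": ["OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD"],
}


def missing_keys(benchmark, env=None):
    """Return the required variables that are unset or empty for a benchmark."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS[benchmark] if not env.get(key)]


if __name__ == "__main__":
    for benchmark in REQUIRED_KEYS:
        missing = missing_keys(benchmark)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{benchmark}: {status}")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.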
## Usage
Each benchmark inherits from the `QABenchmarkRAG` base class and can be configured independently.
## Results
Updated results will be posted soon.