cognee/evals/eval_framework/benchmark_adapters/twowikimultihop_adapter.py
alekszievr 17231de5d0
Test: Parse context pieces separately in MusiqueQAAdapter and adjust tests [cog-1234] (#561)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.


## Summary by CodeRabbit

- **Tests**
  - Updated evaluation checks by removing assertions related to the
    relationship between `corpus_list` and `qa_pairs`, now focusing solely
    on `qa_pairs` limits.

- **Refactor**
  - Improved content processing to append each paragraph individually to
    `corpus_list`, enhancing clarity in data structure.
  - Simplified type annotations in the `load_corpus` method across
    multiple adapters, ensuring consistency in return types.

- **Chores**
  - Updated dependency installation commands in GitHub Actions workflows
    for Python 3.10, 3.11, and 3.12 to include additional evaluation-related
    dependencies.
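
The test and refactor notes above are connected: once each paragraph is appended on its own, `corpus_list` can grow faster than `qa_pairs`, so a one-to-one length check between the two no longer holds. A minimal sketch of that effect, assuming a hypothetical record shape with `paragraphs`, `question`, and `answer` keys (the field names, the `build_corpus` helper, and the data are illustrative assumptions, not taken from the adapters):

```python
# Hypothetical records; field names and values are illustrative assumptions.
records = [
    {
        "paragraphs": ["Paris is the capital of France.", "France is in Europe."],
        "question": "Which continent is Paris in?",
        "answer": "Europe",
    },
    {
        "paragraphs": ["The Nile flows through Egypt."],
        "question": "Which country does the Nile flow through?",
        "answer": "Egypt",
    },
]


def build_corpus(records):
    """Append every paragraph separately, and one QA pair per record."""
    corpus_list, qa_pairs = [], []
    for record in records:
        for paragraph in record["paragraphs"]:
            corpus_list.append(paragraph)
        qa_pairs.append({"question": record["question"], "answer": record["answer"]})
    return corpus_list, qa_pairs


corpus_list, qa_pairs = build_corpus(records)
# Three paragraphs but only two QA pairs, so the lengths legitimately differ.
```

Under this assumption, a test can still check that `len(qa_pairs)` respects the requested limit, but it should not assume any fixed ratio between `corpus_list` and `qa_pairs`.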

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-02-20 14:23:53 +01:00

49 lines
1.6 KiB
Python

import requests
import os
import json
import random
from typing import Optional, Any

from evals.eval_framework.benchmark_adapters.base_benchmark_adapter import BaseBenchmarkAdapter


class TwoWikiMultihopAdapter(BaseBenchmarkAdapter):
    dataset_info = {
        "filename": "2wikimultihop_dev.json",
        "URL": "https://huggingface.co/datasets/voidful/2WikiMultihopQA/resolve/main/dev.json",
    }

    def load_corpus(
        self, limit: Optional[int] = None, seed: int = 42
    ) -> tuple[list[str], list[dict[str, Any]]]:
        filename = self.dataset_info["filename"]

        # Reuse the locally cached dataset if present; otherwise download and cache it.
        if os.path.exists(filename):
            with open(filename, "r", encoding="utf-8") as f:
                corpus_json = json.load(f)
        else:
            response = requests.get(self.dataset_info["URL"])
            response.raise_for_status()
            corpus_json = response.json()
            with open(filename, "w", encoding="utf-8") as f:
                json.dump(corpus_json, f, ensure_ascii=False, indent=4)

        # Draw a reproducible random subset when limit is positive and below the dataset size.
        if limit is not None and 0 < limit < len(corpus_json):
            random.seed(seed)
            corpus_json = random.sample(corpus_json, limit)

        corpus_list = []
        question_answer_pairs = []
        # Loop variable is named `item` so it does not shadow the built-in `dict`.
        for item in corpus_json:
            # Each context entry is a (title, sentences) pair; add one corpus entry per paragraph.
            for _title, sentences in item["context"]:
                corpus_list.append(" ".join(sentences))

            question_answer_pairs.append(
                {
                    "question": item["question"],
                    "answer": item["answer"].lower(),
                    "type": item["type"],
                }
            )

        return corpus_list, question_answer_pairs
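
The sampling and flattening steps of `load_corpus` can be tried without the download or the `BaseBenchmarkAdapter` dependency. The sketch below is a hypothetical standalone mirror of that logic: the `flatten_records` function name and the sample records are assumptions made up for illustration, and only the record shape (`context` as a list of `[title, sentences]` pairs, plus `question`, `answer`, and `type`) follows the adapter code.

```python
import random
from typing import Any, Optional


def flatten_records(
    records: list[dict[str, Any]], limit: Optional[int] = None, seed: int = 42
) -> tuple[list[str], list[dict[str, Any]]]:
    """Hypothetical helper mirroring the sampling and flattening in load_corpus."""
    # Sample only when limit is positive and smaller than the dataset.
    if limit is not None and 0 < limit < len(records):
        random.seed(seed)
        records = random.sample(records, limit)

    corpus_list: list[str] = []
    qa_pairs: list[dict[str, Any]] = []
    for record in records:
        # One corpus entry per context paragraph; the title is not used.
        for _title, sentences in record["context"]:
            corpus_list.append(" ".join(sentences))
        qa_pairs.append(
            {
                "question": record["question"],
                "answer": record["answer"].lower(),
                "type": record["type"],
            }
        )
    return corpus_list, qa_pairs


# Made-up records in the shape the adapter reads; not real dataset entries.
sample = [
    {
        "context": [
            ["Film A", ["Film A is a 1990 film.", "It was directed by Jane Doe."]],
            ["Jane Doe", ["Jane Doe was born in Oslo."]],
        ],
        "question": "Where was the director of Film A born?",
        "answer": "Oslo",
        "type": "compositional",
    },
    {
        "context": [["River B", ["River B flows through Country C."]]],
        "question": "Which country does River B flow through?",
        "answer": "Country C",
        "type": "compositional",
    },
]

corpus_list, qa_pairs = flatten_records(sample)
subset_corpus, subset_qa = flatten_records(sample, limit=1, seed=42)
```

With the two sample records, the full run yields three corpus entries and two question-answer pairs, while `limit=1` yields a single pair whose choice is fixed by `seed`. A `limit` of zero, a negative value, or a value at or above the dataset size leaves the data unsampled, matching the guard in the adapter.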