diff --git a/lightrag/evaluation/README.md b/lightrag/evaluation/README.md
index f36e2fa7..8848f29d 100644
--- a/lightrag/evaluation/README.md
+++ b/lightrag/evaluation/README.md
@@ -1,12 +1,8 @@
-# 📊 LightRAG Evaluation Framework
-
-RAGAS-based offline evaluation of your LightRAG system.
+# 📊 RAGAS-based Evaluation Framework
 
 ## What is RAGAS?
 
-**RAGAS** (Retrieval Augmented Generation Assessment) is a framework for reference-free evaluation of RAG systems using LLMs.
-
-Instead of requiring human-annotated ground truth, RAGAS uses state-of-the-art evaluation metrics:
+**RAGAS** (Retrieval Augmented Generation Assessment) is a framework for reference-free evaluation of RAG systems using LLMs. It uses the following evaluation metrics:
 
 ### Core Metrics
 
@@ -18,9 +14,7 @@ Instead of requiring human-annotated ground truth, RAGAS uses state-of-the-art e
 | **Context Precision** | Is retrieved context clean without irrelevant noise? | > 0.80 |
 | **RAGAS Score** | Overall quality metric (average of above) | > 0.80 |
 
----
-
-## 📁 Structure
+## 📁 LightRAG Evaluation Framework Directory Structure
 
 ```
 lightrag/evaluation/
@@ -42,7 +36,7 @@ lightrag/evaluation/
 
 **Quick Test:** Index files from `sample_documents/` into LightRAG, then run the evaluator to reproduce results (~89-100% RAGAS score per question).
 
----
+
 
 ## 🚀 Quick Start
 
@@ -55,7 +49,7 @@ pip install ragas datasets langfuse
 Or use your project dependencies (already included in pyproject.toml):
 
 ```bash
-pip install -e ".[offline-llm]"
+pip install -e ".[evaluation]"
 ```
 
 ### 2. Run Evaluation
@@ -102,7 +96,7 @@ results/
 - 📋 Individual test case results
 - 📈 Performance breakdown by question
 
----
+
 
 ## 📋 Command-Line Arguments
 
@@ -145,7 +139,7 @@ python lightrag/evaluation/eval_rag_quality.py -d /path/to/custom_dataset.json
 python lightrag/evaluation/eval_rag_quality.py --help
 ```
 
----
+
 
 ## ⚙️ Configuration
 
@@ -214,7 +208,7 @@ EVAL_LLM_TIMEOUT=180          # 3-minute timeout per request
 | **Rate limit errors (429)** | Increase `EVAL_LLM_MAX_RETRIES` and decrease `EVAL_MAX_CONCURRENT` |
 | **Request timeouts** | Increase `EVAL_LLM_TIMEOUT` to 180 or higher |
 
----
+
 
 ## 📝 Test Dataset
 
@@ -228,7 +222,7 @@ EVAL_LLM_TIMEOUT=180          # 3-minute timeout per request
     {
       "question": "Your question here",
      "ground_truth": "Expected answer from your data",
-      "context": "topic"
+      "project": "evaluation_project_name"
     }
   ]
 }
@@ -346,11 +340,10 @@ cd /path/to/LightRAG
 python lightrag/evaluation/eval_rag_quality.py
 ```
 
-### "LLM API errors during evaluation"
+### "LightRAG query API errors during evaluation"
 
 The evaluation uses your configured LLM (OpenAI by default). Ensure:
 - API keys are set in `.env`
-- Have sufficient API quota
 - Network connection is stable
 
 ### Evaluation requires running LightRAG API
 
@@ -360,15 +353,14 @@ The evaluator queries a running LightRAG API server at `http://localhost:9621`.
 2. Documents are indexed in your LightRAG instance
 3. API is accessible at the configured URL
 
----
+
 
 ## 📝 Next Steps
 
-1. Index sample documents into LightRAG (WebUI or API)
-2. Start LightRAG API server
+1. Start the LightRAG API server
+2. Upload sample documents into LightRAG through the WebUI
 3. Run `python lightrag/evaluation/eval_rag_quality.py`
 4. Review results (JSON/CSV) in `results/` folder
-5. Adjust entity extraction prompts or retrieval settings based on scores
 
 Evaluation Result Sample:
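
For readers who want to see what an evaluation pass like the one described in this README boils down to, here is a minimal hand-rolled sketch: fetch an answer and its retrieved context from the running LightRAG server for each test case, then score the results with RAGAS. The `/query` payload fields (`mode`, `only_need_context`), the `response` key, and the classic `ragas`/`datasets` column names are assumptions based on common LightRAG and RAGAS 0.1.x usage, not a transcript of `eval_rag_quality.py`.

```python
"""Illustrative sketch of an end-to-end RAGAS pass against a running LightRAG server.

Assumptions (not taken from eval_rag_quality.py): the API exposes POST /query
accepting {"query", "mode", "only_need_context"} and returning {"response": ...};
ragas 0.1.x-style evaluate() over a datasets.Dataset; OPENAI_API_KEY set for the
judge LLM.
"""
import requests
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_precision,
    context_recall,
    faithfulness,
)

LIGHTRAG_URL = "http://localhost:9621"  # the running LightRAG API server

# Same fields as the test dataset entries shown above.
test_cases = [
    {
        "question": "Your question here",
        "ground_truth": "Expected answer from your data",
        "project": "evaluation_project_name",
    },
]

rows = {"question": [], "answer": [], "contexts": [], "ground_truth": []}
for case in test_cases:
    # Generated answer from LightRAG (hybrid retrieval mode).
    answer = requests.post(
        f"{LIGHTRAG_URL}/query",
        json={"query": case["question"], "mode": "hybrid"},
        timeout=180,
    ).json()["response"]
    # Retrieved context only, so RAGAS can judge faithfulness and precision.
    context = requests.post(
        f"{LIGHTRAG_URL}/query",
        json={"query": case["question"], "mode": "hybrid", "only_need_context": True},
        timeout=180,
    ).json()["response"]
    rows["question"].append(case["question"])
    rows["answer"].append(answer)
    rows["contexts"].append([context])
    rows["ground_truth"].append(case["ground_truth"])

# Judge the answers; the average of these metrics forms the overall RAGAS score.
report = evaluate(
    Dataset.from_dict(rows),
    metrics=[faithfulness, answer_relevancy, context_recall, context_precision],
)
print(report)  # compare per-metric scores against the > 0.80 targets above
```

The bundled evaluator layers retries, concurrency limits, and JSON/CSV reporting on top of this core loop; the `EVAL_*` settings described above control those knobs.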