Improve entity extraction prompt clarity and ensure the LLM outputs content only

2025-09-14 17:50:56 +08:00
commit 92f8fc6fbf
parent 4dafec8884
1 changed file with 11 additions and 10 deletions
--- a/lightrag/prompt.py
+++ b/lightrag/prompt.py
@@ -74,23 +74,24 @@ PROMPTS["entity_extraction_user_prompt"] = """---Task---
 Extract entities and relationships from the input text to be processed.
 ---Instructions---
-1. Adhere strictly to the format requirements for entity and relationship lists as specified in the system prompts.
+1.  **Strict Adherence to Format:** Strictly adhere to all format requirements for entity and relationship lists, including output order, field delimiters, and proper noun handling, as specified in the system prompt.
-2. Output `{completion_delimiter}` only after all relevant entities and relationships have been extracted.
+2.  **Output Content Only:** Output *only* the extracted list of entities and relationships. Do not include any introductory or concluding remarks, explanations, or additional text before or after the list.
-3. Ensure the output language is {language}. Proper nouns (e.g., personal names, place names, organization names) must be kept in their original language and not translated.
+3.  **Completion Signal:** Output `{completion_delimiter}` as the final line after all relevant entities and relationships have been extracted and presented.
 <Output>
 """
 PROMPTS["entity_continue_extraction_user_prompt"] = """---Task---
-Identify any missed entities or relationships from the input text to be processed based on the last extraction task.
+Based on the last extraction task, identify and extract any **missed or incorrectly formatted** entities and relationships from the input text.
 ---Instructions---
-1. Adhere strictly to the format requirements for entity and relationship lists as specified in the system prompts.
+1.  **Strict Adherence to System Format:** Strictly adhere to all format requirements for entity and relationship lists, including output order, field delimiters, and proper noun handling, as specified in the system instructions.
-2. Do not include entities and relationships that were correctly extracted in the last extraction task.
+2.  **Focus on Corrections/Additions:**
-3. If an entity or relationship output was truncated or had missing fields in the last extraction task, please re-output it in the correct format.
+    *   **Do NOT** re-output entities and relationships that were **correctly and fully** extracted in the last task.
-4. Output each entity and relationship on a single line; use `{tuple_delimiter}` as the field separator within each extracted item.
+    *   If an entity or relationship was **missed** in the last task, extract and output it now according to the system format.
-5. Output `{completion_delimiter}` only after all relevant entities and relationships have been extracted.
+    *   If an entity or relationship was **truncated, had missing fields, or was otherwise incorrectly formatted** in the last task, re-output the *corrected and complete* version in the specified format.
-6. Ensure the output language is {language}. Proper nouns (e.g., personal names, place names, organization names) may in their original language if proper translation is not available.
+3.  **Output Content Only:** Output *only* the extracted list of entities and relationships. Do not include any introductory or concluding remarks, explanations, or additional text before or after the list.
+4.  **Completion Signal:** Output `{completion_delimiter}` as the final line after all relevant missing or corrected entities and relationships have been extracted and presented.
 <Output>
 """