From ac33cf693dffa02caba2212e4c1aaac1888aeaf0 Mon Sep 17 00:00:00 2001
From: yangdx
Date: Tue, 19 Aug 2025 15:07:40 +0800
Subject: [PATCH] Refactor keyword extraction rules and remove overlap constraint
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

• Require content in both keyword categories
• Remove no-overlap rule between lists
• Simplify edge case handling
• Clarify source of truth requirement
---
 lightrag/prompt.py | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lightrag/prompt.py b/lightrag/prompt.py
index e47b91ee..32666bb5 100644
--- a/lightrag/prompt.py
+++ b/lightrag/prompt.py
@@ -246,10 +246,9 @@ Given a user query, your task is to extract two distinct types of keywords:
 
 ---Instructions & Constraints---
 1. **Output Format**: Your output MUST be a valid JSON object and nothing else. Do not include any explanatory text, markdown code fences (like ```json), or any other text before or after the JSON. It will be parsed directly by a JSON parser.
-2. **Source of Truth**: All keywords must be derived directly from or be a direct interpretation of the user query.
+2. **Source of Truth**: All keywords must be explicitly derived from the user query, with both high-level and low-level keyword categories required to contain content.
 3. **Concise & Meaningful**: Keywords should be concise words or meaningful phrases. Prioritize multi-word phrases when they represent a single concept. For example, from "latest financial report of Apple Inc.", you should extract "latest financial report" and "Apple Inc." rather than "latest", "financial", "report", and "Apple".
-4. **No Overlap**: A keyword or its core concept should not appear in both the high-level and low-level lists.
-5. **Handle Edge Cases**: For queries that are too simple, vague, or nonsensical (e.g., "hello", "ok", "asdfghjkl"), you must return a JSON object with empty lists for both keyword types.
+4. **Handle Edge Cases**: For queries that are too simple, vague, or nonsensical (e.g., "hello", "ok", "asdfghjkl"), you must return a JSON object with empty lists for both keyword types.
 
 ---Examples---
 {examples}
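
Reviewer note: a caller that consumes replies produced under these rules might check them roughly as follows. This is only a sketch and not code from this patch. The key names `high_level_keywords` and `low_level_keywords` are assumptions, since the hunk does not show the JSON schema, and the helpers `parse_keyword_response` and `is_edge_case` are hypothetical names introduced for illustration.

```python
import json

# Assumed key names; the hunk above does not show the actual JSON schema.
HIGH_KEY = "high_level_keywords"
LOW_KEY = "low_level_keywords"


def parse_keyword_response(raw: str) -> tuple[list[str], list[str]]:
    """Parse a keyword-extraction reply and return (high_level, low_level)."""
    # Rule 1: the reply is fed straight to a JSON parser, so code fences or
    # surrounding text raise json.JSONDecodeError (a ValueError subclass).
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    lists = []
    for key in (HIGH_KEY, LOW_KEY):
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(k, str) for k in value):
            raise ValueError(f"{key!r} must be a list of strings")
        # Drop blank entries and trim whitespace around each keyword.
        lists.append([k.strip() for k in value if k.strip()])
    return lists[0], lists[1]


def is_edge_case(high: list[str], low: list[str]) -> bool:
    """True when both lists are empty, the reply rule 4 requests for trivial queries."""
    return not high and not low
```

With the no-overlap rule removed, the sketch deliberately does not reject a keyword that appears in both lists. It also accepts two empty lists, because rule 4 still asks for them on trivial queries even though rule 2 now requires both categories to contain content.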