Optimize for OpenAI Prompt Caching: Restructure entity extraction prompts

- Remove input_text from entity_extraction_system_prompt to enable caching - Move input_text to entity_extraction_user_prompt for per-chunk variability - Update operate.py to format system prompt once without input_text - Format user prompts with input_text for each chunk This enables OpenAI's automatic prompt caching (50% discount on cached tokens): - ~1300 token system message cached and reused for ALL chunks - Only ~150 token user message varies per chunk - Expected 45% cost reduction on prompt tokens during indexing - 2-3x faster response times from cached prompts Fixes #2355
2025-11-26 21:56:25 +00:00 · 2025-11-26 21:56:25 +00:00 · 207af40f54
commit 207af40f54
parent 93d445dfdd
2 changed files with 9 additions and 5 deletions
--- a/lightrag/operate.py
+++ b/lightrag/operate.py
@ -2832,9 +2832,11 @@ async def extract_entities(
        cache_keys_collector = []

        # Get initial extraction
+        # Format system prompt once without input_text for OpenAI prompt caching
        entity_extraction_system_prompt = PROMPTS[
            "entity_extraction_system_prompt"
-        ].format(**{**context_base, "input_text": content})
+        ].format(**context_base)
+        # Format user prompts with input_text for each chunk
        entity_extraction_user_prompt = PROMPTS["entity_extraction_user_prompt"].format(
            **{**context_base, "input_text": content}
        )
--- a/lightrag/prompt.py
+++ b/lightrag/prompt.py
@ -62,14 +62,16 @@ You are a Knowledge Graph Specialist responsible for extracting entities and rel
 ---Real Data to be Processed---
 <Input>
 Entity_types: [{entity_types}]
+"""
+
+PROMPTS["entity_extraction_user_prompt"] = """---Task---
+Extract entities and relationships from the following input text.
+
+---Input Text---
 Text:
 ```
 {input_text}
 ```
-"""
-
-PROMPTS["entity_extraction_user_prompt"] = """---Task---
-Extract entities and relationships from the input text to be processed.

 ---Instructions---
 1.  **Strict Adherence to Format:** Strictly adhere to all format requirements for entity and relationship lists, including output order, field delimiters, and proper noun handling, as specified in the system prompt.