Enforce shorter summaries with 8 sentence limit (#978)

* Enforce shorter summaries with 8 sentence limit

Replace the 250-word limit on node summaries with an 8-sentence limit to improve conciseness. Also update the system prompt for summarize_context to better reflect its dual purpose of generating summaries and attributes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update graphiti_core/prompts/summarize_nodes.py

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

* Bump version to 0.22.0pre1

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Update graphiti_core/prompts/summarize_nodes.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Daniel Chalef 2025-10-04 14:37:16 -07:00 committed by GitHub
parent 2864786dd9
commit 8a78633e2f
GPG key ID: B5690EEEBB952194
4 changed files with 61 additions and 53 deletions
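
The 8-sentence limit in this change is expressed only through prompt wording and field descriptions; the hunks shown here do not validate the model's output. As a rough illustration of what a post-hoc check could look like, here is a minimal sketch. The helper names `count_sentences` and `is_within_limit` are hypothetical and not part of graphiti_core, and the punctuation-based splitting is a deliberate simplification that will miscount abbreviations such as "Dr. Smith".

```python
import re

# Hypothetical helpers for illustration only; not part of graphiti_core.
# A sentence boundary is assumed to be '.', '!' or '?' followed by
# whitespace or the end of the text. Abbreviations will be miscounted.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")


def count_sentences(text: str) -> int:
    """Return a naive sentence count for `text`."""
    parts = _SENTENCE_END.split(text.strip())
    # Drop the empty fragment left after the final terminator.
    return len([p for p in parts if p.strip()])


def is_within_limit(summary: str, limit: int = 8) -> bool:
    """True when the summary has strictly fewer than `limit` sentences."""
    return count_sentences(summary) < limit
```

With the default `limit=8`, a seven-sentence summary passes and an eight-sentence one fails, which mirrors the prompts' "LESS THAN 8 SENTENCES" wording.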


@@ -23,39 +23,44 @@ from .prompt_helpers import to_prompt_json
 class ExtractedEntity(BaseModel):
-name: str = Field(..., description='Name of the extracted entity')
+name: str = Field(..., description="Name of the extracted entity")
 entity_type_id: int = Field(
-description='ID of the classified entity type. '
-'Must be one of the provided entity_type_id integers.',
+description="ID of the classified entity type. "
+"Must be one of the provided entity_type_id integers.",
 )
 class ExtractedEntities(BaseModel):
-extracted_entities: list[ExtractedEntity] = Field(..., description='List of extracted entities')
+extracted_entities: list[ExtractedEntity] = Field(
+..., description="List of extracted entities"
+)
 class MissedEntities(BaseModel):
-missed_entities: list[str] = Field(..., description="Names of entities that weren't extracted")
+missed_entities: list[str] = Field(
+..., description="Names of entities that weren't extracted"
+)
 class EntityClassificationTriple(BaseModel):
-uuid: str = Field(description='UUID of the entity')
-name: str = Field(description='Name of the entity')
+uuid: str = Field(description="UUID of the entity")
+name: str = Field(description="Name of the entity")
 entity_type: str | None = Field(
-default=None, description='Type of the entity. Must be one of the provided types or None'
+default=None,
+description="Type of the entity. Must be one of the provided types or None",
 )
 class EntityClassification(BaseModel):
 entity_classifications: list[EntityClassificationTriple] = Field(
-..., description='List of entities classification triples.'
+..., description="List of entities classification triples."
 )
 class EntitySummary(BaseModel):
 summary: str = Field(
 ...,
-description='Summary containing the important information about the entity. Under 250 words',
+description="Summary containing the important information about the entity. Under 8 sentences.",
 )
@@ -123,8 +128,8 @@ reference entities. Only extract distinct entities from the CURRENT MESSAGE. Don
 {context['custom_prompt']}
 """
 return [
-Message(role='system', content=sys_prompt),
-Message(role='user', content=user_prompt),
+Message(role="system", content=sys_prompt),
+Message(role="user", content=user_prompt),
 ]
@@ -156,8 +161,8 @@ Guidelines:
 3. Do NOT extract any properties that contain dates
 """
 return [
-Message(role='system', content=sys_prompt),
-Message(role='user', content=user_prompt),
+Message(role="system", content=sys_prompt),
+Message(role="user", content=user_prompt),
 ]
@@ -187,8 +192,8 @@ Guidelines:
 4. Be as explicit as possible in your node names, using full names and avoiding abbreviations.
 """
 return [
-Message(role='system', content=sys_prompt),
-Message(role='user', content=user_prompt),
+Message(role="system", content=sys_prompt),
+Message(role="user", content=user_prompt),
 ]
@@ -211,8 +216,8 @@ Given the above previous messages, current message, and list of extracted entiti
 extracted.
 """
 return [
-Message(role='system', content=sys_prompt),
-Message(role='user', content=user_prompt),
+Message(role="system", content=sys_prompt),
+Message(role="user", content=user_prompt),
 ]
@@ -243,19 +248,19 @@ def classify_nodes(context: dict[str, Any]) -> list[Message]:
 3. If none of the provided entity types accurately classify an extracted node, the type should be set to None
 """
 return [
-Message(role='system', content=sys_prompt),
-Message(role='user', content=user_prompt),
+Message(role="system", content=sys_prompt),
+Message(role="user", content=user_prompt),
 ]
 def extract_attributes(context: dict[str, Any]) -> list[Message]:
 return [
 Message(
-role='system',
-content='You are a helpful assistant that extracts entity properties from the provided text.',
+role="system",
+content="You are a helpful assistant that extracts entity properties from the provided text.",
 ),
 Message(
-role='user',
+role="user",
 content=f"""
 <MESSAGES>
@@ -281,11 +286,11 @@ def extract_attributes(context: dict[str, Any]) -> list[Message]:
 def extract_summary(context: dict[str, Any]) -> list[Message]:
 return [
 Message(
-role='system',
-content='You are a helpful assistant that extracts entity summaries from the provided text.',
+role="system",
+content="You are a helpful assistant that extracts entity summaries from the provided text.",
 ),
 Message(
-role='user',
+role="user",
 content=f"""
 <MESSAGES>
@@ -300,7 +305,7 @@ def extract_summary(context: dict[str, Any]) -> list[Message]:
 1. Do not hallucinate entity summary information if they cannot be found in the current context.
 2. Only use the provided MESSAGES and ENTITY to set attribute values.
 3. The summary attribute represents a summary of the ENTITY, and should be updated with new information about the Entity from the MESSAGES.
-Summaries must be no longer than 250 words.
+4. Keep the summary concise and to the point. SUMMARIES MUST BE LESS THAN 8 SENTENCES.
 <ENTITY>
 {context['node']}
versions: Versions = {
'extract_message': extract_message,
'extract_json': extract_json,
'extract_text': extract_text,
'reflexion': reflexion,
'extract_summary': extract_summary,
'classify_nodes': classify_nodes,
'extract_attributes': extract_attributes,
"extract_message": extract_message,
"extract_json": extract_json,
"extract_text": extract_text,
"reflexion": reflexion,
"extract_summary": extract_summary,
"classify_nodes": classify_nodes,
"extract_attributes": extract_attributes,
}


@@ -25,12 +25,14 @@ from .prompt_helpers import to_prompt_json
 class Summary(BaseModel):
 summary: str = Field(
 ...,
-description='Summary containing the important information about the entity. Under 250 words',
+description="Summary containing the important information about the entity. Under 8 sentences",
 )
 class SummaryDescription(BaseModel):
-description: str = Field(..., description='One sentence description of the provided summary')
+description: str = Field(
+..., description="One sentence description of the provided summary"
+)
 class Prompt(Protocol):
@@ -48,15 +50,15 @@ class Versions(TypedDict):
 def summarize_pair(context: dict[str, Any]) -> list[Message]:
 return [
 Message(
-role='system',
-content='You are a helpful assistant that combines summaries.',
+role="system",
+content="You are a helpful assistant that combines summaries.",
 ),
 Message(
-role='user',
+role="user",
 content=f"""
 Synthesize the information from the following two summaries into a single succinct summary.
-Summaries must be under 250 words.
+IMPORTANT: Keep the summary concise and to the point. SUMMARIES MUST BE LESS THAN 8 SENTENCES.
 Summaries:
 {to_prompt_json(context['node_summaries'], indent=2)}
@@ -68,11 +70,11 @@ def summarize_pair(context: dict[str, Any]) -> list[Message]:
 def summarize_context(context: dict[str, Any]) -> list[Message]:
 return [
 Message(
-role='system',
-content='You are a helpful assistant that extracts entity properties from the provided text.',
+role="system",
+content="You are a helpful assistant that generates a summary and attributes from provided text.",
 ),
 Message(
-role='user',
+role="user",
 content=f"""
 <MESSAGES>
@@ -82,7 +84,7 @@ def summarize_context(context: dict[str, Any]) -> list[Message]:
 Given the above MESSAGES and the following ENTITY name, create a summary for the ENTITY. Your summary must only use
 information from the provided MESSAGES. Your summary should also only contain information relevant to the
-provided ENTITY. Summaries must be under 250 words.
+provided ENTITY.
 In addition, extract any values for the provided entity properties based on their descriptions.
 If the value of the entity property cannot be found in the current context, set the value of the property to the Python value None.
@@ -90,6 +92,7 @@ def summarize_context(context: dict[str, Any]) -> list[Message]:
 Guidelines:
 1. Do not hallucinate entity property values if they cannot be found in the current context.
 2. Only use the provided messages, entity, and entity context to set attribute values.
+3. Keep the summary concise and to the point. SUMMARIES MUST BE LESS THAN 8 SENTENCES.
 <ENTITY>
 {context['node_name']}
@@ -110,14 +113,14 @@ def summarize_context(context: dict[str, Any]) -> list[Message]:
 def summary_description(context: dict[str, Any]) -> list[Message]:
 return [
 Message(
-role='system',
-content='You are a helpful assistant that describes provided contents in a single sentence.',
+role="system",
+content="You are a helpful assistant that describes provided contents in a single sentence.",
 ),
 Message(
-role='user',
+role="user",
 content=f"""
 Create a short one sentence description of the summary that explains what kind of information is summarized.
-Summaries must be under 250 words.
+Summaries must be under 8 sentences.
 Summary:
 {to_prompt_json(context['summary'], indent=2)}
@@ -127,7 +130,7 @@ def summary_description(context: dict[str, Any]) -> list[Message]:
 versions: Versions = {
-'summarize_pair': summarize_pair,
-'summarize_context': summarize_context,
-'summary_description': summary_description,
+"summarize_pair": summarize_pair,
+"summarize_context": summarize_context,
+"summary_description": summary_description,
 }


@@ -1,7 +1,7 @@
 [project]
 name = "graphiti-core"
 description = "A temporal graph building library"
-version = "0.22.0pre0"
+version = "0.22.0pre1"
 authors = [
 { name = "Paul Paliychuk", email = "paul@getzep.com" },
 { name = "Preston Rasmussen", email = "preston@getzep.com" },

uv.lock generated

@@ -783,7 +783,7 @@ wheels = [
 [[package]]
 name = "graphiti-core"
-version = "0.21.0"
+version = "0.22.0rc0"
 source = { editable = "." }
 dependencies = [
 { name = "diskcache" },