Compare commits
3 commits: main...codex/revi
| Author | SHA1 | Date |
|---|---|---|
| | ad384372a7 | |
| | 23511f3b5e | |
| | e40fe556d5 | |
2 changed files with 42 additions and 5653 deletions
@@ -92,12 +92,23 @@ def node(context: dict[str, Any]) -> list[Message]:

  TASK:
  1. Compare `new_entity` against each item in `existing_entities`.
- 2. If it refers to the same real‐world object or concept, collect its index.
- 3. Let `duplicate_idx` = the *first* collected index, or –1 if none.
- 4. Let `duplicates` = the list of *all* collected indices (empty list if none).
-
- Also return the full name of the NEW ENTITY (whether it is the name of the NEW ENTITY, a node it
- is a duplicate of, or a combination of the two).
+ 2. If it refers to the same real-world object or concept, collect its index.
+ 3. Let `duplicate_idx` = the smallest collected index, or -1 if none.
+ 4. Let `duplicates` = the sorted list of all collected indices (empty list if none).
+
+ Respond with a JSON object containing an "entity_resolutions" array with a single entry:
+ {{
+ "entity_resolutions": [
+ {{
+ "id": integer id from NEW ENTITY,
+ "name": the best full name for the entity,
+ "duplicate_idx": integer index of the best duplicate in EXISTING ENTITIES, or -1 if none,
+ "duplicates": sorted list of all duplicate indices you collected (deduplicate the list, use [] when none)
+ }}
+ ]
+ }}
+
+ Only reference indices that appear in EXISTING ENTITIES, and return [] / -1 when unsure.
  """,
  ),
  ]
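Steps 2 to 4 of the revised TASK reduce to a small, deterministic computation once the matching indices are known. The sketch below illustrates that computation only; `summarize_duplicates` is a hypothetical helper that does not appear in this diff, and it assumes the collected indices arrive as plain integers.

```python
from collections.abc import Iterable


def summarize_duplicates(collected: Iterable[int]) -> tuple[int, list[int]]:
    """Return (duplicate_idx, duplicates) as described by the revised TASK steps.

    Hypothetical helper for illustration; not part of the changed file.
    """
    # Step 4: all collected indices, deduplicated and sorted ascending.
    duplicates = sorted(set(collected))
    # Step 3: the smallest collected index, or -1 when nothing matched.
    duplicate_idx = duplicates[0] if duplicates else -1
    return duplicate_idx, duplicates


if __name__ == "__main__":
    print(summarize_duplicates([4, 1, 4]))  # (1, [1, 4])
    print(summarize_duplicates([]))         # (-1, [])
```

Compared with the old wording ("the *first* collected index"), the smallest-index rule does not depend on the order in which matches were found.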
@@ -126,26 +137,26 @@ def nodes(context: dict[str, Any]) -> list[Message]:
  {{
  id: integer id of the entity,
  name: "name of the entity",
- entity_type: "ontological classification of the entity",
- entity_type_description: "Description of what the entity type represents",
- duplication_candidates: [
- {{
- idx: integer index of the candidate entity,
- name: "name of the candidate entity",
- entity_type: "ontological classification of the candidate entity",
- ...<additional attributes>
- }}
- ]
+ entity_type: ["Entity", "<optional additional label>", ...],
+ entity_type_description: "Description of what the entity type represents"
  }}
-
+
  <ENTITIES>
  {to_prompt_json(context['extracted_nodes'], ensure_ascii=context.get('ensure_ascii', True), indent=2)}
  </ENTITIES>
-
+
  <EXISTING ENTITIES>
  {to_prompt_json(context['existing_nodes'], ensure_ascii=context.get('ensure_ascii', True), indent=2)}
  </EXISTING ENTITIES>

+ Each entry in EXISTING ENTITIES is an object with the following structure:
+ {{
+ idx: integer index of the candidate entity (use this when referencing a duplicate),
+ name: "name of the candidate entity",
+ entity_types: ["Entity", "<optional additional label>", ...],
+ ...<additional attributes such as summaries or metadata>
+ }}
+
  For each of the above ENTITIES, determine if the entity is a duplicate of any of the EXISTING ENTITIES.

  Entities should only be considered duplicates if they refer to the *same real-world object or concept*.
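The `idx` field documented for EXISTING ENTITIES can be read as the position of each candidate in the list handed to the prompt. A minimal sketch of that shape follows, assuming candidates are plain dicts with hypothetical `name`, `labels`, and `attributes` keys; the real node type and the code that builds `context['existing_nodes']` are not shown in this diff.

```python
from typing import Any


def build_existing_entities(nodes: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Shape candidate nodes like the EXISTING ENTITIES entries in the prompt.

    Hypothetical helper; the input keys are assumptions for illustration.
    """
    entries = []
    for idx, node in enumerate(nodes):
        entry = {
            "idx": idx,  # the value the model echoes back in duplicate_idx / duplicates
            "name": node["name"],
            "entity_types": node.get("labels", ["Entity"]),
        }
        # Additional attributes such as summaries or metadata ride along,
        # without overwriting the three documented keys.
        for key, value in node.get("attributes", {}).items():
            entry.setdefault(key, value)
        entries.append(entry)
    return entries
```

Keeping `idx` tied to list position means a returned index can be mapped back to a candidate with a plain list lookup.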
@@ -155,14 +166,19 @@ def nodes(context: dict[str, Any]) -> list[Message]:
  - They have similar names or purposes but refer to separate instances or concepts.

  Task:
- Your response will be a list called entity_resolutions which contains one entry for each entity.
-
- For each entity, return the id of the entity as id, the name of the entity as name, and the duplicate_idx
- as an integer.
-
- - If an entity is a duplicate of one of the EXISTING ENTITIES, return the idx of the candidate it is a
- duplicate of.
- - If an entity is not a duplicate of one of the EXISTING ENTITIES, return the -1 as the duplication_idx
+ Respond with a JSON object that contains an "entity_resolutions" array with one entry for each entity in ENTITIES, ordered by the entity id.
+
+ For every entity, return an object with the following keys:
+ {{
+ "id": integer id from ENTITIES,
+ "name": the best full name for the entity (preserve the original name unless a duplicate has a more complete name),
+ "duplicate_idx": the idx of the EXISTING ENTITY that is the best duplicate match, or -1 if there is no duplicate,
+ "duplicates": a sorted list of all idx values from EXISTING ENTITIES that refer to duplicates (deduplicate the list, use [] when none or unsure)
+ }}
+
+ - Only use idx values that appear in EXISTING ENTITIES.
+ - Set duplicate_idx to the smallest idx you collected for that entity, or -1 if duplicates is empty.
+ - Never fabricate entities or indices.
  """,
  ),
  ]
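The closing rules of the revised prompt (only use idx values that appear in EXISTING ENTITIES, keep `duplicates` sorted and deduplicated, set `duplicate_idx` to the smallest idx or -1) could also be re-checked on the caller's side after the model responds. The following is a sketch under that assumption; `normalize_resolutions` and its arguments are hypothetical and are not taken from the changed file.

```python
import json
from typing import Any


def normalize_resolutions(raw: str, valid_idxs: set[int]) -> list[dict[str, Any]]:
    """Parse a model response and re-apply the prompt's index rules.

    Hypothetical post-processing step, shown for illustration only.
    """
    payload = json.loads(raw)
    normalized = []
    for item in payload.get("entity_resolutions", []):
        # Gather every index the model mentioned, including duplicate_idx.
        mentioned = list(item.get("duplicates") or [])
        if item.get("duplicate_idx", -1) != -1:
            mentioned.append(item["duplicate_idx"])
        # Keep only genuine integers that exist in EXISTING ENTITIES.
        duplicates = sorted(
            {i for i in mentioned if type(i) is int and i in valid_idxs}
        )
        normalized.append(
            {
                "id": item["id"],
                "name": item["name"],
                # Smallest surviving idx, or -1 when duplicates is empty.
                "duplicate_idx": duplicates[0] if duplicates else -1,
                "duplicates": duplicates,
            }
        )
    # The prompt asks for entries ordered by entity id.
    return sorted(normalized, key=lambda r: r["id"])
```

A check like this treats the prompt rules as a request rather than a guarantee: an out-of-range index is dropped instead of being trusted.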
5627 poetry.lock (generated)
File diff suppressed because it is too large