fix: Remove John Doe entity reference due to hallucination issues (#1939)
<!-- .github/pull_request_template.md -->
## Description
<!--
Please provide a clear, human-generated description of the changes in
this PR.
DO NOT use AI-generated descriptions. We want to understand your thought
process and reasoning.
-->
## Acceptance Criteria
<!--
* Key requirements to the new feature or modification;
* Proof that the changes work and meet the requirements;
* Include instructions on how to verify the changes. Describe how to
test it locally;
* Proof that it's sufficiently tested.
-->
## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):
## Screenshots/Videos (if applicable)
<!-- Add screenshots or videos to help explain your changes -->
## Pre-submission Checklist
<!-- Please check all boxes that apply before submitting your PR -->
- [ ] **I have tested my changes thoroughly before submitting this PR**
- [ ] **This PR contains minimal changes necessary to address the
issue/feature**
- [ ] My code follows the project's coding standards and style
guidelines
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have added necessary documentation (if applicable)
- [ ] All new and existing tests pass
- [ ] I have searched existing PRs to ensure this change hasn't been
submitted already
- [ ] I have linked any relevant issues in the description
- [ ] My commits have clear and descriptive messages
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Clarifies and tightens coreference resolution guidance across
knowledge-graph prompt templates.
>
> - Updates coreference rules to emphasize using the most complete,
human-readable identifiers consistently (`generate_graph_prompt*.txt`)
> - Tweaks examples, notably replacing the John Doe example with a
generic "X" case in the one-shot prompt
> - Minor wording/formatting cleanups; no code changes or logic
modifications
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
8499258272. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Refined entity resolution guidance in knowledge graph generation
prompts to use more generic instructions, improving flexibility and
consistency in how entities are identified throughout the system.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
commit f03ab671e6
5 changed files with 7 additions and 7 deletions
@@ -19,8 +19,8 @@ The aim is to achieve simplicity and clarity in the knowledge graph.
 - **Naming Convention**: Use snake_case for relationship names, e.g., `acted_in`.
 # 3. Coreference Resolution
 - **Maintain Entity Consistency**: When extracting entities, it's vital to ensure consistency.
-If an entity, such as "John Doe", is mentioned multiple times in the text but is referred to by different names or pronouns (e.g., "Joe", "he"),
-always use the most complete identifier for that entity throughout the knowledge graph. In this example, use "John Doe" as the Persons ID.
+If an entity, is mentioned multiple times in the text but is referred to by different names or pronouns,
+always use the most complete identifier for that entity throughout the knowledge graph.
 Remember, the knowledge graph should be coherent and easily understandable, so maintaining consistency in entity references is crucial.
 # 4. Strict Compliance
 Adhere to the rules strictly. Non-compliance will result in termination
@@ -22,7 +22,7 @@ You are an advanced algorithm designed to extract structured information to buil
 3. **Coreference Resolution**:
 - Maintain one consistent node ID for each real-world entity.
 - Resolve aliases, acronyms, and pronouns to the most complete form.
-- *Example*: Always use "John Doe" even if later referred to as "Doe" or "he".
+- *Example*: Always use full identifier even if later referred to as in a similar but slightly different way

 **Property & Data Guidelines**:

@@ -42,10 +42,10 @@ You are an advanced algorithm designed to extract structured information from un
 - **Rule**: Resolve all aliases, acronyms, and pronouns to one canonical identifier.

 > **One-Shot Example**:
-> **Input**: "John Doe is an author. Later, Doe published a book. He is well-known."
+> **Input**: "X is an author. Later, Doe published a book. He is well-known."
 > **Output Node**:
 > ```
-> John Doe (Person)
+> X (Person)
 > ```

 ---
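The coreference rule in these prompts (resolve aliases and pronouns to the most complete identifier) can be illustrated with a minimal Python sketch. This is not code from the repository: the function names, the pronoun list, and the "most words, then most characters" heuristic are all assumptions made for illustration, and the coreference clusters are assumed to be given as input.

```python
# Hypothetical helper, not part of this PR: picks one canonical identifier
# per coreference cluster (a list of mentions known to refer to one entity).
PRONOUNS = frozenset(
    {"he", "she", "it", "they", "him", "her", "them", "his", "hers", "its", "their"}
)


def most_complete_identifier(mentions):
    """Return the most complete non-pronoun mention in a cluster."""
    candidates = [m.strip() for m in mentions if m.strip().lower() not in PRONOUNS]
    if not candidates:
        raise ValueError("cluster contains only pronouns; no usable identifier")
    # Assumed heuristic: prefer more words, then more characters.
    return max(candidates, key=lambda m: (len(m.split()), len(m)))


def resolve_cluster(mentions):
    """Map every mention in one cluster to the cluster's canonical identifier."""
    canonical = most_complete_identifier(mentions)
    return {mention: canonical for mention in mentions}


if __name__ == "__main__":
    print(resolve_cluster(["Turing", "Alan Turing", "he"]))
```

The mapping is built per cluster on purpose: a pronoun such as "he" can belong to different entities in different clusters, so a single global mention-to-entity dictionary would be ambiguous.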
@@ -15,7 +15,7 @@ You are an advanced algorithm that extracts structured data into a knowledge gra
 - Properties are key-value pairs; do not use escaped quotes.

 3. **Coreference Resolution**
-- Use a single, complete identifier for each entity (e.g., always "John Doe" not "Joe" or "he").
+- Use a single, complete identifier for each entity

 4. **Relationship Labels**:
 - Use descriptive, lowercase, snake_case names for edges.
@@ -26,7 +26,7 @@ Use **basic atomic types** for node labels. Always prefer general types over spe
 - Good: "Alan Turing", "Google Inc.", "World War II"
 - Bad: "Entity_001", "1234", "he", "they"
 - Never use numeric or autogenerated IDs.
-- Prioritize **most complete form** of entity names for consistency (e.g., always use "John Doe" instead of "John" or "he").
+- Prioritize **most complete form** of entity names for consistency

 2. Dates, Numbers, and Properties
 ---------------------------------
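For reviewers who want to spot-check model output against the naming rules quoted in these prompts (human-readable node IDs, no numeric or autogenerated IDs, no pronouns, snake_case edge names), here is a rough Python sketch. It is an assumption-laden illustration, not part of this change: `is_valid_node_id`, `is_snake_case`, and the regular expressions are hypothetical, and the autogenerated-ID pattern is a heuristic that would also reject legitimate names shaped like a word followed by digits.

```python
import re

# Hypothetical checks, not part of this PR.
PRONOUNS = frozenset({"he", "she", "it", "they", "him", "her", "them"})
# Heuristic for autogenerated IDs such as "Entity_001": letters, optional
# separator, then digits. This is an assumption and can yield false positives.
AUTOGEN_ID = re.compile(r"[A-Za-z]+[_-]?\d+")
SNAKE_CASE = re.compile(r"[a-z]+(?:_[a-z]+)*")


def is_valid_node_id(node_id):
    """Reject empty, pronoun, purely numeric, and autogenerated-looking IDs."""
    text = node_id.strip()
    if not text or text.lower() in PRONOUNS:
        return False
    if text.isdigit():
        return False
    return AUTOGEN_ID.fullmatch(text) is None


def is_snake_case(label):
    """Check that a relationship label is lowercase snake_case, e.g. acted_in."""
    return SNAKE_CASE.fullmatch(label) is not None


if __name__ == "__main__":
    for candidate in ["Alan Turing", "Entity_001", "1234", "he"]:
        print(candidate, is_valid_node_id(candidate))
```

The "Good" and "Bad" examples from the prompt text make a natural smoke test for a checker like this.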