resolve jon doe issue
parent 310e9e97ae
commit 8499258272
5 changed files with 7 additions and 7 deletions
@@ -19,8 +19,8 @@ The aim is to achieve simplicity and clarity in the knowledge graph.
 - **Naming Convention**: Use snake_case for relationship names, e.g., `acted_in`.
 # 3. Coreference Resolution
 - **Maintain Entity Consistency**: When extracting entities, it's vital to ensure consistency.
-If an entity, such as "John Doe", is mentioned multiple times in the text but is referred to by different names or pronouns (e.g., "Joe", "he"),
-always use the most complete identifier for that entity throughout the knowledge graph. In this example, use "John Doe" as the Persons ID.
+If an entity, is mentioned multiple times in the text but is referred to by different names or pronouns,
+always use the most complete identifier for that entity throughout the knowledge graph.
 Remember, the knowledge graph should be coherent and easily understandable, so maintaining consistency in entity references is crucial.
 # 4. Strict Compliance
 Adhere to the rules strictly. Non-compliance will result in termination

@@ -22,7 +22,7 @@ You are an advanced algorithm designed to extract structured information to buil
 3. **Coreference Resolution**:
 - Maintain one consistent node ID for each real-world entity.
 - Resolve aliases, acronyms, and pronouns to the most complete form.
-- *Example*: Always use "John Doe" even if later referred to as "Doe" or "he".
+- *Example*: Always use full identifier even if later referred to as in a similar but slightly different way
 
 **Property & Data Guidelines**:
 

@@ -42,10 +42,10 @@ You are an advanced algorithm designed to extract structured information from un
 - **Rule**: Resolve all aliases, acronyms, and pronouns to one canonical identifier.
 
 > **One-Shot Example**:
-> **Input**: "John Doe is an author. Later, Doe published a book. He is well-known."
+> **Input**: "X is an author. Later, Doe published a book. He is well-known."
 > **Output Node**:
 > ```
-> John Doe (Person)
+> X (Person)
 > ```
 
 ---

@@ -15,7 +15,7 @@ You are an advanced algorithm that extracts structured data into a knowledge gra
 - Properties are key-value pairs; do not use escaped quotes.
 
 3. **Coreference Resolution**
-- Use a single, complete identifier for each entity (e.g., always "John Doe" not "Joe" or "he").
+- Use a single, complete identifier for each entity
 
 4. **Relationship Labels**:
 - Use descriptive, lowercase, snake_case names for edges.

@@ -26,7 +26,7 @@ Use **basic atomic types** for node labels. Always prefer general types over spe
 - Good: "Alan Turing", "Google Inc.", "World War II"
 - Bad: "Entity_001", "1234", "he", "they"
 - Never use numeric or autogenerated IDs.
-- Prioritize **most complete form** of entity names for consistency (e.g., always use "John Doe" instead of "John" or "he").
+- Prioritize **most complete form** of entity names for consistency
 
 2. Dates, Numbers, and Properties
 ---------------------------------
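
The coreference rule these prompts describe (one canonical, most complete identifier per entity) can be illustrated with a minimal sketch. It is not part of this commit or of the repository's code. It assumes the mentions have already been grouped into coreference clusters by some upstream step, and it approximates "most complete" as "longest string". The names `build_alias_map` and `resolve` are hypothetical.

```python
def build_alias_map(clusters):
    """Map every mention to the canonical mention of its cluster.

    Assumption: "most complete identifier" is approximated by the
    longest string in the cluster. A real pipeline may need a better rule.
    """
    alias_map = {}
    for mentions in clusters:
        canonical = max(mentions, key=len)
        for mention in mentions:
            # Simplification: a pronoun such as "he" is assumed to refer
            # to a single entity within the text being processed.
            alias_map[mention] = canonical
    return alias_map


def resolve(mention, alias_map):
    """Return the canonical node ID, or the mention itself if unknown."""
    return alias_map.get(mention, mention)


# Hypothetical clusters, reusing the "Good" names from the prompt text.
clusters = [
    ["Turing", "Alan Turing", "he"],
    ["Google", "Google Inc."],
]
alias_map = build_alias_map(clusters)
print(resolve("he", alias_map))
print(resolve("Google", alias_map))
```

Keeping the alias map separate from the lookup means the same map can be applied to both node IDs and relationship endpoints, so every edge points at the canonical node.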