Commit graph

1 commit

Author: Nate Shumway
SHA1: 0f10fb6bdd

Improve semantic equivalence detection in edge deduplication

Enhanced the edge deduplication prompts to better recognize semantically
equivalent facts that use different phrasings:

- Self-referential relationships ("X is a sub-agency of X" = "X is its own sub-agency")
- Active vs passive voice ("A awarded contract to B" = "B received contract from A")
- Numeric format equivalence ($1M = $1,000,000)
- Entity aliases (DoD = Department of Defense)

Added integration tests that verify the LLM correctly identifies semantic
duplicates with the improved prompts.
Date: 2025-12-09 19:01:05 -06:00
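
The commit handles these equivalences through LLM prompts, and neither the prompts nor the tests are shown here. Two of the listed categories, numeric formats and entity aliases, can also be illustrated deterministically. The following is only a minimal sketch under stated assumptions, not the repository's implementation: `normalize_fact`, `ALIASES`, and the suffix table are hypothetical names introduced for illustration, and the alias table contains only the one example the message mentions. Voice changes and self-referential phrasings are not covered; those are the cases the message leaves to the LLM.

```python
import re

# Hypothetical alias table; the commit message only names DoD as an example.
ALIASES = {"dod": "department of defense"}

# Assumed magnitude suffixes for dollar amounts (k, M, B).
_SUFFIX = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}

# Matches amounts such as $1M, $2.5k, or $1,000,000.
_AMOUNT = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)([kmb])?\b", re.IGNORECASE)


def _expand_amount(match):
    """Rewrite a matched dollar amount as a plain integer, e.g. $1M -> $1000000."""
    number = float(match.group(1).replace(",", ""))
    suffix = match.group(2)
    if suffix:
        number *= _SUFFIX[suffix.lower()]
    return f"${int(round(number))}"


def normalize_fact(text):
    """Lowercase a fact, expand known aliases, and canonicalize dollar amounts."""
    text = _AMOUNT.sub(_expand_amount, text)
    words = [ALIASES.get(word.lower(), word.lower()) for word in text.split()]
    return " ".join(words)


if __name__ == "__main__":
    a = normalize_fact("DoD awarded $1M")
    b = normalize_fact("Department of Defense awarded $1,000,000")
    print(a == b)
```

A check like this could serve as a cheap pre-filter or as a sanity baseline next to the LLM-based integration tests, since it only catches surface-level variants and says nothing about meaning.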