
80 commits

Author SHA1 Message Date
yangdx
06db511f3b Remove angle brackets from entity and relationship output formats 2025-09-09 09:21:23 +08:00
yangdx
d218f15a62 Refactor entity extraction with system prompts and output limits
- Add system/user prompt separation
- Set max tokens to fix endless output
- Improve extraction error logging
- Update cache type from extract to summary
2025-09-08 15:20:45 +08:00
yangdx
725db3b240 Fix linting in the prompt 2025-09-06 11:16:49 +08:00
yangdx
219a08b7c9 Restore completion_delimiter 2025-09-06 11:13:37 +08:00
yangdx
528d04a0e4 Update prompt template delimiters 2025-09-06 10:35:06 +08:00
yangdx
5446815008 Refactor entity extraction prompts and remove completion delimiter.
- Remove `completion_delimiter` from prompts
- Update input/output format markers
2025-09-06 09:13:51 +08:00
yangdx
be3f0ebbe5 Simplify entity extraction prompt instructions and remove delimiter 2025-09-04 23:42:11 +08:00
yangdx
3f56c6820c Reorder language and completion delimiter instructions in prompt 2025-09-04 23:05:16 +08:00
yangdx
50adf64fab Fix linting in prompt 2025-09-04 15:22:36 +08:00
yangdx
94114df995 Improve prompt clarity and structure 2025-09-04 14:53:27 +08:00
yangdx
7b35657e32 Refactor entity extraction prompt formatting and clarity
- Remove quotes from tuple format strings
- Simplify relationship extraction text
- Add relationships to quality guidelines
2025-09-04 10:47:57 +08:00
yangdx
78abb397bf Reorder entity types and add Document type to extraction 2025-09-03 12:44:40 +08:00
yangdx
95c08cc7dc Improve entity extraction prompt clarity by replacing pronouns with specific nouns 2025-09-03 12:35:52 +08:00
yangdx
c86f863fa4 feat: optimize entity extraction for smaller LLMs
Simplify entity relationship extraction process to improve compatibility
and performance with smaller, less capable language models.

Changes:
- Remove iterative gleaning loop with LLM-based continuation decisions
- Simplify to single gleaning pass when entity_extract_max_gleaning > 0
- Streamline entity extraction prompts with clearer instructions
- Add explicit completion delimiter signals in all examples
2025-09-03 10:33:01 +08:00
yangdx
29f0ecc88c Refactor entity extraction prompts and remove completion delimiter
• Update prompt structure and wording
• Remove deprecated completion delimiter
• Add quality guidelines section
• Improve instruction clarity
• Enhance continue extraction prompt
2025-09-02 02:14:14 +08:00
yangdx
692357fbf3 Add conflict resolution instruction to entity summarization prompt
- Add conflict handling step
- Handle entities with same name
- Separate then consolidate summaries
2025-09-01 08:51:19 +08:00
yangdx
ec059d1b5d Fix typo and clarify delimiter formatting in relationship extraction prompt
- Fix "feild" → "field" typo
- Clarify delimiter spacing rules
2025-09-01 00:42:59 +08:00
yangdx
4e751e0653 refac: Enhance extraction with improved prompts and parser
-   **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`
-   **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)
2025-08-31 22:24:11 +08:00
yangdx
ff0a18e08c Unify the implementation of SUMMARY_LANGUAGE and ENTITY_TYPES 2025-08-27 12:23:22 +08:00
Thibo Rosemplatt
c3aabfc251 Merge branch 'main' into entityTypesServerSupport 2025-08-26 21:48:20 +02:00
yangdx
e0a755e42c Refactor prompt instructions to emphasize depth and completeness 2025-08-26 18:28:57 +08:00
yangdx
01a2c79f29 Standardize prompt formatting and section headers across templates
- Remove hash delimiters
- Consistent section headers
- Add "Output:" labels
- Clean up example formatting
2025-08-26 14:42:52 +08:00
yangdx
6bcfe696ee feat: add output length recommendation and description type to LLM summary
- Add SUMMARY_LENGTH_RECOMMENDED parameter (600 tokens)
- Optimize prompt template for LLM summary
2025-08-26 14:41:12 +08:00
Thibo Rosemplatt
d054ec5d00 Added entity_types as a user defined variable (via .env) 2025-08-23 20:16:11 +02:00
yangdx
950221db59 Refactor keyword extraction rules and remove overlap constraint
• Require content in both keyword categories
• Remove no-overlap rule between lists
• Simplify edge case handling
• Clarify source of truth requirement
2025-08-19 15:12:15 +08:00
yangdx
92c0ad0076 Fix linting 2025-08-19 00:45:29 +08:00
yangdx
23334e7e51 Update prompt.py 2025-08-19 00:29:33 +08:00
yangdx
2a7fec2873 Optimize keyword extraction prompt and remove conversation history from keyword extraction.
- Remove history context processing
- Update prompt to focus on single query
- Clarify high/low level keyword types
- Improve JSON output instructions
- Add edge case handling guidance
2025-08-18 23:35:04 +08:00
yangdx
8d7a7e4ad6 Refactor prompt templates with improved guidelines and citation formats 2025-08-18 19:14:32 +08:00
yangdx
9c4e98ec3b Unify entity extraction prompt between passes
- Disallow hallucinated info in descriptions
- Align reminder steps with main extraction
2025-07-27 23:06:55 +08:00
Daniel.y
4eef9f3778 Merge pull request #1845 from AkosLukacs/patch-2
Better prompt for entity description extraction to avoid hallucinations
2025-07-27 22:38:08 +08:00
yangdx
f2d051eea5 Fix: Improve keyword extraction prompt for robust JSON output.
*   Emphasize strict JSON output in keyword extraction prompt
*   Clean up prompt examples in keyword extraction prompt
*   Log raw LLM response on JSON error
2025-07-27 21:10:47 +08:00
yangdx
cf1ca39b3f Refine entity continuation prompt to avoid duplicates.
- Clarify finding missing entities
- Instruct not to repeat extractions
2025-07-27 10:48:29 +08:00
Ákos Lukács
75beaf249e Better prompt for entity description extraction to avoid hallucinations 2025-07-22 20:40:46 +02:00
zrguo
681d43bb32 fix typo 2025-07-22 15:34:51 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
yangdx
b92f9b9453 Optimizing query prompt 2025-05-08 12:53:28 +08:00
yangdx
2bafc87a80 Add comment for deprecated PROMPT template 2025-05-08 09:40:38 +08:00
yangdx
ae1c9f8d10 Add user_prompt to QueryParam 2025-05-08 03:38:47 +08:00
yangdx
474b77c43e Remove deprecated mix_rag_response prompt template 2025-05-07 18:11:35 +08:00
yangdx
1c5bbe396a Optimize prompt template for naive query 2025-05-07 18:11:12 +08:00
Chen Yuwen
8bf78f0823 Update prompt.py
A missing ‘)’ in PROMPTS["entity_continue_extraction"] led some small models to misread the prompt and fail to respond correctly.
2025-04-21 16:04:19 +08:00
zrguo
87fbffde14 fix citation 2025-03-28 13:30:24 +08:00
zrguo
bf18a5406e add citation 2025-03-17 23:32:35 +08:00
Zhenya Zhu
37754f14b5 force keywords_extraction output as JSON 2025-03-11 11:54:30 +08:00
zrguo
c936aaf5c8 fix linting 2025-03-09 01:29:21 +08:00
zrguo
595d8bf372 Update prompt.py 2025-03-09 01:25:15 +08:00
zrguo
548f9a8234 Update prompts 2025-03-09 01:21:39 +08:00
Zhichun Wu
d79a9d7acc consistent format 2025-02-26 23:04:21 +08:00
Yannick Stephan
8958046b74 cleaned code 2025-02-19 22:07:25 +01:00