Leon Luithlen
|
84c98f16bb
|
Remove chunk_index attribute from chunk_by_sentence return value
|
2024-11-14 16:49:13 +01:00 |
|
Leon Luithlen
|
15420dd864
|
Fix paragraph_ids handling
|
2024-11-14 16:47:51 +01:00 |
|
Leon Luithlen
|
7cf8c74cf9
|
Merge latest main
|
2024-11-14 15:05:57 +01:00 |
|
Leon Luithlen
|
d6a6a9eaba
|
Return sentence_cut instead of word in chunk_by_paragraph
|
2024-11-14 15:03:09 +01:00 |
|
Vasilije
|
535d8281b4
|
Merge pull request #215 from topoteretes/clean_dspy
Remove dspy logic that confuses
|
2024-11-14 14:51:51 +01:00 |
|
Vasilije
|
bc2e17592d
|
Merge branch 'main' into clean_dspy
|
2024-11-14 14:50:43 +01:00 |
|
Vasilije
|
36ada5974d
|
Delete cognee/modules/cognify/dataset.py
|
2024-11-14 14:49:45 +01:00 |
|
Vasilije
|
8e9040815f
|
Delete cognee/modules/cognify/train.py
|
2024-11-14 14:49:34 +01:00 |
|
Vasilije
|
cf09a5ea37
|
Delete cognee/modules/cognify/test.py
|
2024-11-14 14:49:23 +01:00 |
|
Vasilije
|
c5d132ed14
|
Delete cognee/modules/cognify/evaluate.py
|
2024-11-14 14:49:08 +01:00 |
|
hajdul88
|
c1007091d1
|
Merge pull request #196 from topoteretes/feat/COG-553-graph-memory-projection
Feat/cog 553 graph memory projection
|
2024-11-14 14:48:41 +01:00 |
|
0xideas
|
8b681529b1
|
Update cognee/tasks/chunks/chunk_by_paragraph.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2024-11-14 14:42:15 +01:00 |
|
Leon Luithlen
|
73f24f9e4d
|
Fix sentence_cut return value in inappropriate places
|
2024-11-14 14:40:42 +01:00 |
|
Leon Luithlen
|
b4d509e682
|
Set batch_paragraph=False in run_chunking_test
|
2024-11-14 14:23:09 +01:00 |
|
Leon Luithlen
|
a52d3ac6ba
|
Change document test ground truth values for new chunk_by_word
|
2024-11-14 14:20:18 +01:00 |
|
Leon Luithlen
|
eaf9167fa1
|
Change chunk_by_word to collect newlines in prior words
|
2024-11-14 14:19:34 +01:00 |
|
hajdul88
|
867e18de86
|
fix: Changes GraphDBInterface typing in CogneeGraph
|
2024-11-14 14:01:20 +01:00 |
|
Leon Luithlen
|
57d8149732
|
Save paragraph_ids in chunk_by_paragraph
|
2024-11-14 13:59:54 +01:00 |
|
Leon Luithlen
|
6721eaee83
|
Fix chunk_index bug in chunk_by_paragraph
|
2024-11-14 13:50:40 +01:00 |
|
0xideas
|
f2206a09c0
|
Update cognee/tasks/chunks/chunk_by_word.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2024-11-14 13:16:17 +01:00 |
|
Leon Luithlen
|
8260647497
|
Add AudioDocument and ImageDocument tests
|
2024-11-14 12:42:10 +01:00 |
|
Leon Luithlen
|
f87fd12e9b
|
Fix lambda bug in AudioDocument and ImageDocument
|
2024-11-14 12:41:47 +01:00 |
|
Leon Luithlen
|
8b3b2f8156
|
Add transcribe_image and create_transcript methods
|
2024-11-14 11:59:46 +01:00 |
|
hajdul88
|
32504255ef
|
feat: Adds unit tests to CogneeGraph class
|
2024-11-14 11:46:17 +01:00 |
|
hajdul88
|
b516862edc
|
Fix: Fixes import paths
|
2024-11-14 11:44:43 +01:00 |
|
Leon Luithlen
|
c905510f30
|
Change test_input order
|
2024-11-14 11:44:18 +01:00 |
|
Leon Luithlen
|
e6636754ff
|
Add TextDocument_test.py
|
2024-11-14 11:39:14 +01:00 |
|
Leon Luithlen
|
8afb25e0d4
|
Move PdfDocument_test.py to integration tests
|
2024-11-14 11:24:11 +01:00 |
|
Leon Luithlen
|
e794bb8834
|
Return stripped value from get_embeddable_data if it's a string
|
2024-11-14 09:43:56 +01:00 |
|
Leon Luithlen
|
adc8a0b09c
|
Add ellipsis test string
|
2024-11-14 09:43:32 +01:00 |
|
Leon Luithlen
|
d90698305b
|
Simplify chunk_by_word
|
2024-11-14 09:43:10 +01:00 |
|
hajdul88
|
d3fdddaa52
|
Revert "Checks the pgvector test issue"
This reverts commit 0d27371467.
|
2024-11-13 17:55:52 +01:00 |
|
hajdul88
|
0d27371467
|
Checks the pgvector test issue
|
2024-11-13 17:51:25 +01:00 |
|
hajdul88
|
d8024db002
|
fix: Fixes edge case handling
|
2024-11-13 17:18:07 +01:00 |
|
hajdul88
|
bf4eedd20e
|
Merge branch 'main' into feat/COG-553-graph-memory-projection
|
2024-11-13 16:45:13 +01:00 |
|
hajdul88
|
8e3a991dd0
|
feat: implements DB projection to memory
|
2024-11-13 16:38:57 +01:00 |
|
Leon Luithlen
|
45a60b7f19
|
Remove assert and move is_real_paragraph_end outside loop
|
2024-11-13 16:35:47 +01:00 |
|
hajdul88
|
68bfb87f3a
|
feat: Extends graph elements with new features
|
2024-11-13 16:34:36 +01:00 |
|
Leon Luithlen
|
b787407db7
|
Add more adversarial examples
|
2024-11-13 16:23:14 +01:00 |
|
Leon Luithlen
|
fdec9a692e
|
Test maximum_length parameter of chunk_by_sentence
|
2024-11-13 16:03:06 +01:00 |
|
Leon Luithlen
|
9ea2634480
|
Replace word_count with maximum_length in if clause
|
2024-11-13 15:53:44 +01:00 |
|
Leon Luithlen
|
9b2fb09c59
|
Fix PdfDocument test, give chunk_by_sentence a maximum_length arg
|
2024-11-13 15:39:17 +01:00 |
|
Leon Luithlen
|
1b4a7e4fdc
|
Adapt chunk_by_paragraph_test.py
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
f8e5b529c3
|
Add maximum_length argument to chunk_sentences
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
ef7a19043d
|
Adapt chunk_by_paragraph test parametrization
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
92a66dddb9
|
Autoformat chunking tests
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
ce498d97dd
|
Refactor chunk_by_paragraph to be isomorphic
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
ab55a73d18
|
Adapt chunk_by_sentence to isomorphic chunk_by_word
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
c054e897a3
|
Make chunk_by_word isomorphic
|
2024-11-13 15:35:03 +01:00 |
|
Leon Luithlen
|
830c6710e0
|
Fix chunk_by_word_test
|
2024-11-13 15:35:02 +01:00 |
|