Commit graph

1312 commits

Author SHA1 Message Date
hajdul88
198f71b9be
feat: Implements multiprocessing for get_repo_file_dependencies task (#43) 2024-12-01 11:51:04 +01:00
Vasilije
bbaf78f54e
Cog 669 implement dummy llm adapter (#37)
Adds the `DummyLLMAdapter(LLMInterface)` class for profiling
large datasets without actual LLM calls, in the top-level
`profiling/util` location.

I also moved the `show_prompt` implementation from the child classes to
`LLMInterface`, since the implementations were identical.

I expanded the scope to also include a DummyEmbeddingEngine.
2024-11-30 17:02:49 +01:00
Vasilije
4d02560f1c
Cog 519 develop metadata storage integration (#35)
@borisarzentar this PR is ready; all checks pass in the
"sister" MR targeting main:
https://github.com/topoteretes/cognee-private/pull/26
2024-11-30 17:02:18 +01:00
Vasilije
57754b3ca0
Connect pipeline to benchmark (#42)
evals/eval_swe_bench runs the code graph pipeline, adds retrieval to the
end, then connects the whole thing with swe-bench

Some unnecessary utility functions were removed.

Note: the pipeline is called for a "graphrag" folder as an example, due
to bugs in the pipeline.
2024-11-29 17:05:37 +01:00
Rita Aleksziev
a4c56f118d Connect code graph pipeline + retriever + benchmarking 2024-11-29 15:24:49 +01:00
Leon Luithlen
bc82430fb5 Merge latest COG-519 2024-11-29 14:36:03 +01:00
Rita Aleksziev
4da1657140 merge changes from code-graph 2024-11-29 12:16:36 +01:00
Rita Aleksziev
8f241fa6c5 convert edge to string 2024-11-29 12:05:52 +01:00
0xideas
56673d360c
Cog 692 run swe bench on ec2 (#25)
Mainly a tutorial and some small improvements to the evaluation code
itself
2024-11-29 11:50:21 +01:00
Leon Luithlen
a5ae9185cd Replicate PR 33 2024-11-29 11:40:51 +01:00
Leon Luithlen
d9fc740ec0 Fix merge conflicts 2024-11-29 11:33:05 +01:00
Leon Luithlen
b46af5a6f6 Update eval_swe_bench 2024-11-29 11:31:03 +01:00
Leon Luithlen
618d476c30 Add code formatting to usermod command 2024-11-29 11:30:39 +01:00
Leon Luithlen
5036f3a85f Add -y to setup_ubuntu_instance.sh commands and update EC2_README 2024-11-29 11:30:39 +01:00
Leon Luithlen
1bfa3a0ea3 Rebase onto code-graph 2024-11-29 11:30:30 +01:00
Rita Aleksziev
8edfe7c5a4 feat/connect code graph pipeline to benchmarking 2024-11-28 16:52:54 +01:00
Leon Luithlen
3e1949d895 Remove unnecessary nesting in embed_text and add DummyEmbeddingEngine 2024-11-28 15:42:20 +01:00
Leon Luithlen
5c9fd44680 Fix DummyLLMAdapter 2024-11-28 12:26:01 +01:00
hajdul88
6339295d6b
Deleting old files that are duplicated due to the different branches (#36) 2024-11-28 12:21:51 +01:00
hajdul88
72a8bc43a1 Deleting code_graph_pipeline not working entrypoint
From now on eval_swe_bench contains and runs the updated version of the pipeline
2024-11-28 12:19:08 +01:00
hajdul88
c094898d15 fix: deletes duplicated retriever instances 2024-11-28 12:12:36 +01:00
Leon Luithlen
a2ff42332e DummyLLMAdapter WIP 2024-11-28 11:49:28 +01:00
Leon Luithlen
d4e77636b5 Revert spaces around args 2024-11-28 09:18:49 +01:00
Leon Luithlen
15802237e9 Get metadata from metadata table 2024-11-28 09:18:49 +01:00
Leon Luithlen
cd0e505ac0 WIP 2024-11-28 09:18:49 +01:00
Leon Luithlen
1679c746a3 Move class and functions to data.models 2024-11-28 09:18:49 +01:00
Leon Luithlen
3d5cb7644a Pass DocumentChunk metadata_id to _metadata field 2024-11-28 09:18:49 +01:00
Leon Luithlen
aacba555c9 Remove passing of metadata_id to DocumentChunk 2024-11-28 09:18:49 +01:00
Leon Luithlen
80517f5117 Revert README 2024-11-28 09:18:49 +01:00
Leon Luithlen
159985b501 Remove line in README 2024-11-28 09:18:49 +01:00
Leon Luithlen
9e93ea0794 Make save_data_item_with_metadata_to_storage async 2024-11-28 09:18:49 +01:00
Leon Luithlen
5b5c1ea5c6 Fix module import error 2024-11-28 09:18:49 +01:00
Leon Luithlen
20d721f5ca Add metadata_id field to documents in integration tests 2024-11-28 09:18:49 +01:00
Leon Luithlen
899275c25e Rename metadata field to metadata_repr 2024-11-28 09:18:49 +01:00
Leon Luithlen
cc0127a90e Fix Metadata file name 2024-11-28 09:18:49 +01:00
Leon Luithlen
7324564655 Add metadata_id attribute to Document and DocumentChunk, make ingest_with_metadata default 2024-11-28 09:18:49 +01:00
Leon Luithlen
fd987ed61e Add autoformatting 2024-11-28 09:18:49 +01:00
Leon Luithlen
c5f3314c85 Add Metadata table and read write delete functions 2024-11-28 09:18:49 +01:00
Boris Arzentar
2408fd7a01 fix: falkordb adapter errors 2024-11-28 09:12:37 +01:00
Boris
6403d15a76
fix: enable falkordb and add test for it (#31) 2024-11-27 22:55:30 +01:00
Boris Arzentar
d885a047ac Merge remote-tracking branch 'origin/main' into code-graph 2024-11-27 22:54:49 +01:00
hajdul88
be6eebfbb1
Feature/cog 537 implement retrieval algorithm from research paper (#8) 2024-11-27 17:26:11 +01:00
hajdul88
3146ef75c9 Fix: renames new vector db and cogneegraph methods 2024-11-27 13:47:26 +01:00
hajdul88
94dc545fcd chore: adds self to cogneegraph edges 2024-11-27 11:42:35 +01:00
Boris
64b8aac86f
feat: code graph swe integration
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: hande-k <handekafkas7@gmail.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2024-11-27 09:32:29 +01:00
hajdul88
c30683e20e chore: changes query text in tests 2024-11-26 17:29:44 +01:00
hajdul88
98a517dd9f feat: extends brute force triplet search for weaviate db 2024-11-26 17:20:53 +01:00
hajdul88
4c9d816f87 feat: extends bruteforce triplet search for Qdrant db 2024-11-26 17:05:38 +01:00
hajdul88
4035302dd4 feat: Adds tests for pgvector, qdrant and weaviate 2024-11-26 16:48:09 +01:00
hajdul88
0441e19bc9 feat: Adds bruteforce retriever test for neo4j 2024-11-26 16:42:35 +01:00