Commit graph

30 commits

Author SHA1 Message Date
lxobr
9cc357ac1c
Feat/cog 1365 unify retrievers (#572)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Created the `BaseRetriever` class to unify all the retrievers and
searches.
- Implemented seven specialized retrievers (summaries, chunks,
completions, graph, graph-summary, insights, code) with consistent
get_context/get_completion interfaces.
- Added json context dumping feature in the current completion
implementations to enable context comparisons.
- Built a comparison framework to validate old vs new implementations.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced multiple retrieval classes for enhanced search
capabilities, including `BaseRetriever`, `ChunksRetriever`,
`CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`,
`GraphSummaryCompletionRetriever`, `InsightsRetriever`, and
`SummariesRetriever`.
- Enhanced query completions with optional context saving for improved
data persistence.
- Implemented advanced tools to compare retrieval outcomes across
different implementations.

- **Refactor**
- Streamlined internal module organization and updated references for
increased maintainability and consistency.
- Added comments indicating future maintenance tasks related to code
merging.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-02-27 12:13:21 +01:00
Boris
ada466879e
fix: add default params to run_tasks (#563)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced the task execution process by enabling default values for
certain parameters, allowing users to trigger task processing without
supplying every input explicitly.
  
- **Bug Fixes**
- Adjusted asynchronous handling for the `retrieved_edges_to_string`
function to ensure proper execution flow in various components.

- **Documentation**
- Updated markdown formatting in the Jupyter notebook for improved
readability and structure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-19 20:18:51 +01:00
Vasilije
4d3acc358a
fix: mcp improvements (#472)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Dependency Update**
	- Downgraded `mcp` package version from 1.2.0 to 1.1.3
- Updated `cognee` dependency to include additional features with
`cognee[codegraph]`

- **New Features**
- Introduced a new tool, "codify", for transforming codebases into
knowledge graphs
- Enhanced the existing "search" tool to accept a new parameter for
search type

- **Improvements**
	- Streamlined search functionality with a new modular approach
- Added new asynchronous function for retrieving and formatting code
parts

- **Documentation**
- Updated import paths for `SearchType` in various modules and tests to
reflect structural changes

- **Code Cleanup**
	- Removed legacy search module and associated classes/functions
	- Refined data transfer object classes for consistency and clarity
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-02-04 08:47:31 +01:00
Boris Arzentar
3320bc8f2c feat: add codegraph related API endpoints 2025-01-28 10:08:59 +01:00
hajdul88
e2ad54d88e Fix: deleting incorrect repo path 2025-01-10 15:54:45 +01:00
hajdul88
6177d04b44 feat: implements code retreiver 2025-01-10 13:03:34 +01:00
hajdul88
9604d95ba5 feat: adds basic retriever for swe bench 2025-01-09 19:54:58 +01:00
Rita Aleksziev
18bb282fbc Adjust SWE-bench script to code graph pipeline call 2025-01-09 14:52:02 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
lxobr
da5e3ab24d
COG 870 Remove duplicate edges from the code graph (#293)
* feat: turn summarize_code into generator

* feat: extract run_code_graph_pipeline, update the pipeline

* feat: minimal code graph example

* refactor: update argument

* refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline

* refactor: indentation and whitespace nits

* refactor: add deprecated use comments and warnings

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2024-12-17 12:02:25 +01:00
Boris
348610e73c
fix: refactor get_graph_from_model to return nodes and edges correctly (#257)
* fix: handle rate limit error coming from llm model

* fix: fixes lost edges and nodes in get_graph_from_model

* fix: fixes database pruning issue in pgvector (#261)

* fix: cognee_demo notebook pipeline is not saving summaries

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-12-06 12:52:01 +01:00
Boris Arzentar
d49ab4c3b5 feat: update code-graph notebook 2024-12-03 23:48:12 +01:00
Boris Arzentar
11acabdb6a fix: remove duplicate nodes and edges before saving; Fix FalkorDB vector index; 2024-12-02 10:10:18 +01:00
Rita Aleksziev
a4c56f118d Connect code graph pipeline + retriever + benchmarking 2024-11-29 15:24:49 +01:00
Rita Aleksziev
4da1657140 merge changes from code-graph 2024-11-29 12:16:36 +01:00
Rita Aleksziev
8f241fa6c5 convert edge to string 2024-11-29 12:05:52 +01:00
Leon Luithlen
a5ae9185cd Replicate PR 33 2024-11-29 11:40:51 +01:00
Leon Luithlen
d9fc740ec0 Fix merge conflicts 2024-11-29 11:33:05 +01:00
Leon Luithlen
b46af5a6f6 Update eval_swe_bench 2024-11-29 11:31:03 +01:00
Leon Luithlen
1bfa3a0ea3 Rebase onto code-graph 2024-11-29 11:30:30 +01:00
Rita Aleksziev
8edfe7c5a4 feat/connect code graph pipeline to benchmarking 2024-11-28 16:52:54 +01:00
Boris Arzentar
2408fd7a01 fix: falkordb adapter errors 2024-11-28 09:12:37 +01:00
Boris
64b8aac86f
feat: code graph swe integration
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: hande-k <handekafkas7@gmail.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2024-11-27 09:32:29 +01:00
Rita Aleksziev
e1d8f3ea86 use acreate_structured_output instead of create_structured_output in eval script 2024-11-20 16:02:15 +01:00
Rita Aleksziev
2948089806 Read patch generation instructions from file 2024-11-19 14:07:53 +01:00
Rita Aleksziev
838d98238a Code cleanup 2024-11-19 13:54:04 +01:00
Rita Aleksziev
98e3445c2c running swebench evaluation as subprocess 2024-11-18 15:12:36 +01:00
Rita Aleksziev
ed08cdb9f9 using the code graph pipeline instead of cognify 2024-11-15 17:56:19 +01:00
Rita Aleksziev
721fde3d60 generating testspecs for data 2024-11-15 17:14:43 +01:00
Rita Aleksziev
094ba7233e Running inference with and without cognee 2024-11-14 16:28:03 +01:00