cognee

Author	SHA1	Message	Date
alekszievr	219b68c6b0	chore: Remove old eval files [cog-1567] (#649) ## Description Removed old, unused eval files. - swe-bench eval files are kept here as swe-bench eval is not handled by the new eval framework - EC2_readme and cloud/setup_ubuntu_instance.sh will be removed (and moved to the docs website) as part of another task ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin	2025-03-17 19:19:39 +01:00
hajdul88	e3f3d49a3b	Feature/cog 1312 integrating evaluation framework into dreamify (#562) ## Description This PR contains eval framework changes due to the autooptimizer integration ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced answer generation now returns structured answer details. - Search functionality accepts configurable prompt inputs. - Option to generate a metrics dashboard from evaluations. - Corpus building tasks now support adjustable chunk settings for greater flexibility. - New task retrieval functionality allows for flexible task configuration. - Introduced new methods for creating and managing metrics dashboards. - Refactor/Chore - Streamlined API signatures and reorganized module interfaces for better consistency. - Updated import paths to reflect new module structure. - Tests - Updated test scenarios to align with new configurations and parameter adjustments.	2025-03-03 19:55:47 +01:00
lxobr	bee04cad86	Feat/cog 1331 modal run eval (#576) ## Description - Split metrics dashboard into two modules: calculator (statistics) and generator (visualization) - Added aggregate metrics as a new phase in evaluation pipeline - Created modal example to run multiple evaluations in parallel and collect results into a single combined output ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced metrics reporting with improved visualizations, including histogram and confidence interval plots. - Introduced an asynchronous evaluation process that supports parallel execution and streamlined result aggregation. - Added new configuration options to control metrics calculation and aggregated output storage. - Refactor - Restructured dashboard generation and evaluation workflows into a more modular, maintainable design. - Improved error handling and logging for better feedback during evaluation processes. - Bug Fixes - Updated test cases to ensure accurate validation of the new dashboard generation and metrics calculation functionalities.	2025-03-03 14:22:32 +01:00
lxobr	ca2cbfab91	feat: add direct llm eval adapter (#591) ## Description • Created DirectLLMEvalAdapter - a lightweight alternative to DeepEval for answer evaluation • Added evaluation prompt files defining scoring criteria and format • Made adapter selectable via evaluation_engine = "DirectLLM" in config, supports "correctness" metric only ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a new evaluation method that compares model responses against a reference answer using structured prompt templates. This approach enables automated scoring (ranging from 0 to 1) along with brief justifications. - Enhancements - Updated the configuration to clearly distinguish between evaluation options, providing end-users with a more transparent and reliable assessment process.	2025-03-01 19:50:20 +01:00
Daniel Molnar	d27f847753	Transition to new retrievers, update searches (#585) ## Description Delete legacy search implementations after migrating to new retriever classes ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced search and retrieval capabilities, providing improved context resolution for code queries, completions, summaries, and graph connections. - Refactor - Shifted to a modular, object-oriented approach that consolidates query logic and streamlines error management for a more robust and scalable experience. - Bug Fixes - Improved error handling for unsupported search types and retrieval operations.	2025-02-27 15:25:24 +01:00
lxobr	4b7c21d7d8	feat: retrieve golden contexts [COG-1364] (#579) ## Description • Added load_golden_context parameter to BaseBenchmarkAdapter's abstract load_corpus method, establishing a common interface for retrieving supporting evidence • Refactored HotpotQAAdapter with a modular design: introduced _get_metadata_field_name method to handle dataset-specific fields (making it extensible for child classes), implemented get golden context functionality. • Refactored TwoWikiMultihopAdapter to inherit from HotpotQAAdapter, overriding only the necessary methods while reusing parent's functionality • Added golden context support to MusiqueQAAdapter with their decomposition-based format ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced an option to include additional context during corpus loading, enhancing the quality and flexibility of generated QA pairs. - Refactor - Streamlined and modularized the processing workflow across different adapters for improved consistency and maintainability. - Updated metadata extraction to refine the display of contextual information. - Shifted focus in the `TwoWikiMultihopAdapter` from corpus loading to context extraction.	2025-02-27 13:25:47 +01:00
lxobr	9cc357ac1c	Feat/cog 1365 unify retrievers (#572) ## Description - Created the `BaseRetriever` class to unify all the retrievers and searches. - Implemented seven specialized retrievers (summaries, chunks, completions, graph, graph-summary, insights, code) with consistent get_context/get_completion interfaces. - Added json context dumping feature in the current completion implementations to enable context comparisons. - Built a comparison framework to validate old vs new implementations. ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced multiple retrieval classes for enhanced search capabilities, including `BaseRetriever`, `ChunksRetriever`, `CodeRetriever`, `CompletionRetriever`, `GraphCompletionRetriever`, `GraphSummaryCompletionRetriever`, `InsightsRetriever`, and `SummariesRetriever`. - Enhanced query completions with optional context saving for improved data persistence. - Implemented advanced tools to compare retrieval outcomes across different implementations. - Refactor - Streamlined internal module organization and updated references for increased maintainability and consistency. - Added comments indicating future maintenance tasks related to code merging. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-02-27 12:13:21 +01:00
lxobr	1cb83312fe	feat: add experimental cognify pipeline [COG-1293] (#541) ## Description - Integrate experimental tasks into the evaluation framework ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced interactive prompt templates for extracting graph nodes, edge triplets, and relationship names, resulting in more comprehensive and accurate knowledge graphs. - Added asynchronous processes to efficiently handle document data and integrate graph components. - Launched cascade graph task options to offer enhanced flexibility in task management workflows. - Added new functionality for extracting content nodes and relationship names from text. - Refactor - Streamlined configurations for prompt processing and task initialization, improving overall modularity and system stability. - Updated task getter mechanisms to utilize function-based approaches for improved flexibility. --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-02-25 16:14:27 +01:00
alekszievr	17231de5d0	Test: Parse context pieces separately in MusiqueQAAdapter and adjust tests [cog-1234] (#561) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Tests - Updated evaluation checks by removing assertions related to the relationship between `corpus_list` and `qa_pairs`, now focusing solely on `qa_pairs` limits. - Refactor - Improved content processing to append each paragraph individually to `corpus_list`, enhancing clarity in data structure. - Simplified type annotations in the `load_corpus` method across multiple adapters, ensuring consistency in return types. - Chores - Updated dependency installation commands in GitHub Actions workflows for Python 3.10, 3.11, and 3.12 to include additional evaluation-related dependencies. --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>	2025-02-20 14:23:53 +01:00
Boris	ada466879e	fix: add default params to run_tasks (#563) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Enhanced the task execution process by enabling default values for certain parameters, allowing users to trigger task processing without supplying every input explicitly. - Bug Fixes - Adjusted asynchronous handling for the `retrieved_edges_to_string` function to ensure proper execution flow in various components. - Documentation - Updated markdown formatting in the Jupyter notebook for improved readability and structure. --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-02-19 20:18:51 +01:00
Vasilije	e98d51aac9	Add musique adapter base (#525) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Bug Fixes - Improved data handling by updating the dataset file path and ensuring answers are consistently converted to lowercase for reliable processing. - Tests - Introduced unit tests to validate that data adapters instantiate correctly, return non-empty content, and respect specified limits. --------- Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-02-18 19:48:22 +01:00
lxobr	bb8cb692e0	Cog 1293 corpus builder custom cognify tasks (#527) ## Description - Enable custom tasks in corpus building ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Introduced a configurable option to specify the task retrieval strategy during corpus building. - Enhanced the workflow with integrated task fetching, featuring a default retrieval mechanism. - Updated evaluation configuration to support customizable task selection for more flexible operations. - Added a new abstract base class for defining various task retrieval strategies. - Introduced a new enumeration to map task getter types to their corresponding classes. - Dependencies - Added a new dependency for downloading files from Google Drive.	2025-02-12 16:44:08 +01:00
vasilije	e6db870264	Add musique adapter base	2025-02-11 17:16:48 -05:00
hajdul88	6a0c0e3ef8	feat: Cognee evaluation framework development (#498) This PR contains the evaluation framework development for cognee ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Expanded evaluation framework now integrates asynchronous corpus building, question answering, and performance evaluation with adaptive benchmarks for improved metrics (correctness, exact match, and F1 score). - Infrastructure - Added database integration for persistent storage of questions, answers, and metrics. - Launched an interactive metrics dashboard featuring advanced visualizations. - Introduced an automated testing workflow for continuous quality assurance. - Documentation - Updated guidelines for generating concise, clear answers.	2025-02-11 16:31:54 +01:00
Boris	f75e35c337	fix: custom model pipeline (#508) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features • Graph visualizations now allow exporting to a user-specified file path for more flexible output management. • The text embedding process has been enhanced with an additional tokenizer option for improved performance. • A new `ExtendableDataPoint` class has been introduced for future extensions. • New JSON files for companies and individuals have been added to facilitate testing and data processing. - Improvements • Search functionality now uses updated identifiers for more reliable content retrieval. • Metadata handling has been streamlined across various classes by removing unnecessary type specifications. • Enhanced serialization of properties in the Neo4j adapter for improved handling of complex structures. • The setup process for databases has been improved with a new asynchronous setup function. - Chores • Dependency and configuration updates improve overall stability and performance.	2025-02-08 02:00:15 +01:00
Igor Ilic	5fe7ff9883	refactor: Refactor search so graph completion is used by default (#505) ## Description Refactor search so query type doesn't need to be provided to make it simpler for new users ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Refactor - Improved the search interface by standardizing parameter usage with explicit keyword arguments for specifying search types, enhancing clarity and consistency. - Tests - Updated test cases and example integrations to align with the revised search parameters, ensuring consistent behavior and reliable validation of search outcomes.	2025-02-07 17:16:34 +01:00
Vasilije	4d3acc358a	fix: mcp improvements (#472) ## Description ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - Dependency Update - Downgraded `mcp` package version from 1.2.0 to 1.1.3 - Updated `cognee` dependency to include additional features with `cognee[codegraph]` - New Features - Introduced a new tool, "codify", for transforming codebases into knowledge graphs - Enhanced the existing "search" tool to accept a new parameter for search type - Improvements - Streamlined search functionality with a new modular approach - Added new asynchronous function for retrieving and formatting code parts - Documentation - Updated import paths for `SearchType` in various modules and tests to reflect structural changes - Code Cleanup - Removed legacy search module and associated classes/functions - Refined data transfer object classes for consistency and clarity --------- Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>	2025-02-04 08:47:31 +01:00
Igor Ilic	8879f3fbbe	feat: Add gemini support [COG-1023] (#485) ## Description PR to test Gemini PR from holchan 1. Add Gemini LLM and Gemini Embedding support 2. Fix CodeGraph issue with chunks being bigger than maximum token value 3. Add Tokenizer adapters to CodeGraph ## DCO Affirmation I affirm that all code in every commit of this pull request conforms to the terms of the Topoteretes Developer Certificate of Origin ## Summary by CodeRabbit - New Features - Added support for the Gemini LLM provider. - Expanded LLM configuration options. - Introduced a new GitHub Actions workflow for multimetric QA evaluation. - Added new environment variables for LLM and embedding configurations across various workflows. - Bug Fixes - Improved error handling in various components. - Updated tokenization and embedding processes. - Removed warning related to missing `dict` method in data items. - Refactor - Simplified token extraction and decoding methods. - Updated tokenizer interfaces. - Removed deprecated dependencies. - Enhanced retry logic and error handling in embedding processes. - Documentation - Updated configuration comments and settings. - Chores - Updated GitHub Actions workflows to accommodate new secrets and environment variables. - Modified evaluation parameters. - Adjusted dependency management for optional libraries. --------- Co-authored-by: holchan <61059652+holchan@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2025-01-31 18:03:23 +01:00
Vasilije	8a50da8ff5	Merge pull request #475 from topoteretes/feat/COG-1060-code-pipeline-endpoints feat: add codegraph related API endpoints	2025-01-28 14:46:52 +01:00
alekszievr	5e076689ad	Feat: [COG-1074] fix multimetric eval bug (#463) * feat: make tasks a configurable argument in the cognify function * fix: add data points task * Ugly hack for multi-metric eval bug * some cleanup --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-01-28 13:05:22 +01:00
Boris Arzentar	3320bc8f2c	feat: add codegraph related API endpoints	2025-01-28 10:08:59 +01:00
alekszievr	4e3a666b33	Feat: Save and load contexts and answers for eval (#462) * feat: make tasks a configurable argument in the cognify function * fix: add data points task * eval on random samples instead of first couple * Save and load contexts and answers * Fix random seed usage and handle empty descriptions * include insights search in cognee option * create output dir if doesn't exist --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-01-22 16:17:01 +01:00
alekszievr	75bc7f67eb	feat: Add incremental eval option to paramset (#446) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * feat: make tasks a configurable argument in the cognify function * Run eval on a set of parameters and save results as json and png * fix: add data points task * script for running all param combinations * enable context provider to get tasks as param * bugfix in simple rag * Incremental eval of cognee pipeline * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor * Include incremental eval in accepted paramsets * minor fixes * handle pipeline slices in utils * Handle insights and customize search types * Handle retrieved edges more safely * bugfix * fix simple rag --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-01-17 18:04:31 +01:00
alekszievr	2e010f8dd1	Incremental eval of cognee pipeline (#445) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * feat: make tasks a configurable argument in the cognify function * Run eval on a set of parameters and save results as json and png * fix: add data points task * script for running all param combinations * enable context provider to get tasks as param * bugfix in simple rag * Incremental eval of cognee pipeline * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor * minor fixes * handle pipeline slices in utils * include all options in params json --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com> Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2025-01-17 14:16:48 +01:00
alekszievr	8ec1e48ff6	Run eval on a set of parameters and save them as png and json (#443) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage * Run eval on a set of parameters and save results as json and png * script for running all param combinations * bugfix in simple rag * potential fix: single asyncio run * temp fix: exclude insights * Remove insights, have single asyncio run, refactor --------- Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>	2025-01-17 00:18:51 +01:00
alekszievr	3494521cae	Support 4 different rag options in eval (#439) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Support 4 different rag options in eval * Minor refactor and logger usage	2025-01-15 15:34:13 +01:00
alekszievr	6653d73556	Feat/cog 950 improve metric selection (#435) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * restructure metric selection * Add comprehensiveness, diversity and empowerment metrics * add promptfoo as an option * refactor RAG solution in eval * LLM as a judge metrics implemented in a uniform way * Use requests.get instead of wget * clean up promptfoo config template * minor fixes * get promptfoo path instead of hardcoding * minor fixes * Add LLM as a judge prompts * Minor refactor and logger usage	2025-01-15 10:45:55 +01:00
alekszievr	a4ad1702ed	Feat/cog 946 abstract eval dataset (#418) * QA eval dataset as argument, with hotpot and 2wikimultihop as options. Json schema validation for datasets. * Load dataset file by filename, outsource utilities * Use requests.get instead of wget	2025-01-14 11:33:55 +01:00
hajdul88	e2ad54d88e	Fix: deleting incorrect repo path	2025-01-10 15:54:45 +01:00
hajdul88	6177d04b44	feat: implements code retriever	2025-01-10 13:03:34 +01:00
hajdul88	9604d95ba5	feat: adds basic retriever for swe bench	2025-01-09 19:54:58 +01:00
Rita Aleksziev	18bb282fbc	Adjust SWE-bench script to code graph pipeline call	2025-01-09 14:52:02 +01:00
vasilije	76a0aa7e8b	Fix linter issues	2025-01-05 19:48:35 +01:00
vasilije	6dafe73a6b	Fix linter issues	2025-01-05 19:24:55 +01:00
vasilije	649fcf2ba8	Fix linter issues	2025-01-05 19:21:09 +01:00
vasilije	60c8fd103b	ruff format	2025-01-05 19:09:08 +01:00
lxobr	da5e3ab24d	COG 870 Remove duplicate edges from the code graph (#293) * feat: turn summarize_code into generator * feat: extract run_code_graph_pipeline, update the pipeline * feat: minimal code graph example * refactor: update argument * refactor: move run_code_graph_pipeline to cognify/code_graph_pipeline * refactor: indentation and whitespace nits * refactor: add deprecated use comments and warnings --------- Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com> Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com> Co-authored-by: Boris <boris@topoteretes.com>	2024-12-17 12:02:25 +01:00
alekszievr	4f2745504c	Calculate official hotpot EM and F1 scores (#292)	2024-12-10 19:16:12 +01:00
Boris	348610e73c	fix: refactor get_graph_from_model to return nodes and edges correctly (#257) * fix: handle rate limit error coming from llm model * fix: fixes lost edges and nodes in get_graph_from_model * fix: fixes database pruning issue in pgvector (#261) * fix: cognee_demo notebook pipeline is not saving summaries --------- Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>	2024-12-06 12:52:01 +01:00
Boris Arzentar	d49ab4c3b5	feat: update code-graph notebook	2024-12-03 23:48:12 +01:00
Boris Arzentar	b89a4b8054	Merge remote-tracking branch 'origin/main' into code-graph	2024-12-03 21:14:19 +01:00
Rita Aleksziev	a0d5102bd8	add some spaces for readability	2024-12-03 17:22:23 +01:00
Rita Aleksziev	0fbb50960b	prompt renaming	2024-12-03 15:59:03 +01:00
Rita Aleksziev	dc082de4c2	minor bugfix in folder creation	2024-12-02 14:54:40 +01:00
Rita Aleksziev	f966f099fc	Prompt renaming to more specific names. Minor code changes.	2024-12-02 12:18:00 +01:00
Boris Arzentar	11acabdb6a	fix: remove duplicate nodes and edges before saving; Fix FalkorDB vector index;	2024-12-02 10:10:18 +01:00
Rita Aleksziev	a4c56f118d	Connect code graph pipeline + retriever + benchmarking	2024-11-29 15:24:49 +01:00
Rita Aleksziev	4da1657140	merge changes from code-graph	2024-11-29 12:16:36 +01:00
Rita Aleksziev	8f241fa6c5	convert edge to string	2024-11-29 12:05:52 +01:00
Leon Luithlen	a5ae9185cd	Replicate PR 33	2024-11-29 11:40:51 +01:00

81 commits