Commit graph

2260 commits

Author SHA1 Message Date
alekszievr
8dd575e004
chore: move ec2 setup file and remove extra steps [cog-1585] (#653)
<!-- .github/pull_request_template.md -->

## Description
This .sh file can be used for EC2 deployment as explained in
https://github.com/topoteretes/cognee-docs/pull/58

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Removed outdated guidance for setting up evaluation environments,
streamlining the visible instructions.

- **Chores**
- Updated the Ubuntu setup process to install Python 3.12, ensuring the
virtual environment uses the latest version and enhancing overall
performance.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-19 15:02:55 +01:00
hajdul88
1c65682242
feat: adds cypher search to retrievers module (#648)
<!-- .github/pull_request_template.md -->

## Description
Exposes the adapter's query method in the search interface for Kuzu and
Neo4j (the Cypher-compatible adapters).
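
As a rough illustration of the idea described above, the sketch below routes a `CYPHER` search type straight to an adapter's `query` method. It is not the code from this PR: `FakeGraphAdapter`, the `search` signature, and the `CHUNKS` member are assumptions made up for the example; only the `CYPHER` enumeration value and the adapter `query` method are mentioned in the commit.

```python
import asyncio
from enum import Enum
from typing import Any, List


class SearchType(Enum):
    # Only CYPHER is named in the commit; CHUNKS is a placeholder for other types.
    CHUNKS = "CHUNKS"
    CYPHER = "CYPHER"


class FakeGraphAdapter:
    """Hypothetical stand-in for a Cypher-compatible adapter (Kuzu, Neo4j)."""

    async def query(self, query: str) -> List[Any]:
        # A real adapter would send the query to the graph database.
        return [{"query": query, "rows": []}]


async def search(search_type: SearchType, query_text: str, adapter: Any) -> List[Any]:
    """Dispatch a search; CYPHER passes the raw query text to the adapter."""
    if search_type is SearchType.CYPHER:
        query = getattr(adapter, "query", None)
        if query is None:
            raise ValueError("Adapter does not support raw Cypher queries")
        try:
            return await query(query_text)
        except Exception as error:
            # Surface query execution problems with a clear message.
            raise RuntimeError(f"Cypher query failed: {error}") from error
    raise ValueError(f"Unsupported search type: {search_type}")


result = asyncio.run(
    search(SearchType.CYPHER, "MATCH (n) RETURN n LIMIT 5", FakeGraphAdapter())
)
```

Keeping the dispatch in one function means unsupported search types and failed queries are reported in a single place.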

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced a new cypher-based search option that expands the app's
    search functionality.
  - Enabled asynchronous processing for advanced query execution.
  - Enhanced error messaging for unsupported search types and query
    execution issues.
  - Added a new enumeration value for `CYPHER` to support the new search
    type.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-19 15:01:40 +01:00
hajdul88
24e0805f50
chore: deletes error log when there is no collection. Using dynamic c… (#651)
…ollection handling it's not an error

<!-- .github/pull_request_template.md -->

## Description
Deletes error logging from ChromaDB adapter

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Updated internal error handling to ensure more consistent responses
during unforeseen issues. This change streamlines the system’s approach
to managing errors, reducing unnecessary internal error logs while
maintaining reliable operations and a stable user experience. These
refinements contribute to improved system stability and efficient error
management. Internal operations are now better optimized to handle
unexpected scenarios gracefully.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-18 11:17:23 +01:00
alekszievr
219b68c6b0
chore: Remove old eval files [cog-1567] (#649)
<!-- .github/pull_request_template.md -->

## Description
Removed old, unused eval files. 
- swe-bench eval files are kept here as swe-bench eval is not handled by
the new eval framework
- EC2_readme and cloud/setup_ubuntu_instance.sh will be removed (and
moved to the docs website) as part of another task

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-17 19:19:39 +01:00
Igor Ilic
9b9fe48843
chore: Temporarily remove embedding env vars for code graph action (#647)
<!-- .github/pull_request_template.md -->

## Description
Temporarily remove embedding env variables for code graph action so the
action can run

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
  - Removed legacy secret configuration from the testing workflow to
    streamline the CI process and enhance maintainability.
- **Improvements**
  - Updated the argument name in the code graph pipeline for clarity.
  - Enhanced the handling of results in the example script to support
    asynchronous processing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-17 14:58:03 +01:00
lxobr
cad9e0ce44
Feat: cog 1491 pipeline steps in eval (#641)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Created get_default_tasks_by_indices to filter default tasks by
specific indices
- Added get_no_summary_tasks function to skip summarization tasks
- Added get_just_chunks_tasks function for chunk extraction and data
points only
- Added NO_SUMMARIES and JUST_CHUNKS to the TaskGetters enum
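
The relationship between these helpers can be sketched as follows. This is a simplified, synchronous illustration, not the PR's implementation: the task names in `DEFAULT_TASKS`, the chosen indices, and the `getter` property are assumptions; only the function names and the `NO_SUMMARIES` and `JUST_CHUNKS` members come from the description above.

```python
from enum import Enum
from typing import Callable, List, Sequence

# Hypothetical names for the default pipeline tasks, in execution order.
DEFAULT_TASKS: List[str] = [
    "classify_documents",
    "extract_chunks",
    "extract_graph",
    "summarize_text",
    "add_data_points",
]


def get_default_tasks() -> List[str]:
    return list(DEFAULT_TASKS)


def get_default_tasks_by_indices(indices: Sequence[int]) -> List[str]:
    """Keep only the default tasks at the given positions, in the given order."""
    tasks = get_default_tasks()
    return [tasks[i] for i in indices]


def get_no_summary_tasks() -> List[str]:
    # Everything except the summarization step (index 3 in this sketch).
    return get_default_tasks_by_indices([0, 1, 2, 4])


def get_just_chunks_tasks() -> List[str]:
    # Chunk extraction and data points only (plus document classification here).
    return get_default_tasks_by_indices([0, 1, 4])


class TaskGetters(Enum):
    DEFAULT = ("Default", get_default_tasks)
    NO_SUMMARIES = ("No summaries", get_no_summary_tasks)
    JUST_CHUNKS = ("Just chunks", get_just_chunks_tasks)

    @property
    def getter(self) -> Callable[[], List[str]]:
        return self.value[1]


no_summary = TaskGetters.NO_SUMMARIES.getter()
just_chunks = TaskGetters.JUST_CHUNKS.getter()
```

Selecting by index keeps each variant a thin wrapper over the default task list, so a change to the defaults flows through to every variant.
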
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- The evaluation configuration now includes expanded task retrieval
options. Users can choose customized modes that bypass summarization or
focus solely on extracting data chunks, offering a more tailored
evaluation experience.
- Enhanced asynchronous task processing brings increased flexibility and
smoother performance during task selection.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-14 14:20:39 +01:00
hajdul88
f206edb83c
fix: fixes naming in chromadb test (#640)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
- Revised test logic to align with updated conventions for retrieving
database entities, ensuring accurate verification of database state.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-13 18:01:04 +01:00
Daniel Molnar
69950a04dd
feat: Kuzu integration (#628)
<!-- .github/pull_request_template.md -->

## Description
Let's scope it out.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced support for the Kuzu graph database provider, enhancing
    graph operations and data management capabilities.
  - Added a comprehensive adapter for Kuzu, facilitating various graph
    database operations.
  - Expanded the enumeration of graph database types to include Kuzu.

- **Tests**
  - Launched comprehensive asynchronous tests to validate the new Kuzu
    graph integration’s performance and reliability.

- **Chores**
  - Updated dependency settings and continuous integration workflows to
    include the Kuzu provider, ensuring smoother deployments and improved
    system quality.
  - Enhanced configuration documentation to clarify Kuzu database
    requirements.
  - Modified Dockerfile to include Kuzu in the installation extras.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-03-13 17:47:09 +01:00
Dmitrii Galkin
e147fa5bde
feat: Add support for ChromaDB (#622)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

# Add Support for ChromaDB

## Summary
This PR adds support for ChromaDB as a vector database option in the
Cognee application. ChromaDB is a modern, open-source embedding database
designed for AI applications.

## Changes
- Created a new ChromaDBAdapter implementation for vector database
operations
- Added comprehensive test suite for ChromaDB functionality
- Updated docker-compose.yml to include ChromaDB service
- Modified environment configuration to support ChromaDB settings
- Updated vector engine creation logic to support ChromaDB as an option

## Technical Details
- Implemented `ChromaDBAdapter.py` (347 lines) with full CRUD operations
for vector data
- Created test suite (`test_chromadb.py`) with 171 lines of test
coverage
- Updated vector engine creation process to dynamically select ChromaDB
when configured
- Modified settings router to accommodate new database option
- Updated environment template with ChromaDB configuration options

## Docker Changes
- Added ChromaDB service to docker-compose.yml with appropriate
configuration

This PR enhances Cognee's flexibility by providing an alternative vector
database option, allowing users to choose the most appropriate database
for their specific use case.



## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin

Tested with UI + tests.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Expanded vector database integration by adding support for ChromaDB,
    enabling enhanced data management and search functionalities.
- **Tests**
  - Added automated tests to validate the ChromaDB integration and related
    operations.
- **Chores**
  - Updated configuration guidance and dependency management to include
    ChromaDB.
  - Provided an optional container deployment template for ChromaDB.
  - Added a new entry to ignore the `.chromadb_data/` directory in version
    control.
  - Introduced a new GitHub Actions workflow for testing ChromaDB
    integration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-13 15:13:04 +01:00
lxobr
daf7d4ae26
feat: COG-1526 instance filter in eval (#627)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Added _filter_instances to BaseBenchmarkAdapter supporting filtering
by IDs, indices, or JSON files.
- Updated HotpotQAAdapter and MusiqueQAAdapter to use the base class
filtering.
- Added instance_filter parameter to corpus builder pipeline.
- Extracted _get_raw_corpus method in both adapters for better code
organization
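
A minimal sketch of how such a filter might behave is shown below. It is an assumption-laden illustration rather than the PR's code: the `instance_filter` argument accepting a list of IDs, a list of indices, or a path to a JSON file follows the description, while the `id_key` parameter, the type checks, and the handling of out-of-range indices are invented for the example.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional, Union

InstanceFilter = Optional[Union[str, Path, List[int], List[str]]]


class BaseBenchmarkAdapter:
    def _filter_instances(
        self,
        instances: List[Dict[str, Any]],
        instance_filter: InstanceFilter,
        id_key: str = "id",  # hypothetical: name of the field holding the instance ID
    ) -> List[Dict[str, Any]]:
        """Filter instances by IDs, by indices, or by a JSON file listing either."""
        if instance_filter is None:
            return instances
        if isinstance(instance_filter, (str, Path)):
            # A path is read as a JSON array of IDs or indices.
            with open(instance_filter, encoding="utf-8") as handle:
                instance_filter = json.load(handle)
        if not isinstance(instance_filter, list):
            raise TypeError("instance_filter must be a list or a path to a JSON list")
        if all(isinstance(item, int) for item in instance_filter):
            # Integer entries are treated as positions; out-of-range ones are skipped.
            return [instances[i] for i in instance_filter if 0 <= i < len(instances)]
        wanted = set(instance_filter)
        return [instance for instance in instances if instance.get(id_key) in wanted]


corpus = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
adapter = BaseBenchmarkAdapter()
by_index = adapter._filter_instances(corpus, [0, 2])
by_id = adapter._filter_instances(corpus, ["b"])
```

Putting the logic in the base class lets each benchmark adapter share one filtering path instead of reimplementing it.
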
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Corpus loading and building now support a flexible filtering option,
allowing users to apply custom criteria to tailor the retrieved data.

- **Refactor**
- The extraction process has been reorganized to separately handle text
content and associated metadata, enhancing clarity and overall workflow
efficiency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-13 14:23:13 +01:00
Igor Ilic
88ed411f03
feat: user authorization [COG-1189] (#593)
<!-- .github/pull_request_template.md -->

## Description
Added user authorization through a JWT header and reworked the user and
related RBAC models to accommodate the future user permission system.
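
For readers unfamiliar with JWT-based authorization, the sketch below shows the general mechanism using only the Python standard library: an HS256 token is signed, sent as `Authorization: Bearer <token>`, and verified on the server. It is not the implementation from this PR; the function names, the `sub` and `exp` claims, and the error handling are assumptions, and a production system would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(payload: dict, secret: str) -> str:
    """Build an HS256-signed JWT from a payload dictionary."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def get_user_from_header(authorization: str, secret: str) -> dict:
    """Verify a 'Bearer <jwt>' header and return the token payload."""
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or token.count(".") != 2:
        raise PermissionError("expected 'Authorization: Bearer <jwt>'")
    signing_input, _, signature = token.rpartition(".")
    # Constant-time comparison; this sketch only accepts HMAC-SHA256 signatures.
    if not hmac.compare_digest(_sign(signing_input, secret), _b64url_decode(signature)):
        raise PermissionError("invalid signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload.get("exp", float("inf")) < time.time():
        raise PermissionError("token expired")
    return payload


token = create_token({"sub": "user-1", "exp": time.time() + 60}, "dev-secret")
user = get_user_from_header(f"Bearer {token}", "dev-secret")
```

The role and tenant checks mentioned in the release notes would sit on top of the verified payload, for example by looking up the permissions attached to `sub`.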

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced an automated workflow to validate server startup.
  - Added secure JWT token generation for improved session handling.
  - Enabled a new structure for permission management with role and
    tenant-based controls, including endpoints for creating roles, tenants,
    and assigning permissions.
  - Added methods for assigning default permissions to roles, tenants, and
    users.
  - Introduced new classes for managing default permissions for roles,
    tenants, and users.

- **Refactor**
  - Streamlined authentication and user management flows with enhanced
    error handling.

- **Tests**
  - Upgraded integration tests with improved database initialization and
    data pruning for a more stable environment.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-13 13:33:42 +01:00
lxobr
38d527ceac
fix: expose chunk_size for eval framework [COG-1546] (#634)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Exposed chunk_size in get_default_tasks in cognify
- Reintegrated chunk_size in corpus building in eval framework
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced an optional configuration parameter to allow users to set
custom processing segment sizes. This enhances flexibility in managing
content processing and task execution, enabling more dynamic control
over resource handling during corpus creation and related operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 16:13:20 +01:00
hajdul88
6fcfb3c398
feat: productionizing ontology solution [COG-1401] (#623)
<!-- .github/pull_request_template.md -->

## Description
This PR contains the ontology feature integrated into cognify

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced ontology management with the introduction of the
`OntologyResolver` class for improved data handling and querying.
- Expanded ontology framework now provides enriched coverage of
technology and automotive domains, including new entities and
relationships.
- Updated entity models now include a validation flag to support
improved data integrity.
- Added support for specifying an ontology file path in relevant
functions to enhance flexibility.

- **Refactor**
- Streamlined integration of ontology processing across data extraction
and workflow routines.

- **Chores**
- Updated project dependencies to include `owlready2` for advanced
ontology functionality.
  
- **Tests**
- Introduced a new test suite for the `OntologyResolver` class to
validate its functionality under various conditions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 14:31:19 +01:00
alekszievr
c1f7b667d1
feat: Eliminate the use of max_chunk_tokens and use a unified max_chunk_size instead [cog-1381] (#626)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
  - Simplified text processing by unifying multiple size-related
    parameters into a single metric across chunking and extraction
    functionalities.
  - Streamlined logic for text segmentation by removing redundant
    calculations and checks, resulting in a more consistent chunk management
    process.
- **Chores**
  - Removed the `modal` package as a dependency.
- **Documentation**
  - Updated the README.md to include a new demo video link and clarified
    default environment variable settings.
  - Enhanced the CONTRIBUTING.md to improve clarity and engagement for
    potential contributors.
- **Bug Fixes**
  - Improved handling of sentence-ending punctuation in text processing to
    include additional characters.
- **Version Update**
  - Updated project version to 0.1.33 in the pyproject.toml file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-12 14:03:41 +01:00
Boris Arzentar
b78d9f196f version: v0.1.34 2025-03-11 22:19:35 +01:00
Boris Arzentar
c775716ec3 Merge remote-tracking branch 'origin/dev' into dev 2025-03-11 22:17:16 +01:00
Boris Arzentar
d213945bc6 fix: update mcp dependency 2025-03-11 22:17:07 +01:00
Hande
24a79414ee
chore: update dependencies (#630)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Refined dependency version specifications to allow smoother minor
updates while enhancing compatibility.
- Introduced conditional configurations for improved
environment-specific stability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Boris <boris@topoteretes.com>
2025-03-11 22:16:40 +01:00
Boris Arzentar
3b645ecd9b Merge remote-tracking branch 'origin/dev' into dev 2025-03-11 19:24:41 +01:00
Boris Arzentar
244eb32214 fix: Dockerfile 2025-03-11 19:24:14 +01:00
vasilije
816d99309d Merge branch 'dev' of github.com:topoteretes/cognee into dev 2025-03-11 10:47:34 -07:00
vasilije
ce01945e6e minor cleanup 2025-03-11 10:46:58 -07:00
Daniel Molnar
68b337f0b6
Cline for VSCode demo runs. (#631)
<!-- .github/pull_request_template.md -->

## Description
Missing dependency.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enabled PostgreSQL integration, expanding support for additional
database options and enhancing overall functionality.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-11 18:44:56 +01:00
Boris Arzentar
4719b82c56 fix: don't compile python to bytecode in Dockerfile 2025-03-11 18:34:35 +01:00
Boris Arzentar
d5d01109a2 fix: use new Dockerfile for mcp server 2025-03-11 18:02:43 +01:00
Boris Arzentar
2e4aab9a9a fix: example ruff errors 2025-03-11 16:44:00 +01:00
Boris Arzentar
40c0015f0d chore: update uv.lock 2025-03-11 16:43:22 +01:00
Boris Arzentar
deb3e0cce1 version: v0.1.33 2025-03-11 16:41:38 +01:00
Boris Arzentar
3f69234776 fix: remove double install step from Dockerfile 2025-03-11 16:41:12 +01:00
Vasilije
a74c96609f
Update CONTRIBUTING.md 2025-03-11 03:07:25 +01:00
Vasilije
1d4d54c1f5
Update CONTRIBUTING.md 2025-03-11 03:01:06 +01:00
Daniel Molnar
819e411149
Small clarifications. (#624)
<!-- .github/pull_request_template.md -->

## Description
Small clarifications in README.md.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Updated documentation to feature a single, centrally positioned demo
link for clearer navigation.
- Clarified setup instructions to indicate that default configurations
are applied when custom environment variables are not provided.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-10 16:07:36 +01:00
alekszievr
7b5bd7897f
Feat: evaluate retrieved context against golden context [cog-1481] (#619)
<!-- .github/pull_request_template.md -->

## Description
- Compare retrieved context to golden context using deepeval's
summarization metric
- Display the fields relevant to each metric on the metrics dashboard

Example output:

![image](https://github.com/user-attachments/assets/9facf716-b2ab-4573-bfdf-7b343d2a57c5)


## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced context handling in answer generation and corpus building to
include extended details.
- Introduced a new context coverage metric for deeper evaluation
insights.
- Upgraded the evaluation dashboard with dynamic presentation of metric
details.
- Added a new parameter to support loading golden context in corpus
loading methods.

- **Bug Fixes**
- Improved clarity in how answers are structured and appended in the
answer generation process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-10 15:27:48 +01:00
lxobr
ac0156514d
feat: COG-1523 add top_k in run_question_answering (#625)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Expose top_k as an optional argument of run_question_answering
- Update retrievers to handle the parameters
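
The effect of an optional `top_k` can be sketched as below. This is only an illustration under stated assumptions: the scored-chunk tuples, the `retrieve_context` helper, and the returned dictionary shape are hypothetical; the source only says that `run_question_answering` accepts an optional `top_k` and that retrievers honor it.

```python
from typing import Dict, List, Optional, Tuple

ScoredChunk = Tuple[str, float]  # (text, relevance score), hypothetical shape


def retrieve_context(scored_chunks: List[ScoredChunk], top_k: Optional[int] = None) -> List[str]:
    """Return chunk texts ordered by score, truncated to top_k when it is given."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    return [text for text, _ in ranked]


def run_question_answering(
    questions: List[str],
    scored_chunks: List[ScoredChunk],
    top_k: Optional[int] = None,
) -> List[Dict[str, object]]:
    """Attach retrieved context to each question; top_k=None keeps every chunk."""
    return [
        {"question": question, "context": retrieve_context(scored_chunks, top_k)}
        for question in questions
    ]


chunks = [
    ("Kuzu is a graph database.", 0.4),
    ("cognee builds knowledge graphs.", 0.9),
    ("ChromaDB stores embeddings.", 0.7),
]
answers = run_question_answering(["What does cognee do?"], chunks, top_k=2)
```

Defaulting `top_k` to `None` keeps existing callers unchanged while letting evaluations vary the amount of retrieved context.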

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced answer generation and document retrieval capabilities by
introducing an optional parameter that allows users to specify the
number of top results. This improvement adds flexibility when retrieving
question responses and associated context, adapting the output based on
user preference.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-10 10:55:31 +01:00
hibajamal
56427f287e
Demo for relational db with cognee (#620)
<!-- .github/pull_request_template.md -->

## Description
This demo uses pydantic models and dlt to pull data from the Pokémon API
and structure it into a relational format. Feeding this structured data
into cognee makes searching across multiple tables easier and more
intuitive, thanks to the relational model.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive Pokémon data processing pipeline, available
as both a Python script and an interactive Jupyter Notebook.
- Enabled asynchronous operations for efficient data collection and
querying, including an integrated search functionality.
- Improved error handling and data validation during the data fetching
and processing stages for a smoother user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-08 20:33:42 +01:00
Vasilije
62c84dde5e
feat: added helm clean push (#606)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a Helm-based deployment package that streamlines setup for
the backend application and PostgreSQL database on Kubernetes.
- Added orchestration support via Docker Compose for managing
multi-container deployments.
- Added new Kubernetes resources including Deployments, Services, and
PersistentVolumeClaims for both the backend and PostgreSQL.

- **Documentation**
- Provided comprehensive infrastructure and deployment instructions for
Kubernetes environments.

- **Chores**
- Established a standardized container build process for the Python
application.
- Introduced configuration settings for service ports, resource limits,
and environment variables.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Daniel Molnar <soobrosa@gmail.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-03-08 08:51:57 -08:00
Hande
65d0f7317c
replace video in readme (#617)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Updated the demo reference in the documentation by replacing an
embedded video thumbnail with a simplified "Learn about cognee" text
link.
  
- **Chores**
- Integrated a minor internal update to align a related data component
with the latest project state, with no visible impact on functionality.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-07 17:01:38 +01:00
Vasilije
c204b9cbf3
Update README.md 2025-03-07 03:44:31 +01:00
Vasilije
ccfbf306b6
Update README.md 2025-03-07 03:43:54 +01:00
Hande
1b5c059b8d
chore: update litellm (#613)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Adjusted a dependency version range for improved compatibility with
newer releases.
- Enhanced dependency management workflow by integrating Poetry and
adding a commit step for tracking changes.
- Updated Python version in the workflow to 3.12 and improved repository
checkout steps.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-07 00:17:06 +01:00
Hande
6ca40355ad
chore: update readme (#614)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Enhanced image presentation by adjusting display widths for improved
clarity.
- Refined configuration instructions with minor formatting corrections
for better readability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 21:58:20 +01:00
Boris
5345626e6a
fix: add proper node labels (#607)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Improved backend data organization with automatic categorization of
stored items for enhanced search and retrieval.
- Launched a product recommendation system that analyzes customer data
and preferences to suggest top products.
- Introduced a sample dataset showcasing customer profiles, preferences,
and product interactions for demonstration purposes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 13:30:13 +01:00
lxobr
ea5b11a3b4
feat: add regex entity extractor (#605)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->
- Created a new RegexEntityExtractor that uses regex patterns to
identify entities like emails, URLs, and dates in text
- Implemented a JSON-based configuration system to add or modify entity
types without changing code
- Built a separate RegexEntityConfig class to handle loading and
processing of entity configurations
- Added test suite covering all entity types and edge cases
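
The split between configuration and extraction can be sketched as follows. The class names match the description above, but everything else is an assumption for illustration: the JSON field names (`entity_name`, `regex`), the simplified patterns, and the `extract` return type are not taken from the PR.

```python
import json
import re
from typing import Dict, List, Tuple

# Hypothetical config; a real setup would load this from a JSON file.
CONFIG_JSON = """
[
  {"entity_name": "EMAIL", "regex": "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+[.][A-Za-z]{2,}"},
  {"entity_name": "URL", "regex": "https?://[^ ,]+"},
  {"entity_name": "DATE", "regex": "[0-9]{4}-[0-9]{2}-[0-9]{2}"}
]
"""


class RegexEntityConfig:
    """Loads entity definitions and compiles their patterns once."""

    def __init__(self, raw_json: str) -> None:
        entries = json.loads(raw_json)
        self.patterns: Dict[str, re.Pattern] = {
            entry["entity_name"]: re.compile(entry["regex"]) for entry in entries
        }


class RegexEntityExtractor:
    """Finds entities in text using the patterns from a RegexEntityConfig."""

    def __init__(self, config: RegexEntityConfig) -> None:
        self.config = config

    def extract(self, text: str) -> List[Tuple[str, str]]:
        # Results are grouped by entity type, in config order.
        return [
            (name, match.group(0))
            for name, pattern in self.config.patterns.items()
            for match in pattern.finditer(text)
        ]


extractor = RegexEntityExtractor(RegexEntityConfig(CONFIG_JSON))
entities = extractor.extract(
    "Write to ada@example.com before 2025-03-06, docs at https://example.org/docs today"
)
```

Because the patterns live in JSON, adding an entity type means adding one config entry rather than changing the extractor.
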
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new regex-based extraction capability that uses
configurable patterns and description templates to identify common
entities such as emails, phone numbers, URLs, dates, and more.
- **Tests**
- Added comprehensive tests to validate the extraction functionality
across standard scenarios and edge cases for reliable text analysis.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 12:13:59 +01:00
vasilije
9d783675e0 Revert "First AI pass at layered graph builder"
This reverts commit 1cbcbbd55a.
2025-03-05 19:48:53 -08:00
vasilije
8f5070a1b5 added commit 2025-03-05 19:48:15 -08:00
vasilije
a6b9a6444b Layered graph test 2025-03-05 19:38:04 -08:00
vasilije
1cbcbbd55a First AI pass at layered graph builder 2025-03-05 19:37:45 -08:00
Vasilije
52baed8ff4
bug: added fix and added my pat (#597)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Introduced an automated process that routinely updates project
dependencies. This enhancement minimizes manual maintenance and helps
ensure optimal system stability and security for users.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-03-06 03:39:31 +01:00
Vasilije
d6cc63db8f
Update README.md (#609)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-06 03:14:06 +01:00
alekszievr
433264d4e4
feat: Add context evaluation to eval framework [COG-1366] (#586)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a class-based retrieval mechanism to enhance answer
generation with improved context extraction and completion.
- Added a new evaluation metric for contextual relevancy and an option
to enable context evaluation during the evaluation process.

- **Refactor**
- Transitioned from a function-based answer resolver to a more modular
retriever approach to improve extensibility.

- **Tests**
- Updated tests to align with the new answer generation and evaluation
process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Daniel Molnar <soobrosa@gmail.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-03-05 16:40:24 +01:00