Commit graph

33 commits

Author SHA1 Message Date
Pavel Zorin
fb4796204a Chore: Fix helm chart 2026-01-09 18:06:08 +01:00
Pavel Zorin
962ddf4257 Chore: pre-commit, pre-commit action, contribution guide update 2026-01-08 19:19:07 +01:00
Daulet Amirkhanov
f58ba86e7c
feat: add welcome tutorial notebook for new users (#1425)
<!-- .github/pull_request_template.md -->

## Description

Update the default tutorial:
1. Use the tutorial from the [notebook_tutorial
branch](https://github.com/topoteretes/cognee/blob/notebook_tutorial/notebooks/tutorial.ipynb),
specifically its .zip version, which includes all necessary data files
2. Use the Jupyter Notebook `Notebook` abstractions to read `ipynb` files and
map them into our Notebook model
3. Dynamically update starter notebook code blocks that reference
starter data files, swapping those references for local paths to the downloaded copies
4. Add test coverage
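
As a rough illustration of step 3, here is a minimal sketch, not the code from this PR. It assumes the notebook is plain `ipynb` JSON and that a mapping from referenced data file names to downloaded local paths already exists. The function name `rewrite_data_paths` and the sample paths are hypothetical.

```python
import json


def rewrite_data_paths(notebook_json: str, path_map: dict[str, str]) -> dict:
    """Parse an ipynb document and replace data-file references in code cells.

    `path_map` maps a reference as it appears in the notebook to the local
    path of the downloaded copy (both are illustrative assumptions).
    """
    notebook = json.loads(notebook_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # leave markdown and raw cells untouched
        # The ipynb format stores "source" as a string or a list of strings.
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        for original, local in path_map.items():
            text = text.replace(original, local)
        cell["source"] = text
    return notebook


# Hypothetical two-cell notebook used only to exercise the function.
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["Load data/intro.txt first."]},
        {"cell_type": "code", "source": ["await cognee.add(\n", "    'data/intro.txt')\n"]},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})
updated = rewrite_data_paths(example, {"data/intro.txt": "/tmp/tutorial/data/intro.txt"})
```

Only code cells are rewritten, so prose that mentions a data file keeps its original wording.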



| Before | After (storage backend = local) | After (s3) |
|--------|---------------------------------|------------|
| <img width="613" height="546" alt="Screenshot 2025-09-17 at 01 00 58" src="https://github.com/user-attachments/assets/20b59021-96c1-4a83-977f-e064324bd758" /> | <img width="1480" height="262" alt="Screenshot 2025-09-18 at 13 01 57" src="https://github.com/user-attachments/assets/bd56ea78-7c6a-42e3-ae3f-4157da231b2d" /> | <img width="1485" height="307" alt="Screenshot 2025-09-18 at 12 56 08" src="https://github.com/user-attachments/assets/248ae720-4c78-445a-ba8b-8a2991ed3f80" /> |



## File Replacements

### S3 Demo  

https://github.com/user-attachments/assets/bd46eec9-ef77-4f69-9ef0-e7d1612ff9b3

---

### Local FS Demo  

https://github.com/user-attachments/assets/8251cea0-81b3-4cac-a968-9576c358f334


## Type of Change
<!-- Please check the relevant option -->
- [ ] Bug fix (non-breaking change that fixes an issue)
- [x] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Code refactoring
- [ ] Performance improvement
- [ ] Other (please specify):

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-09-18 18:07:05 +02:00
Igor Ilic
bc67eb9651
Regen lock files (#1153)
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-07-25 11:45:28 -04:00
Igor Ilic
31809d98df
feat: Fix python312 issue on main (#1011)
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-06-21 09:49:03 +02:00
Igor Ilic
2611d89094
feat: Add logging to file [COG-1715] (#672)
<!-- .github/pull_request_template.md -->

## Description
Add logging to a log file

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-03-28 16:13:56 +01:00
Dmitrii Galkin
e147fa5bde
feat: Add support for ChromaDB (#622)
<!-- .github/pull_request_template.md -->

## Description

# Add Support for ChromaDB

## Summary
This PR adds support for ChromaDB as a vector database option in the
Cognee application. ChromaDB is a modern, open-source embedding database
designed for AI applications.

## Changes
- Created a new ChromaDBAdapter implementation for vector database
operations
- Added comprehensive test suite for ChromaDB functionality
- Updated docker-compose.yml to include ChromaDB service
- Modified environment configuration to support ChromaDB settings
- Updated vector engine creation logic to support ChromaDB as an option

## Technical Details
- Implemented `ChromaDBAdapter.py` (347 lines) with full CRUD operations
for vector data
- Created test suite (`test_chromadb.py`) with 171 lines of test
coverage
- Updated vector engine creation process to dynamically select ChromaDB
when configured
- Modified settings router to accommodate new database option
- Updated environment template with ChromaDB configuration options
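
The "dynamically select ChromaDB when configured" step could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this PR: `VectorConfig`, the stub adapter classes, and the provider keys are hypothetical stand-ins, and the real adapters take their own constructor arguments.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class VectorConfig:
    """Hypothetical settings object; the field names are assumptions."""
    provider: str
    url: str = ""
    api_key: str = ""


class LanceDBAdapterStub:
    """Stand-in for an existing adapter; stores the config only."""
    def __init__(self, config: VectorConfig) -> None:
        self.config = config


class ChromaDBAdapterStub:
    """Stand-in for the new ChromaDB adapter; stores the config only."""
    def __init__(self, config: VectorConfig) -> None:
        self.config = config


# Registry mapping a provider name to the callable that builds its adapter.
ADAPTERS: Dict[str, Callable[[VectorConfig], object]] = {
    "lancedb": LanceDBAdapterStub,
    "chromadb": ChromaDBAdapterStub,
}


def create_vector_engine(config: VectorConfig) -> object:
    """Return the adapter registered for the configured provider."""
    key = config.provider.lower()
    if key not in ADAPTERS:
        raise ValueError(f"Unsupported vector database provider: {config.provider}")
    return ADAPTERS[key](config)


engine = create_vector_engine(
    VectorConfig(provider="chromadb", url="http://localhost:8000")
)
```

With a registry like this, adding another provider is a single dictionary entry, and an unknown provider name fails early with a clear error.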

## Docker Changes
- Added ChromaDB service to docker-compose.yml with appropriate
configuration

This PR enhances Cognee's flexibility by providing an alternative vector
database option, allowing users to choose the most appropriate database
for their specific use case.



## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin

Tested with UI + tests.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Expanded vector database integration by adding support for ChromaDB,
    enabling enhanced data management and search functionalities.
- **Tests**
  - Added automated tests to validate the ChromaDB integration and related
    operations.
- **Chores**
  - Updated configuration guidance and dependency management to include
    ChromaDB.
  - Provided an optional container deployment template for ChromaDB.
  - Added a new entry to ignore the `.chromadb_data/` directory in version
    control.
  - Introduced a new GitHub Actions workflow for testing ChromaDB
    integration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-03-13 15:13:04 +01:00
Boris
f75e35c337
fix: custom model pipeline (#508)
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit


- **New Features**
  - Graph visualizations now allow exporting to a user-specified file path
    for more flexible output management.
  - The text embedding process has been enhanced with an additional
    tokenizer option for improved performance.
  - A new `ExtendableDataPoint` class has been introduced for future
    extensions.
  - New JSON files for companies and individuals have been added to
    facilitate testing and data processing.

- **Improvements**
  - Search functionality now uses updated identifiers for more reliable
    content retrieval.
  - Metadata handling has been streamlined across various classes by
    removing unnecessary type specifications.
  - Enhanced serialization of properties in the Neo4j adapter for improved
    handling of complex structures.
  - The setup process for databases has been improved with a new
    asynchronous setup function.

- **Chores**
  - Dependency and configuration updates improve overall stability and
    performance.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-08 02:00:15 +01:00
Boris Arzentar
b89a4b8054 Merge remote-tracking branch 'origin/main' into code-graph 2024-12-03 21:14:19 +01:00
Boris Arzentar
d6f0d65b63 Merge remote-tracking branch 'origin/code-graph' 2024-12-01 11:51:54 +01:00
Rita Aleksziev
a4c56f118d Connect code graph pipeline + retriever + benchmarking 2024-11-29 15:24:49 +01:00
Rita Aleksziev
f47b185a9e feat/add correctness score calculation with LLM as a judge 2024-11-27 10:53:48 +01:00
0xideas
80b06c3acb
test: Test for code graph enrichment task
Co-authored-by: lxobr <lazar@topoteretes.com>
2024-11-24 19:24:47 +01:00
Leon Luithlen
e0e93ae379 Clean up notebook merge request 2024-11-12 09:04:43 +01:00
Leon Luithlen
9fe1b6c5fa Add code_graph_demo notebook 2024-11-12 09:01:03 +01:00
Igor Ilic
33c3748d1e refactor: Renamed .anonymous_id file to anon_id
Renamed .anonymous_id file to anon_id

Refactor #COG-492
2024-11-11 11:53:09 +01:00
Boris
2f832b190c
fix: various fixes for the deployment
* fix: remove groups from UserRead model

* fix: add missing system dependencies for postgres

* fix: change vector db provider environment variable name

* fix: WeaviateAdapter retrieve bug

* fix: correctly return data point objects from retrieve method

* fix: align graph object properties

* feat: add node example
2024-10-22 11:26:48 +02:00
Boris
dc187a81d7
feat: migrate search to tasks (#144)
* fix: don't return anything on health endpoint

* feat: add alembic migrations

* feat: align search types with the data we store and migrate search to tasks
2024-10-07 14:41:35 +02:00
Vasilije
c9b2a06dff rewrote configs 2024-06-10 13:40:05 +02:00
Vasilije
460583a40f added gitignore updates 2024-06-02 18:08:29 +02:00
Boris Arzentar
84c0c8cab5 feat: add llm config 2024-05-22 22:36:30 +02:00
Boris Arzentar
1ac28f4cb8 feat: add initial cognee frontend 2024-05-17 13:42:14 +02:00
Boris
219afbce68
feat: add lancedb vector storage [COG-176] (#90)
* feat: integrate lancedb

* fix: use futures in weaviate adapter to enable async behaviour
2024-05-03 10:35:41 +02:00
Boris Arzentar
370b74988e chore: add functions to improve user experience 2024-03-30 15:25:34 +01:00
Boris Arzentar
8d4be049f4 feat: add support for text and file in cognee.add 2024-03-29 13:53:59 +01:00
Boris Arzentar
2a7a545dcc fix: remove unnecessary files 2024-03-13 17:28:52 +01:00
Boris Arzentar
d5391f903c chore: rename package in files 2024-03-13 16:27:07 +01:00
Boris Arzentar
260a21fc22 Merge remote-tracking branch 'origin/feat/COG-24-add-qdrant' into feat/COG-24-add-qdrant 2024-03-12 20:55:31 +01:00
Boris Arzentar
769d6b5080 feat: add create-memory and remember API endpoints
Add the ability to create a new vector memory and store text data points using OpenAI embeddings.
2024-02-25 23:56:50 +01:00
Boris Arzentar
47c3463406 chore: add debugpy and update readme 2024-02-15 10:13:19 +01:00
Vasilije
9d87eb3c23
Merge branch 'main' into code_review 2023-08-25 12:12:46 +02:00
Vasilije
a746739d32 added first pass of the code review 2023-08-24 17:53:53 +02:00
burnash
3e654dfd51 Add Python.gitignore 2023-08-17 10:43:16 +02:00