Commit graph

46 commits

Author SHA1 Message Date
chinu0609
7b31b86f10 fix: reverting the lancedb change 2025-10-22 20:55:59 +05:30
chinu0609
b47cb7462d fix: Update code for Ollama API compatibility with newer version 2025-10-22 19:55:45 +05:30
Igor Ilic
5528097e29 Merge branch 'main' into merge-main-vol6 2025-09-27 00:06:33 +02:00
Igor Ilic
4054307b15 refactor: Remove comment 2025-09-25 16:03:11 +02:00
Andrej Milicevic
9b6e1a8f0c test:Add tests for limit=None search 2025-09-23 12:46:51 +02:00
Andrej Milicevic
e3cde238ff refactor: Change limit=0 to limit=None in vector search. Initial commit, still wip. 2025-09-19 12:31:25 +02:00
vasilije
38bbfd42cf added lancedb pandas removal 2025-08-27 19:14:16 +02:00
hajdul88
544e08930b feat: removing invalidValueErrors 2025-08-13 14:42:57 +02:00
Igor Ilic
a75a79f012
Lancedb async lock (#1222)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-08-12 08:46:15 -04:00
Boris
46c4463cb2
feat: s3 storage (#988)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-07-14 21:47:08 +02:00
Igor Ilic
456f3b58c0
Mcp test (#980)

## Description
Add test of MCP functionality and starting of MCP server, fix some MCP and LanceDB
issues

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-06-13 07:52:48 -04:00
Daniel Molnar
b5ebed1f7d
Docstring infrastructure. (#880)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-05-28 17:47:31 +02:00
Boris
0f3522eea6
fix: cognee docker image (#820)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-05-15 10:05:27 +02:00
Igor Ilic
9c131f0d14
refactor: Update lanceDB and change delete to work async (#770)

## Description
Update LanceDB and rewrite data points to run async

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: Boris <boris@topoteretes.com>
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-05-12 17:35:24 +02:00
Boris
cd9c4897a4
feat: remove get_distance_from_collection_names and adapt search (#766)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-30 11:11:07 +02:00
Vasilije
bb7eaa017b
feat: Group DataPoints into NodeSets (#680)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2025-04-19 20:21:04 +02:00
Boris
675b66175f
test: make search unit tests deterministic (#726)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: Daniel Molnar <soobrosa@gmail.com>
2025-04-18 21:55:24 +02:00
Daniel Molnar
9ba12b25ef
feat: add delete by document (#668)

## Description
Delete by document.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-04-17 15:42:10 +02:00
Boris
f9e6dcf837
fix: simplify code pipeline (#529)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit


- **New Features**
  - Enhanced code search and dependency analysis for improved accuracy.
  - Introduced a new high-performance text embedding option.
  - Added an additional execution entry point for code graph processing.
  - New optional parameters for flexible property selection in retrieval functions.
  - Introduced new classes for handling import statements, function definitions, and class definitions.
  - Updated embedding engine selection based on configuration options.

- **Bug Fixes**
  - Improved error handling in search operations and database queries for a more stable user experience.
  - Enhanced error logging for source code parsing.

- **Refactor**
  - Streamlined asynchronous processing and refactored internal dependency extraction.
  - Updated configuration and integration settings to enhance overall reliability.
  - Restructured functions for simplified dependency handling.

- **Chores**
  - Upgraded and reorganized dependency management with optional libraries for extended functionality.
  - Added new secret parameters for embedding configuration in workflow settings.


---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-12 23:58:48 +01:00
Boris
8f84713b54
fix: support structured data conversion to data points (#512)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin



## Summary by CodeRabbit

- New Features
  - Introduced version tracking and enhanced metadata in core data models for improved data consistency.

- Bug Fixes
  - Improved error handling during graph data loading to prevent disruptions from unexpected identifier formats.

- Refactor
  - Centralized identifier parsing and streamlined model definitions, ensuring smoother and more consistent operations across search, retrieval, and indexing workflows.

2025-02-10 17:16:13 +01:00
Boris
f75e35c337
fix: custom model pipeline (#508)

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


## Summary by CodeRabbit


- **New Features**
  - Graph visualizations now allow exporting to a user-specified file path for more flexible output management.
  - The text embedding process has been enhanced with an additional tokenizer option for improved performance.
  - A new `ExtendableDataPoint` class has been introduced for future extensions.
  - New JSON files for companies and individuals have been added to facilitate testing and data processing.

- **Improvements**
  - Search functionality now uses updated identifiers for more reliable content retrieval.
  - Metadata handling has been streamlined across various classes by removing unnecessary type specifications.
  - Enhanced serialization of properties in the Neo4j adapter for improved handling of complex structures.
  - The setup process for databases has been improved with a new asynchronous setup function.

- **Chores**
  - Dependency and configuration updates improve overall stability and performance.

2025-02-08 02:00:15 +01:00
hajdul88
935763b08d fix: fixing changed lancedb search + pruning 2025-01-16 17:32:44 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
alekszievr
bfa0f06fb4
Add type to DataPoint metadata (#364)
* Add type to DataPoint metadata

* Add missing index_fields

* Use DataPoint UUID type in pgvector create_data_points

* Make _metadata mandatory everywhere
2024-12-16 16:27:03 +01:00
Boris Arzentar
0b8b270933 fix: make get_embeddable_data static 2024-12-03 21:47:23 +01:00
Boris Arzentar
27416afed0 fix: lancedb batch merge 2024-12-03 21:13:50 +01:00
Boris Arzentar
e07364fc25 Merge remote-tracking branch 'origin/main' into code-graph 2024-12-03 12:44:57 +01:00
Igor Ilic
343ac47fd4 fix: Update import location for LanceDB
Updated import path for LanceDB exceptions

Fix COG-502
2024-12-02 13:19:55 +01:00
Igor Ilic
04960eeb4e Merge branch 'main' of github.com:topoteretes/cognee-private into COG-502-backend-error-handling 2024-12-02 13:12:20 +01:00
Boris
6403d15a76
fix: enable falkordb and add test for it (#31) 2024-11-27 22:55:30 +01:00
Boris Arzentar
d885a047ac Merge remote-tracking branch 'origin/main' into code-graph 2024-11-27 22:54:49 +01:00
Igor Ilic
ae568409a7 feat: Add custom exceptions to cognee lib
Added use of custom exceptions to cognee lib
2024-11-27 14:29:33 +01:00
hajdul88
3146ef75c9 Fix: renames new vector db and cogneegraph methods 2024-11-27 13:47:26 +01:00
Boris
64b8aac86f
feat: code graph swe integration
Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
Co-authored-by: hande-k <handekafkas7@gmail.com>
Co-authored-by: Igor Ilic <igorilic03@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2024-11-27 09:32:29 +01:00
hajdul88
163bdc527c chore: fixes PR issues regarding vector normalization and cognee graph 2024-11-26 15:37:34 +01:00
hajdul88
44ac9b68b4 feat: adds get_distances from collection method to LanceDB and PgVector 2024-11-19 16:39:45 +01:00
hajdul88
e988a67466 Fixes LanceDB datapoint add 2024-11-11 19:28:17 +01:00
Boris
52180eb6b5
feat: COG-184 add falkordb (#192)
* feat: add falkordb adapter

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2024-11-11 18:20:52 +01:00
Igor Ilic
3567e0d7e7 fix: Fix chunk naive llm classifier
Fixed chunk naive llm classifier uuid issue, added fix for deletion of data points for LanceDB

Fix #COG-472
2024-10-31 00:42:18 +01:00
Igor Ilic
dc46304a8d fix: Add missing await statement to LanceDBAdapter and PGVectorAdapter
Added missing await statement to batch search for LanceDB and PGVector adapters

Fix #COG-170
2024-10-22 15:15:45 +02:00
Boris
26bca0184f
feat: add entity and entity type nodes to vector db (#126)
* feat: add entity and entity type nodes to vector db

* fix: use uuid5 as entity ids

* fix: id -> uuid and LanceDB collection model
2024-08-01 14:21:39 +02:00
Boris
14555a25d0
feat: pipelines and tasks (#119)
* feat: simple graph pipeline

* feat: implement incremental graph generation

* fix: various bug fixes

* fix: upgrade weaviate-client

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2024-07-20 16:49:00 +02:00
Boris Arzentar
84c0c8cab5 feat: add llm config 2024-05-22 22:36:30 +02:00
Vasilije
db0b19bb30 async embeddings fix + processing fix + decouple issues with the lancedb 2024-05-17 11:13:39 +02:00
Boris
219afbce68
feat: add lancedb vector storage [COG-176] (#90)
* feat: integrate lancedb

* fix: use futures in weaviate adapter to enable async behaviour
2024-05-03 10:35:41 +02:00
Vasilije
212e5dcf78
Cog 174 (#84)
* Add telemetry

* test: add github action test

* fix: create graph only once

* fix: handle graph file not existing while deleting it

* fix: close qdrant connection in methods

---------

Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
2024-04-26 00:16:03 +02:00