Commit graph

2174 commits

Author SHA1 Message Date
Boris
9a1e03e403
fix: simplify installation in readme (#577)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
  - Enhanced overall clarity and layout of the guide.
  - Updated text alignment and visual elements, including an updated logo.
  - Revised header hierarchy for a more intuitive reading experience.
  - Added detailed installation instructions with specific database
    support.
  - Reorganized contributing guidelines and the code of conduct for
    improved structure.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-24 20:36:22 +01:00
Igor Ilic
4f354ba534
fix: reuse PostgreSQL database connections (#574)
<!-- .github/pull_request_template.md -->

## Description
Fix PostgreSQL database connection problems

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Improved the system’s database connection process to enhance
compatibility across multiple relational databases. The application now
dynamically selects the optimal connection method—reusing established
connections when possible—to ensure improved stability and performance
without affecting the public interface.
- Streamlined the creation of the embedding engine by removing it as a
parameter and generating it internally.
- Removed dependency on the embedding engine in the vector engine
retrieval process.
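
The connection-reuse idea in this summary can be illustrated with a minimal sketch. This is not the project's implementation: `get_connection` is a hypothetical helper, and `sqlite3` stands in for PostgreSQL only so the example runs without a database server.

```python
import sqlite3
from functools import lru_cache


@lru_cache(maxsize=None)
def get_connection(database_path: str) -> sqlite3.Connection:
    """Return one shared connection per database path (hypothetical helper).

    sqlite3 is an assumption made for this sketch; it replaces PostgreSQL
    so the code is self-contained.
    """
    return sqlite3.connect(database_path)


first = get_connection(":memory:")
second = get_connection(":memory:")

# Both calls return the cached object, so no second connection is opened.
print(first is second)
```

Caching on the connection parameters keeps the public call unchanged while avoiding a new connection on every request.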
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-24 20:35:40 +01:00
Vasilije
6e567445b5
Update README.md 2025-02-21 18:51:24 +01:00
alekszievr
a61df966c6
feat: use external chunker [cog-1354] (#551)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a modular content chunking interface that offers flexible
text segmentation with configurable chunk size and overlap.
- Added new chunkers for enhanced text processing, including
`LangchainChunker` and improved `TextChunker`.

- **Refactor**
- Unified the chunk extraction mechanism across various document types
for improved consistency and type safety.
- Updated method signatures to enhance clarity and type safety regarding
chunker usage.
- Enhanced error handling and logging during text segmentation to guide
adjustments when content exceeds limits.

- **Bug Fixes**
- Adjusted expected output in tests to reflect changes in chunking logic
and configurations.
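
A rough sketch of segmentation with a configurable chunk size and overlap, as the summary describes it. The function `chunk_words` and its word-based splitting are assumptions for illustration; the project's `TextChunker` and `LangchainChunker` may work differently.

```python
from typing import Iterator, List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> Iterator[str]:
    """Yield chunks of at most `chunk_size` words (hypothetical helper).

    Consecutive chunks share `overlap` words.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words: List[str] = text.split()
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        yield " ".join(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the text


chunks = list(chunk_words("a b c d e f g", chunk_size=3, overlap=1))
print(chunks)
```

Requiring `overlap < chunk_size` guarantees that each step moves forward, so the loop always terminates.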
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-21 14:10:59 +01:00
hajdul88
eba1515127
feat: quick fix dynamic collection handling in search (#567) [COG-1369]
<!-- .github/pull_request_template.md -->

## Description
Fixes search dynamic collection mapping in graph completion search

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Adjusted graph processing to remove extraneous notifications when
expected data elements are absent.
- Updated query processing to ensure a more consistent selection of
related data types.
- Streamlined database error handling by aligning exception management
with standard practices.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-21 13:45:42 +01:00
SJ
fd3b15fb58
fix: entrypoint.sh to not fail on first docker up, improved handling of migrations, signals and errors. (#546)
<!-- .github/pull_request_template.md -->

## Description
In its current form, the entrypoint.sh script runs but fails with
exit code 3 on the first docker compose up. Running docker compose up a
second time does not throw the same error, and the application works
fine. These changes improve the first-time user experience and a few
other aspects.

Summary of Changes:
1. entrypoint.sh no longer fails with exit code 3 on first docker up.
2. Improved error and signal handling with set -e.
3. Improved database migration, verification and error handling. Avoids
schema version mismatch and ensures the db schema is always in sync with
application code.
4. Added exec before Gunicorn commands to ensure proper signal handling.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
  - Improved error handling for smoother database migrations and startup.
  - Updated process management to ensure reliable application launch.
  - Optimized worker configuration and introduced a startup delay to
    guarantee database readiness.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: soekja <soekja@users.noreply.github.com>
Co-authored-by: soekja <soekja@users.noreply.github.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-02-21 01:28:15 +01:00
alekszievr
28f92f661e
Test: Mock file download and open in musique adapter (#571)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
  - Enhanced test coverage to improve adapter instantiation and data
    loading reliability.
  - Updated mock testing logic to ensure robust content handling.
  - Removed an outdated test focused on data limit validation.
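
Mocking a file open in a test might look like the sketch below. `load_items` is a hypothetical stand-in for an adapter's loader, not the MusiqueQAAdapter code, and the download step is left out.

```python
import json
from unittest.mock import mock_open, patch


def load_items(path: str) -> list:
    """Read a JSON list from disk (hypothetical stand-in for an adapter loader)."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


fake_file = mock_open(read_data=json.dumps([{"question": "q1", "answer": "a1"}]))

# Patch the built-in open so the test never touches the real file system.
with patch("builtins.open", fake_file):
    items = load_items("does_not_exist.json")

print(items)
```

Because `open` is patched, the test stays fast and does not depend on a dataset being present on the machine that runs it.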

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 16:11:19 +01:00
alekszievr
97db017708
Test: test corpus builder [cog-1234] (#564)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Enhanced the continuous integration workflows with updated dependency
management and environment configurations for improved test stability.
  
- **Tests**
- Added parameterized unit tests to verify corpus loading and structure,
ensuring more robust handling of test data.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 15:16:58 +01:00
alekszievr
17231de5d0
Test: Parse context pieces separately in MusiqueQAAdapter and adjust tests [cog-1234] (#561)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Tests**
- Updated evaluation checks by removing assertions related to the
relationship between `corpus_list` and `qa_pairs`, now focusing solely
on `qa_pairs` limits.

- **Refactor**
- Improved content processing to append each paragraph individually to
`corpus_list`, enhancing clarity in data structure.
- Simplified type annotations in the `load_corpus` method across
multiple adapters, ensuring consistency in return types.

- **Chores**
- Updated dependency installation commands in GitHub Actions workflows
for Python 3.10, 3.11, and 3.12 to include additional evaluation-related
dependencies.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-02-20 14:23:53 +01:00
lxobr
e25c7c93fe
fix: correctly add nodes to chunks [COG-1370] (#568)
<!-- .github/pull_request_template.md -->

## Description
- Fix expand_with_nodes_and_edges to correctly add nodes to chunks

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Enhanced the internal processing for data associations to ensure more
reliable and consistent handling of connections.
- Streamlined the logic to better manage edge cases, improving overall
stability and error handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-20 12:52:34 +01:00
Igor Ilic
f2e0f47565
fix: test llm connection with gemini (#557)
<!-- .github/pull_request_template.md -->

## Description
Temporary fix for Gemini LLM until they allow empty dictionaries in
model schema definition

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- AI responses now adjust their format dynamically based on the type of
output, providing a streamlined text display when appropriate.
- Extended processing time improves the handling of longer operations
for a more reliable interaction.

- **Bug Fixes**
- Enhanced error management during connectivity tests ensures a more
robust and stable user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-20 11:41:29 +01:00
Boris
45f7c63322
fix: notebooks errors (#565)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Automatically creates a blank graph when a file isn’t found, ensuring
smoother operations.
- Updated demonstration notebooks with dynamic configurations, including
refined search operations and input prompts.
- Introduced optional support for additional graph functionalities via
an integrated dependency.

- **Refactor**
- Streamlined processing by eliminating duplicate steps and simplifying
graph rendering workflows.

- **Chores**
- Updated environment configurations and upgraded the Python runtime for
improved performance and consistency.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 14:07:11 -08:00
Boris Arzentar
811e932cae version: v0.1.29 2025-02-19 20:19:51 +01:00
Boris
ada466879e
fix: add default params to run_tasks (#563)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced the task execution process by enabling default values for
certain parameters, allowing users to trigger task processing without
supplying every input explicitly.
  
- **Bug Fixes**
- Adjusted asynchronous handling for the `retrieved_edges_to_string`
function to ensure proper execution flow in various components.

- **Documentation**
- Updated markdown formatting in the Jupyter notebook for improved
readability and structure.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: hajdul88 <52442977+hajdul88@users.noreply.github.com>
2025-02-19 20:18:51 +01:00
alekszievr
e56d86b410
feat: Implement optional neo4j metrics and improve tests [cog-1262] (#556)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced graph analytics now offer detailed metrics—including shortest
path lengths, diameter, and clustering coefficients—to provide deeper
insights.
- Added new functions for creating connected test graphs and validating
metrics against predefined ground truth values.
- Introduced a new JSON file containing metrics for connected and
disconnected graph structures.

- **Improvements**
- Updated how graphs are projected to consistently use undirected
representations, ensuring more accurate and reliable metric
calculations.
- Streamlined metric consistency checks across different graph
processing methods for robust, reliable results.
- Simplified testing logic by consolidating metric assertions into a
single function call.

- **Chores**
  - Removed unnecessary secret variables from the workflow configuration,
    potentially affecting access to certain resources.
  - Updated secret management to include the new `OPENAI_API_KEY`.
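
The metrics named in this summary (shortest path lengths, diameter, clustering coefficients) can be sketched on an undirected projection using only the standard library. All function names here are hypothetical and the code is not taken from the project's graph adapters.

```python
from collections import deque
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Set[str]]


def to_undirected(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Project directed edges onto an undirected adjacency map."""
    graph: Graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def shortest_path_lengths(graph: Graph, start: str) -> Dict[str, int]:
    """Breadth-first search distances from `start` to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def diameter(graph: Graph) -> int:
    """Longest shortest path; this sketch assumes the graph is connected."""
    return max(max(shortest_path_lengths(graph, node).values()) for node in graph)


def average_clustering(graph: Graph) -> float:
    """Mean local clustering coefficient over all nodes."""
    coefficients = []
    for neighbors in graph.values():
        degree = len(neighbors)
        if degree < 2:
            coefficients.append(0.0)
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
        coefficients.append(2 * links / (degree * (degree - 1)))
    return sum(coefficients) / len(coefficients)


graph = to_undirected([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])
print(diameter(graph), average_clustering(graph))
```

Projecting to an undirected graph first means every edge is traversable in both directions, so the distance from one node to another is the same either way.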
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 16:24:59 +01:00
alekszievr
2a167fa1ab
feat: externalize chunkers [cog-1354] (#547)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced document chunk extraction for improved processing consistency
across multiple formats.

- **Refactor**
- Streamlined the configuration for text chunking by replacing indirect
mappings with a direct instantiation approach across document types.
- Updated method signatures across various document classes to accept
chunker class references instead of string identifiers.

- **Chores**
- Removed legacy configuration utilities related to document chunking to
simplify processing.
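
Passing a chunker class reference instead of a string identifier could look roughly like this. `Chunker`, `FixedSizeChunker`, and `Document` are hypothetical names invented for the sketch and do not reflect the project's actual signatures.

```python
from typing import Iterator, Type


class Chunker:
    """Hypothetical base class: splits text into pieces."""

    def __init__(self, text: str, max_chunk_size: int) -> None:
        self.text = text
        self.max_chunk_size = max_chunk_size

    def read(self) -> Iterator[str]:
        raise NotImplementedError


class FixedSizeChunker(Chunker):
    """Hypothetical chunker that cuts text every `max_chunk_size` characters."""

    def read(self) -> Iterator[str]:
        for start in range(0, len(self.text), self.max_chunk_size):
            yield self.text[start:start + self.max_chunk_size]


class Document:
    """Hypothetical document that receives the chunker class itself, not its name."""

    def __init__(self, text: str) -> None:
        self.text = text

    def read(self, chunker_cls: Type[Chunker], max_chunk_size: int) -> Iterator[str]:
        # Direct instantiation: no string-to-class lookup table is needed.
        chunker = chunker_cls(self.text, max_chunk_size)
        yield from chunker.read()


pieces = list(Document("abcdefg").read(chunker_cls=FixedSizeChunker, max_chunk_size=3))
print(pieces)
```

With a class reference, a type checker can verify the argument, and a misspelled chunker name fails at import time rather than at lookup time.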
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-19 13:26:11 +01:00
Hande
b10aef7b25
fix: update cognee logo on readme (#559)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
  - Updated the project's logo image in the documentation to a transparent
    version for a cleaner, more modern visual presentation.
  - Increased the logo height for improved visibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 11:52:50 +01:00
Vasilije
611b048020
feat: add auto dependency updater workflow (#548)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Introduced an automated process that regularly updates project
dependencies, enhancing stability and ensuring the app remains secure
and up-to-date.
- Removed an outdated workflow for profiling Python scripts,
streamlining the CI/CD process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-19 11:33:20 +01:00
Igor Ilic
424bd2127a
Cognee gui (#554)
<!-- .github/pull_request_template.md -->

## Description
Change cognee import to be part of try except clause
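
Wrapping an import in a try/except clause generally follows the pattern sketched here. `optional_import` is a hypothetical helper written for illustration; the commit itself may simply guard a plain `import cognee` statement.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Return the module if it can be imported, otherwise None (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught too.
        return None


json_module = optional_import("json")
missing_module = optional_import("module_that_does_not_exist_xyz")
print(json_module is not None, missing_module is None)
```

A caller could then write `cognee = optional_import("cognee")` and show a helpful message when the result is `None`, instead of crashing at startup.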

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-02-19 03:06:50 +01:00
hajdul88
0bcaf5c477
Feature/cog 1358 local ollama model support for cognee (#555)
<!-- .github/pull_request_template.md -->

This PR contains the ollama specific llm adapter together with the
embedding engine.

Tested with the following models:

`LLM_API_KEY="ollama"
llm_model = "llama3.1:8b"
LLM_PROVIDER = "ollama"
llm_endpoint = "http://localhost:11434/v1"
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="avr/sfr-embedding-mistral:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embeddings"
EMBEDDING_DIMENSIONS=4096
HUGGINGFACE_TOKENIZER="Salesforce/SFR-Embedding-Mistral"`

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced a new embedding option that leverages an external provider
for asynchronous text processing.
- Added enhanced language model integration using a dedicated adapter to
improve interaction quality.

- **Enhancements**
  - Expanded configuration settings to include a new tokenizer option.
  - Updated provider selection logic to incorporate the additional
    embedding and language model features.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-19 02:54:04 +01:00
Vasilije
e98d51aac9
Add musique adapter base (#525)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved data handling by updating the dataset file path and ensuring
answers are consistently converted to lowercase for reliable processing.
  
- **Tests**
- Introduced unit tests to validate that data adapters instantiate
correctly, return non-empty content, and respect specified limits.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-18 19:48:22 +01:00
alekszievr
4efdb29187
Summarize retrieved edges to compact string [COG-1181] (#522)
<!-- .github/pull_request_template.md -->

## Description
Summarize retrieved edges to compact string with no redundancies.
Example:
**Before summarization:**


CV example:

visual innovations -- employs -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- creativeworks agency
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- visual innovations
---

CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography
 -- contains -- rhode island school of design
---
Experienced Graphic Designer with over 8 years in visual design and
branding, specializing in Adobe Creative Suite and enthusiastic about
producing engaging visuals. -- made_from --
CV 4: Not Relevant
Name: David Thompson
Contact Information:

Email: david.thompson@example.com
Phone: (555) 456-7890
Summary:

Creative Graphic Designer with over 8 years of experience in visual
design and branding. Proficient in Adobe Creative Suite and passionate
about creating compelling visuals.

Education:

B.F.A. in Graphic Design, Rhode Island School of Design (2012)
Experience:

Senior Graphic Designer, CreativeWorks Agency (2015 – Present)
Led design projects for clients in various industries.
Created branding materials that increased client engagement by 30%.
Graphic Designer, Visual Innovations (2012 – 2015)
Designed marketing collateral, including brochures, logos, and websites.
Collaborated with the marketing team to develop cohesive brand
strategies.
Skills:

Design Software: Adobe Photoshop, Illustrator, InDesign
Web Design: HTML, CSS
Specialties: Branding and Identity, Typography

**After summarization:**

David Thompson is a Creative Graphic Designer with over 8 years of
experience in visual design and branding, proficient in Adobe Creative
Suite and passionate about creating compelling visuals. He holds a
B.F.A. in Graphic Design from the Rhode Island School of Design (2012).
His experience includes working as a Senior Graphic Designer at
CreativeWorks Agency (2015 – Present), where he led design projects and
created branding materials that increased client engagement by 30%, and
as a Graphic Designer at Visual Innovations (2012 – 2015), where he
designed marketing collateral and collaborated with the marketing team
to develop cohesive brand strategies. His skills include design software
such as Adobe Photoshop, Illustrator, and InDesign, as well as web
design in HTML and CSS, with specialties in Branding and Identity and
Typography.

1. David Thompson employs his skills in visual design and branding.
2. David Thompson contains experience from CreativeWorks Agency.
3. David Thompson contains experience from Visual Innovations.
4. David Thompson made his qualifications from the Rhode Island School
of Design.
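
The "before summarization" text above follows a `source -- relationship -- target` layout with `---` separators. A minimal sketch of producing that layout while dropping exact duplicate edges is given here. `edges_to_text` is a hypothetical name; the summarization step itself is not shown and is not reproduced by this code.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, str]  # (source text, relationship, target text)


def edges_to_text(edges: Iterable[Edge]) -> str:
    """Render edges as 'source -- relationship -- target', skipping exact repeats."""
    seen = set()
    lines: List[str] = []
    for source, relationship, target in edges:
        edge = (source.strip(), relationship.strip(), target.strip())
        if edge in seen:
            continue  # an identical edge was already rendered
        seen.add(edge)
        lines.append(" -- ".join(edge))
    return "\n---\n".join(lines)


text = edges_to_text(
    [
        ("visual innovations", "employs", "david thompson"),
        ("visual innovations", "employs", "david thompson"),
        ("cv 4", "contains", "creativeworks agency"),
    ]
)
print(text)
```

Removing exact duplicates only trims identical edges; the long node text repeated across different edges in the example would still need the summarization step.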

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a summarization engine that converts relationship-based
inputs into concise, natural sentences.
- Expanded search capabilities with a new query option that generates
graph summaries, providing insightful and aggregated results from graph
data.
- Enhanced asynchronous processing for improved performance in handling
graph data queries and summarization.
- Added flexibility in specifying string conversion methods for graph
edge retrieval.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-18 17:29:55 +01:00
Vasilije
fcb9298d4f
Update README.md 2025-02-18 03:10:17 +01:00
Vasilije
c09ff790cd
Update README.md (#550)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-02-18 03:09:42 +01:00
Vasilije
1b766522aa
Update README.md 2025-02-18 03:09:12 +01:00
Vasilije
af8bceeb2b
Update README.md (#549)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
2025-02-18 03:06:25 +01:00
Vasilije
82683cac0e
Update README.md 2025-02-18 03:05:38 +01:00
Vasilije
f77596817e
Update README.md 2025-02-18 03:04:46 +01:00
Vasilije
bbde039fed
Update README.md 2025-02-18 03:03:47 +01:00
Vasilije
b03eec20b7
Update README.md 2025-02-18 02:57:28 +01:00
Igor Ilic
09b9255639
GraphRAG vs RAG notebook (#503)
<!-- .github/pull_request_template.md -->

## Description
GraphRAG vs RAG cognee notebook

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
- Implemented automated validations to continuously monitor and ensure
the reliability of our interactive notebook features. These improvements
enhance overall stability and performance, enabling a more consistent
and dependable user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Boris <boris@topoteretes.com>
2025-02-18 01:21:14 +01:00
Ikko Eltociear Ashimine
e4fd6a1ccd
docs: update README.md (#542)
<!-- .github/pull_request_template.md -->

## Description
bellow -> below

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Updated README text to correct a typographical error and improve
clarity regarding the demo notebook and YouTube video.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-15 19:52:29 +01:00
Vasilije
978f2b640e
Update README.md 2025-02-15 05:40:55 +01:00
Vasilije
2072c7a081
feat: improve tests add macos runners (#540)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved automated testing setups to run across multiple operating
systems (Ubuntu and macOS) for Python 3.10, 3.11, and 3.12.
- Enhanced compatibility and coverage across diverse environments,
ensuring a more robust validation process.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: soekja <wes.hubert@gmail.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-15 04:19:19 +01:00
SJ
d05b49863c
Creation of default user to have is_superuser=True by default (#539)
<!-- .github/pull_request_template.md -->

## Description

God mode turned on by default for the default user creation.
is_superuser=True in create_default_user.py

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- The default user is now created with elevated (superuser) privileges,
which may affect access control and permissions.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-15 03:09:40 +01:00
Igor Ilic
ecf6a19ab8
Cognee gui add visualization [COG-1334] (#538)
<!-- .github/pull_request_template.md -->

## Description
Add visualization to Cognee GUI

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced an interactive data visualization option that opens
insights in your web browser.
- Enhanced the interface layout for a more streamlined file upload and
visualization experience.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-15 03:08:26 +01:00
Daniel Molnar
b9869b1241
Following poetry changes. (#526)
<!-- .github/pull_request_template.md -->

## Description
Following poetry changes to have a working install script.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Updated installation instructions to streamline the setup process
using Poetry.
- Now guides users through environment configuration, dependency
installation, and activation of the project environment.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Daniel Molnar <soobrosa@Daniels-MacBook-Pro.local>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-15 03:05:51 +01:00
Igor Ilic
ea88beb687
feat: Force .env file settings over real environment variable values [COG-1333] (#537)

## Description
Force .env file settings over real environment variable values
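
The intended precedence can be pictured with a minimal, stdlib-only sketch: values parsed from a `.env` file overwrite whatever is already set in the process environment. The helper name `load_env_file`, the variable `LLM_PROVIDER`, and the simple `KEY=VALUE` parsing are assumptions made for illustration, not the code in this commit. With the `python-dotenv` package, the comparable behaviour is `load_dotenv(override=True)`.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path, override=True):
    """Hypothetical helper: apply KEY=VALUE pairs from a .env-style file to os.environ."""
    applied = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("\"'")
        # With override=True the file wins even if the variable is already set.
        if override or key not in os.environ:
            os.environ[key] = value
            applied[key] = value
    return applied


# Demo: a value already present in the real environment is replaced by the file's value.
os.environ["LLM_PROVIDER"] = "from-shell"  # hypothetical variable name
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# example settings\nLLM_PROVIDER=from-dotenv\n")
    applied = load_env_file(env_path, override=True)
```

Passing `override=False` to the same helper gives the opposite precedence, where an already exported variable is left untouched.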

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
  - Refined configuration handling so that settings from your
    configuration file reliably take precedence over existing values,
    ensuring a smoother and more predictable experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-14 20:40:21 +01:00
Igor Ilic
46e026f77f
Cognee gui [COG-1307] (#530)

## Description
Add a simple GUI to add documents to Cognee and use GRAPH_COMPLETION
search to get answers

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced an interactive file search interface with intuitive
    controls. Users can easily upload files, enter search terms, and view
    results in a unified display with clear notifications during processing.

- **Chores**
  - Updated project dependencies to include `pyside6` and `qasync` for
    enhanced GUI functionality.
  - Refined background query processing to improve the accuracy and
    relevance of search outcomes.
  - Improved code readability with formatting enhancements in the search
    function.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-14 15:51:33 +01:00
SJ
a602094598
feat: Update parameters in search API route to match search function parameters order (#528)

## Description
**Updated handling of SearchType through the chain:**
- The router receives JSON with a `searchType` enum value, for example
  `"searchType": "CHUNKS"`.
- FastAPI converts it to the `SearchType` enum via `SearchPayloadDTO`.
- `search_v2.py` expects a `SearchType` enum.
- `search.py` takes the `SearchType` enum and extracts its value.
- `log_query.py` takes the string value.
- The `Query` model stores the string in the database.

**get_search_router.py**

- Matched the exact field name from the JSON payload, `searchType`
  instead of `search_type`, in the `SearchPayloadDTO` class.
- Changed the `cognee_search()` params to use `payload.query` and
  `payload.searchType`.

**search.py**
- Changed `query_type` to `SearchType`.
- `log_query` now accepts `query_type.value` instead of
  `str(query_type)`.
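
The reason for passing `query_type.value` rather than `str(query_type)` can be shown with a small sketch. It assumes `SearchType` is a plain `Enum` with string values; the class below is a stand-in with only the two member names that appear in this log, and `parse_search_type` is a hypothetical helper, not the project's code.

```python
from enum import Enum


class SearchType(Enum):
    # Stand-in definition; the real enum may have more members.
    CHUNKS = "CHUNKS"
    GRAPH_COMPLETION = "GRAPH_COMPLETION"


def parse_search_type(raw):
    """Convert the JSON string into the enum; unknown values raise ValueError."""
    return SearchType(raw)


query_type = parse_search_type("CHUNKS")

as_str = str(query_type)     # "SearchType.CHUNKS": includes the class name
as_value = query_type.value  # "CHUNKS": the plain string suited for storage
```

Storing `.value` keeps the database column equal to the string sent in the request payload, so it can be converted back with `SearchType(...)` later.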

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
  - Updated the search functionality to improve consistency and
    reliability.
  - Enhanced validation by switching to stricter search type checks,
    ensuring only valid search types are processed.
  - Maintained robust error handling for uninterrupted search operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
Co-authored-by: Boris <boris@topoteretes.com>
2025-02-13 21:04:31 +01:00
Boris Arzentar
20cbcbf52b fix: ruff error 2025-02-13 14:41:11 +01:00
Boris Arzentar
67c8edb853 version: v0.1.28 2025-02-13 13:17:00 +01:00
Boris Arzentar
d0d8559453 fix: consolidate api/sdk/mcp search 2025-02-13 13:15:39 +01:00
Boris Arzentar
fd9101af34 Merge remote-tracking branch 'origin/dev' 2025-02-13 00:02:02 +01:00
Boris Arzentar
e7b08def82 version: v0.1.27 2025-02-13 00:00:23 +01:00
Boris
f9e6dcf837
fix: simplify code pipeline (#529)

## Description

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit


- **New Features**
  - Enhanced code search and dependency analysis for improved accuracy.
  - Introduced a new high-performance text embedding option.
  - Added an additional execution entry point for code graph processing.
  - New optional parameters for flexible property selection in retrieval
    functions.
  - Introduced new classes for handling import statements, function
    definitions, and class definitions.
  - Updated embedding engine selection based on configuration options.

- **Bug Fixes**
  - Improved error handling in search operations and database queries for
    a more stable user experience.
  - Enhanced error logging for source code parsing.

- **Refactor**
  - Streamlined asynchronous processing and refactored internal dependency
    extraction.
  - Updated configuration and integration settings to enhance overall
    reliability.
  - Restructured functions for simplified dependency handling.

- **Chores**
  - Upgraded and reorganized dependency management with optional libraries
    for extended functionality.
  - Added new secret parameters for embedding configuration in workflow
    settings.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
2025-02-12 23:58:48 +01:00
lxobr
bb8cb692e0
Cog 1293 corpus builder custom cognify tasks (#527)

## Description
- Enable custom tasks in corpus building

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced a configurable option to specify the task retrieval
    strategy during corpus building.
  - Enhanced the workflow with integrated task fetching, featuring a
    default retrieval mechanism.
  - Updated evaluation configuration to support customizable task
    selection for more flexible operations.
  - Added a new abstract base class for defining various task retrieval
    strategies.
  - Introduced a new enumeration to map task getter types to their
    corresponding classes.

- **Dependencies**
  - Added a new dependency for downloading files from Google Drive.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
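
The abstract base class and the enumeration mentioned in the summary above could look roughly like the following sketch. Every name here (`BaseTaskGetter`, `DefaultTaskGetter`, `CustomTaskGetter`, `TaskGetterType`) and the task names in the list are hypothetical, chosen only to illustrate the pattern of selecting a retrieval strategy by a configured name; they are not taken from the commit.

```python
from abc import ABC, abstractmethod
from enum import Enum


class BaseTaskGetter(ABC):
    """Hypothetical base class: each strategy returns the list of tasks to run."""

    @abstractmethod
    def get_tasks(self):
        ...


class DefaultTaskGetter(BaseTaskGetter):
    def get_tasks(self):
        # Placeholder task names standing in for a default pipeline.
        return ["chunk_documents", "extract_graph", "summarize_text"]


class CustomTaskGetter(BaseTaskGetter):
    def __init__(self, tasks):
        self.tasks = list(tasks)

    def get_tasks(self):
        return self.tasks


class TaskGetterType(Enum):
    # Each member's value is the class that implements the strategy.
    DEFAULT = DefaultTaskGetter
    CUSTOM = CustomTaskGetter


# A configured string such as "DEFAULT" picks the class, which is then instantiated.
getter = TaskGetterType["DEFAULT"].value()
tasks = getter.get_tasks()
```

Looking the strategy up by member name means a configuration file only needs to carry a string, and an unknown name fails early with a `KeyError`.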
2025-02-12 16:44:08 +01:00
vasilije
e6db870264 Add musique adapter base 2025-02-11 17:16:48 -05:00
Vasilije
9ba2e0d6c1
chore: Fix and update visualization (#518)

## Description

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Introduced enhanced visualization capabilities that let users launch a
    dedicated server for visual displays.

- **Documentation**
  - Updated several interactive notebooks to include execution outputs and
    expanded explanatory content for better user guidance.

- **Style**
  - Refined formatting and layout across notebooks to ensure consistent
    presentation and improved readability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-02-11 19:25:01 +01:00
hajdul88
1b630366c9
Adds types property to pydantic Datapoint inherited classes (#523)

## Description
This PR adds a `type` field to the `DataPoint` pydantic class and fixes
visualization colors.

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
  - Added a `type` field to the `DataPoint` model for clearer data
    classification.
  - Enhanced color mapping in visualizations by assigning a distinct color
    to "TextSummary" nodes.

- **Refactor**
  - Improved default settings for version control and ordering to ensure
    consistent data behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-02-11 19:23:19 +01:00