## Description
Bump to the latest llama-index cognee integration version, which includes a proper fix for the failing notebook.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
## Summary by CodeRabbit
- **Chores**
  - Updated an AI integration dependency to version 0.1.3 in both the testing workflow and the Jupyter notebook, ensuring that the environment uses the latest version for improved consistency during tests.
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
## Description
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
## Summary by CodeRabbit
- **New Features**
  - Enhanced pipeline execution now provides consolidated status feedback with improved telemetry for start, completion, and error events.
  - Automatic generation of unique dataset identifiers offers clearer task and pipeline run associations.
- **Refactor**
  - Task execution has been streamlined with explicit parameter handling for more structured pipeline processing.
  - Interactive examples and demos now return results directly, making integration and monitoring more accessible.
---------
Co-authored-by: Boris Arzentar <borisarzentar@gmail.com>
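For illustration only, a rough sketch of how the consolidated status feedback and unique dataset identifiers described above could fit together; the function names, UUID namespace, and logging calls here are assumptions rather than cognee's actual implementation.

```python
import logging
import uuid

logger = logging.getLogger(__name__)


def generate_dataset_id(dataset_name: str, user_id: str) -> uuid.UUID:
    # Deterministic ID: repeated runs over the same dataset map to the same UUID.
    return uuid.uuid5(uuid.NAMESPACE_OID, f"{user_id}:{dataset_name}")


async def run_pipeline(tasks, dataset_name: str, user_id: str):
    dataset_id = generate_dataset_id(dataset_name, user_id)
    run_id = uuid.uuid4()
    logger.info("pipeline run started", extra={"dataset_id": str(dataset_id), "run_id": str(run_id)})
    try:
        results = [await task() for task in tasks]  # tasks assumed to be async callables
        logger.info("pipeline run completed", extra={"dataset_id": str(dataset_id), "run_id": str(run_id)})
        return results
    except Exception:
        logger.exception("pipeline run errored", extra={"dataset_id": str(dataset_id), "run_id": str(run_id)})
        raise
```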
## Description
Notebook and Python version of the cognee simple example.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
## Summary by CodeRabbit
- **New Features**
  - Introduced an interactive demo showcasing asynchronous document processing and querying for key insights from a sample text.
- **Documentation**
  - Added an in-depth, step-by-step guide in a Jupyter Notebook that walks users through setup, configuration, querying, and visualizing processed data.
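A minimal sketch of the kind of asynchronous demo described above, assuming cognee's public add / cognify / search API; exact imports, parameters, and output format may differ from the notebook.

```python
import asyncio

import cognee


async def main():
    # Ingest a sample text, build the knowledge graph, then query it.
    await cognee.add("Cognee turns documents into a queryable knowledge graph.")
    await cognee.cognify()

    results = await cognee.search(query_text="What does cognee do?")
    for result in results:
        print(result)


if __name__ == "__main__":
    asyncio.run(main())
```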
## Description
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
## Summary by CodeRabbit
- **New Features**
  - Graph visualizations now allow exporting to a user-specified file path for more flexible output management.
  - The text embedding process has been enhanced with an additional tokenizer option for improved performance.
  - A new `ExtendableDataPoint` class has been introduced for future extensions.
  - New JSON files for companies and individuals have been added to facilitate testing and data processing.
- **Improvements**
  - Search functionality now uses updated identifiers for more reliable content retrieval.
  - Metadata handling has been streamlined across various classes by removing unnecessary type specifications.
  - Enhanced serialization of properties in the Neo4j adapter for improved handling of complex structures.
  - The setup process for databases has been improved with a new asynchronous setup function.
- **Chores**
  - Dependency and configuration updates improve overall stability and performance.
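A hypothetical call illustrating the export-to-path option mentioned above; the helper name `visualize_graph` and its signature are assumptions here, not a confirmed API.

```python
import asyncio

import cognee


async def export_graph(output_path: str = "./graph_visualization.html"):
    # Assumed helper: render the current knowledge graph to a user-specified file.
    await cognee.visualize_graph(output_path)


asyncio.run(export_graph())
```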
## Description
Refactor search so the query type doesn't need to be provided, making it simpler for new users.
## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin
## Summary by CodeRabbit
- **Refactor**
  - Improved the search interface by standardizing parameter usage with explicit keyword arguments for specifying search types, enhancing clarity and consistency.
- **Tests**
  - Updated test cases and example integrations to align with the revised search parameters, ensuring consistent behavior and reliable validation of search outcomes.
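A rough before/after sketch of the standardized keyword usage, assuming a SearchType enum and a query_text parameter; the import path and parameter names are assumptions.

```python
import asyncio

import cognee
from cognee import SearchType  # assumed re-export; may live under cognee.api.v1.search


async def demo():
    # New-user friendly: omit the search type and rely on the default.
    results = await cognee.search(query_text="What do the documents say about pipelines?")

    # Explicit keyword argument when a specific search type is needed.
    insights = await cognee.search(
        query_text="What do the documents say about pipelines?",
        query_type=SearchType.INSIGHTS,
    )
    print(results, insights)


asyncio.run(demo())
```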
Added support for multiple audio and image formats, with an example.
The added formats correspond to the extensions the filetype library can return for audio and image files.
Feature COG-507
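A small sketch of the detection this relies on, using the filetype library directly; it is illustrative and not cognee's actual ingestion code.

```python
from typing import Optional

import filetype


def detect_media_extension(file_path: str) -> Optional[str]:
    """Return the extension filetype reports (e.g. 'png', 'mp3'), or None if unsupported."""
    kind = filetype.guess(file_path)
    if kind is None:
        return None
    # kind.mime is e.g. "image/png" or "audio/mpeg"; kind.extension is e.g. "png" or "mp3".
    if kind.mime.startswith(("image/", "audio/")):
        return kind.extension
    return None


print(detect_media_extension("sample.mp3"))
```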
By making PGVector a singleton, all issues regarding timeouts are resolved, as there are no longer parallel instances trying to communicate with the database.
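A minimal sketch of the singleton approach described above; the adapter class and constructor are illustrative, not the actual PGVector adapter.

```python
# One shared adapter instance means no parallel connections competing
# (and timing out) against the same Postgres/pgvector database.
_pgvector_instance = None


class PGVectorAdapter:
    def __init__(self, connection_string: str):
        # A real adapter would create its engine / connection pool here.
        self.connection_string = connection_string


def get_pgvector_adapter(connection_string: str) -> PGVectorAdapter:
    global _pgvector_instance
    if _pgvector_instance is None:
        _pgvector_instance = PGVectorAdapter(connection_string)
    return _pgvector_instance
```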
* feat: log search queries and results
* fix: address coderabbit review comments
* fix: parse UUID when logging search results
* fix: remove custom UUID type and use the DB-agnostic UUID type from SQLAlchemy (see the sketch below)
* Add new cognee_db
---------
Co-authored-by: Leon Luithlen <leon@topoteretes.com>
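A rough sketch of query/result logging tables built on SQLAlchemy's dialect-agnostic Uuid type (SQLAlchemy 2.0+); the table and column names are assumptions.

```python
import uuid

from sqlalchemy import Text, Uuid
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Query(Base):
    __tablename__ = "queries"  # hypothetical table name

    id: Mapped[uuid.UUID] = mapped_column(Uuid, primary_key=True, default=uuid.uuid4)
    user_id: Mapped[uuid.UUID] = mapped_column(Uuid)
    query_text: Mapped[str] = mapped_column(Text)


class Result(Base):
    __tablename__ = "results"  # hypothetical table name

    id: Mapped[uuid.UUID] = mapped_column(Uuid, primary_key=True, default=uuid.uuid4)
    query_id: Mapped[uuid.UUID] = mapped_column(Uuid)
    payload: Mapped[str] = mapped_column(Text)
```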
* feat: Add support for LlamaIndex Document type
Added support for LlamaIndex Document type
Feature #COG-337
* docs: Add Jupyter Notebook for cognee with llama index document type
Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage
Docs #COG-337
* feat: Add metadata migration from LlamaIndex document type
Allow usage of metadata from LlamaIndex documents (see the sketch after this commit list)
Feature #COG-337
* refactor: Change llama index migration function name
Change name of llama index function
Refactor #COG-337
* chore: Add llama index core dependency
Downgrade needed on tenacity and instructor modules to support llama index
Chore #COG-337
* Feature: Add ingest_data_with_metadata task
Added task that will have access to metadata if data is provided from different data ingestion tools
Feature #COG-337
* docs: Add description on why specific type checking is done
Explained why exact type checking is used instead of isinstance, since isinstance returns True for child classes as well
Docs #COG-337
* fix: Add missing parameter to function call
Added missing parameter to function call
Fix #COG-337
* refactor: Move storing of data from async to sync function
Moved data storing from async to sync
Refactor #COG-337
* refactor: Pretend ingest_data was changed instead of having two tasks
Refactor so ingest_data file was modified instead of having two ingest tasks
Refactor #COG-337
* refactor: Use old name for data ingestion with metadata
Merged new and old data ingestion tasks into one
Refactor #COG-337
* refactor: Return ingest_data and save_data_to_storage Tasks
Returned ingest_data and save_data_to_storage tasks
Refactor #COG-337
* refactor: Return previous ingestion Tasks to add function
Returned previous ingestion tasks to add function
Refactor #COG-337
* fix: Remove dict and use string for search query
Remove dictionary and use string for query in notebook and simple example
Fix COG-337
* refactor: Add changes requested in pull request
Added the following changes that were requested in the pull request:
added the synchronize label,
made the if-statement syntax in the workflow uniform,
fixed the instructor dependency,
made llama-index an optional dependency
Refactor COG-337
* fix: Resolve issue with llama-index being mandatory
Resolve issue with llama-index being mandatory to run cognee
Fix COG-337
* fix: Add install of llama-index to notebook
Removed additional references to llama-index from core cognee lib.
Added llama-index-core install to the notebook
Fix COG-337
---------
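A minimal sketch of passing LlamaIndex Document objects (with their metadata) into cognee, as the commits above describe; whether cognee.add accepts them exactly like this is an assumption.

```python
import asyncio

import cognee
from llama_index.core import Document


async def main():
    documents = [
        Document(
            text="Cognee builds a knowledge graph from ingested documents.",
            metadata={"source": "example.txt", "author": "demo"},
        )
    ]
    # The commits above add handling for LlamaIndex Document objects
    # and migrate their metadata during ingestion.
    await cognee.add(documents)
    await cognee.cognify()


asyncio.run(main())
```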
* fix: remove groups from UserRead model
* fix: add missing system dependencies for postgres
* fix: change vector db provider environment variable name
* fix: WeaviateAdapter retrieve bug
* fix: correctly return data point objects from retrieve method
* fix: align graph object properties
* feat: add node example
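For the node example above, a rough sketch of custom graph nodes, assuming a Pydantic-style DataPoint base class; the import path and the index_fields metadata convention are assumptions and may differ from the actual example.

```python
from cognee.infrastructure.engine import DataPoint  # assumed import path


class Person(DataPoint):
    name: str
    metadata: dict = {"index_fields": ["name"]}  # assumed convention for embeddable fields


class Company(DataPoint):
    name: str
    employees: list[Person]
    metadata: dict = {"index_fields": ["name"]}


# Nested data points can express graph edges (Company -> employs -> Person).
company = Company(name="Example Corp", employees=[Person(name="Ada")])
```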