Commit graph

21 commits

Author SHA1 Message Date
vasilije
e445dd7f8b ugly fix for the scraper 2025-10-12 10:44:08 +02:00
Geoff-Robin
d91ffa2ad6 Removed staticmethod decorator from bs4_crawler.py, kwargs from the function signature in save_data_item_to_storage.py, removed unused imports in ingest_data.py and added robots_cache_ttl as a config field in BeautifulSoupCrawler. 2025-10-07 20:56:23 +05:30
Geoff-Robin
1b5c099f8b CodeRabbit reviews solved 2025-10-06 17:15:25 +05:30
Geoff-Robin
2cba31a086 Tested and Debugged scraping usage in cognee.add() pipeline 2025-10-04 21:26:25 +05:30
Geoff-Robin
da7ebc4574 Removed asyncio import 2025-10-04 15:10:46 +05:30
Geoff-Robin
fbef6675bc removed unused Dict import from typing 2025-10-04 15:10:05 +05:30
Geoff-Robin
20fb77316c Done with integration with add workflow when incremental_loading is set to False 2025-10-04 15:01:13 +05:30
Boris Arzentar
236a84dd65
version: v0.3.4.dev3 2025-09-18 18:08:22 +02:00
Igor Ilic
e975cde3e7 fix: Resolve issue with path being too long when trying to determine if input is a relative path 2025-09-10 13:34:16 +02:00
Igor Ilic
15bedfc1a7 fix: Resolve issue with relative path on cognee add 2025-09-10 13:01:00 +02:00
Boris
c5bd6bed40
fix: s3 file storage (#1095)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-07-16 20:36:18 +02:00
Boris
46c4463cb2
feat: s3 storage (#988)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.

---------

Co-authored-by: vasilije <vas.markovic@gmail.com>
Co-authored-by: Vasilije <8619304+Vasilije1990@users.noreply.github.com>
2025-07-14 21:47:08 +02:00
Igor Ilic
bcd418151a
fix: Secure api v2 (#1060)
<!-- .github/pull_request_template.md -->

## Description
<!-- Provide a clear description of the changes in this PR -->

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-07-11 16:10:02 +02:00
Igor Ilic
cb45897d7d
Secure api v2 (#1050)
<!-- .github/pull_request_template.md -->

## Description
Modify endpoints to allow better security for different infrastructure
needs and setups

## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-07-07 20:41:43 +02:00
hajdul88
0121a2b5fc
feature: Adds S3 functionality (#731)
<!-- .github/pull_request_template.md -->

## Description
Adds S3 support


## DCO Affirmation
I affirm that all code in every commit of this pull request conforms to
the terms of the Topoteretes Developer Certificate of Origin.
2025-04-17 08:56:40 +02:00
Igor Ilic
0c7c1d7503 refactor: Refactor ingestion to only have one ingestion task 2025-01-20 14:33:47 +01:00
vasilije
60c8fd103b ruff format 2025-01-05 19:09:08 +01:00
Igor Ilic
0ce254b262 feat: Add text deduplication
If text is added to cognee it will be saved by hash so the same text can't be stored multiple times

Feature COG-505
2024-12-04 17:19:29 +01:00
Igor Ilic
eb09e5ad89 refactor: Moved ingestion exceptions to ingestion module
Moved custom ingestion exceptions to ingestion module

Refactor COG-502
2024-11-29 17:15:54 +01:00
Igor Ilic
ae568409a7 feat: Add custom exceptions to cognee lib
Added use of custom exceptions to cognee lib
2024-11-27 14:29:33 +01:00
Igor Ilic
d30adb53f3
Cog 337 llama index support (#186)
* feat: Add support for LlamaIndex Document type

Added support for LlamaIndex Document type

Feature #COG-337

* docs: Add Jupyer Notebook for cognee with llama index document type

Added jupyter notebook which demonstrates cognee with LlamaIndex document type usage

Docs #COG-337

* feat: Add metadata migration from LlamaIndex document type

Allow usage of metadata from LlamaIndex documents

Feature #COG-337

* refactor: Change llama index migration function name

Change name of llama index function

Refactor #COG-337

* chore: Add llama index core dependency

Downgrade needed on tenacity and instructor modules to support llama index

Chore #COG-337

* Feature: Add ingest_data_with_metadata task

Added task that will have access to metadata if data is provided from different data ingestion tools

Feature #COG-337

* docs: Add description on why specific type checking is done

Explained why specific type checking is used instead of isinstance, as isinstace returns True for child classes as well

Docs #COG-337

* fix: Add missing parameter to function call

Added missing parameter to function call

Fix #COG-337

* refactor: Move storing of data from async to sync function

Moved data storing from async to sync

Refactor #COG-337

* refactor: Pretend ingest_data was changes instead of having two tasks

Refactor so ingest_data file was modified instead of having two ingest tasks

Refactor #COG-337

* refactor: Use old name for data ingestion with metadata

Merged new and old data ingestion tasks into one

Refactor #COG-337

* refactor: Return ingest_data and save_data_to_storage Tasks

Returned ingest_data and save_data_to_storage tasks

Refactor #COG-337

* refactor: Return previous ingestion Tasks to add function

Returned previous ignestion tasks to add function

Refactor #COG-337

* fix: Remove dict and use string for search query

Remove dictionary and use string for query in notebook and simple example

Fix COG-337

* refactor: Add changes request in pull request

Added the following changes that were requested in pull request:

Added synchronize label,
Made uniform syntax in if statement in workflow,
fixed instructor dependency,
added llama-index to be optional

Refactor COG-337

* fix: Resolve issue with llama-index being mandatory

Resolve issue with llama-index being mandatory to run cognee

Fix COG-337

* fix: Add install of llama-index to notebook

Removed additional references to llama-index from core cognee lib.
Added llama-index-core install from notebook

Fix COG-337

---------
2024-11-17 11:47:08 +01:00