alekszievr
edae2771a5
Count the number of tokens in documents [COG-1071] (#476)
...
* Count the number of tokens in documents
* save token count to relational db
---------
Co-authored-by: Igor Ilic <30923996+dexters1@users.noreply.github.com>
2025-01-29 11:29:09 +01:00
Igor Ilic
860218632f
refactor: add suggestions from PR
...
Add suggestions made by CodeRabbit on pull request
2025-01-28 17:15:25 +01:00
Igor Ilic
a8644e0bd7
feat: Use litellm max token size as default for model, if model exists in litellm
2025-01-28 17:00:47 +01:00
Igor Ilic
710ca78d6e
Merge branch 'dev' into COG-970-refactor-tokenizing
2025-01-28 16:31:11 +01:00
alekszievr
98f0f60980
Feat: [cog-1089] Define pydantic models for descriptive graph metrics and input metrics (#466)
...
* feat: make tasks a configurable argument in the cognify function
* fix: add data points task
* Define pydantic models for descriptive graph metrics and input metrics
* remove to_json method
* Use just one MetricData class instead of two
---------
Co-authored-by: lxobr <122801072+lxobr@users.noreply.github.com>
2025-01-28 16:11:31 +01:00
Igor Ilic
6f8cbdbf1c
Merge branch 'dev' into COG-970-refactor-tokenizing
2025-01-28 15:44:57 +01:00
Igor Ilic
4e56cd64a1
refactor: Add max chunk tokens to code graph pipeline
2025-01-28 15:33:34 +01:00
Igor Ilic
dc0450d30e
test: Update document tests regarding max chunk tokens
2025-01-28 15:21:43 +01:00
Igor Ilic
41544369af
test: Change test_by_paragraph tests to accommodate the change
2025-01-28 14:47:17 +01:00
Igor Ilic
3db7f85c9c
feat: Add max_chunk_tokens value to chunkers
...
Add formula and forwarding of max_chunk_tokens value through Cognee
2025-01-28 14:32:00 +01:00
Igor Ilic
49f60971bb
Merge branch 'dev' into COG-970-refactor-tokenizing
2025-01-28 10:12:55 +01:00
Boris Arzentar
f811ab44e0
Merge remote-tracking branch 'origin/dev' into feat/COG-1060-code-pipeline-endpoints
2025-01-28 10:10:38 +01:00
Igor Ilic
0a9f1349f2
refactor: Change variable and function names based on PR comments
2025-01-28 10:10:29 +01:00
Boris Arzentar
3320bc8f2c
feat: add codegraph related API endpoints
2025-01-28 10:08:59 +01:00
Boris
8da81c1de3
Merge branch 'dev' into pgvector-add-normalization
2025-01-27 11:31:24 +01:00
Boris
0c2c5870df
fix: use low_level server for cognee mcp server (#470)
...
* fix: revert to older mcp version
* fix: use low_level server for the mcp
* fix: styling errors
* fix: mcp cognify arguments
* fix: ruff errors
2025-01-26 12:52:48 +01:00
Igor Ilic
89d4b7a5c4
Merge branch 'dev' into pgvector-add-normalization
2025-01-24 19:24:39 +01:00
Igor Ilic
23ecf245ed
fix: Return string conversion to resolve traceback
2025-01-24 19:20:55 +01:00
Igor Ilic
b0cec3fcaa
refactor: Remove conversion to string
2025-01-24 19:03:57 +01:00
Igor Ilic
ffbb387580
Merge branch 'dev' into fix-insert-data
2025-01-24 18:55:41 +01:00
Igor Ilic
77a72851fc
Merge branch 'dev' into COG-970-refactor-tokenizing
2025-01-24 18:34:50 +01:00
Igor Ilic
cdc992750a
test: Add github action to test code graph
2025-01-24 18:12:16 +01:00
Igor Ilic
902979c1de
refactor: Refactor get source code chunks based on tokenizer rework
2025-01-24 13:40:10 +01:00
Igor Ilic
844d99cb72
docs: Remove commented code
2025-01-23 18:24:26 +01:00
Igor Ilic
7dea1d54d7
refactor: Add specific max token values to embedding models
2025-01-23 18:18:45 +01:00
Igor Ilic
6d5679f9d2
Merge branch 'dev' into COG-970-refactor-tokenizing
2025-01-23 18:14:49 +01:00
Igor Ilic
1319944dcd
docs: Update .env.template to include llm and embedding options
2025-01-23 18:05:45 +01:00
Igor Ilic
b686376c54
feat: Add gemini tokenizer to cognee
2025-01-23 17:55:04 +01:00
Igor Ilic
294ed1d960
feat: Add HuggingFace Tokenizer support
2025-01-23 16:52:35 +01:00
Igor Ilic
2e1a48e22c
docs: Add usage example of function
2025-01-23 15:13:46 +01:00
Igor Ilic
de19016494
fix: Add flag to allow SQLite to use foreign keys
2025-01-23 15:10:27 +01:00
Igor Ilic
d4453e4a1d
fix: Add support for SQLite and PostgreSQL for inserting data in SQLAlchemyAdapter
2025-01-23 14:59:02 +01:00
Boris Arzentar
e577276d91
Merge remote-tracking branch 'origin/dev' into feat/COG-1058-fastmcp
2025-01-23 11:46:25 +01:00
Boris Arzentar
00f302c37a
feat: use fastmcp for mcp server
2025-01-23 11:45:40 +01:00
Igor Ilic
9f6a0ba783
Merge branch 'dev' into pgvector-add-normalization
2025-01-23 11:11:43 +01:00
Igor Ilic
40c0279ec5
Merge branch 'COG-793-metadata-rework' of github.com:topoteretes/cognee into COG-793-metadata-rework
2025-01-22 16:13:11 +01:00
Igor Ilic
80e67b0619
refactor: Rename foreign to external metadata
...
Rename foreign metadata to external metadata for metadata coming outside of Cognee
2025-01-22 16:07:35 +01:00
Igor Ilic
93249c72c5
fix: Initial commit to resolve issue with using tokenizer based on LLMs
...
Currently TikToken is used for tokenizing by default, but it is only supported by OpenAI.
This is an initial commit in an attempt to add Cognee tokenizing support for multiple LLMs.
2025-01-21 19:53:22 +01:00
Igor Ilic
655ab0b8cc
Merge branch 'dev' into COG-793-metadata-rework
2025-01-21 18:20:49 +01:00
Vasilije
c9536f97a5
Merge pull request #451 from topoteretes/add_docstrings
...
chore: add docstrings and typing to cognee tasks
2025-01-21 14:07:19 +01:00
Igor Ilic
bd3a5a758c
Merge branch 'dev' into COG-793-metadata-rework
2025-01-20 18:06:21 +01:00
Igor Ilic
4196a4ce89
refactor: Update test to be up to date with current metadata refactor effort
2025-01-20 17:53:54 +01:00
Igor Ilic
5c17501bb8
refactor: add missing foreign_metadata attr to tests
2025-01-20 17:38:28 +01:00
Igor Ilic
ab8d95cc30
refactor: As neo4j can't support dictionaries, add foreign metadata as string
2025-01-20 17:28:14 +01:00
Igor Ilic
49ad292592
refactor: Reduce complexity of metadata handling
...
Have foreign metadata be a table column in data instead of its own table to reduce complexity
Refactor COG-793
2025-01-20 16:39:05 +01:00
Igor Ilic
0c7c1d7503
refactor: Refactor ingestion to only have one ingestion task
2025-01-20 14:33:47 +01:00
hajdul88
813a03c6e2
Merge branch 'dev' into pgvector-add-normalization
2025-01-20 13:46:50 +01:00
Igor Ilic
2546844787
feat: Add normalization to PGVector search
...
Add normalization to PGVector search results
2025-01-20 13:42:39 +01:00
hajdul88
bf70705ed0
Fix: fixes networkx failed to load graph from file error
2025-01-20 12:19:34 +01:00
Igor Ilic
e7f24548dd
Merge branch 'dev' into add_docstrings
2025-01-17 17:00:23 +01:00