Commit graph

5253 commits

Author SHA1 Message Date
GGrassia
5074b4e8ad fix (tracker_): removed passing as named arg 2025-11-28 15:54:38 +01:00
GGrassia
d971e372c9 fix (token_tracker): prevented double token tracker 2025-11-28 15:49:04 +01:00
GGrassia
49ce064a11 fix (embedding): fixed query endpoint 2025-11-28 15:38:33 +01:00
GGrassia
3a2d3ddb9f feat (token_tracking): added token tracking to both query and insert endpoints, and consequently to the pipeline 2025-11-26 17:00:04 +01:00
GGrassia
cd664de057 docs (metadata): Added Metadata_Filtering.md with examples and explanations for the functionality 2025-10-31 10:35:54 +01:00
GGrassia
166bdf7f99 fix (document_queue): fixed silent fail when requeueing 2025-10-30 15:15:34 +01:00
GGrassia
bdb1ae0786 feat (metadata postgres): added logic for IN clauses on operands 2025-10-10 11:05:17 +02:00
GGrassia
bb4d8181d5 Merge remote-tracking branch 'upstream/main' 2025-10-09 12:33:11 +02:00
GGrassia
a57d4ec0cc perf (metadata): optimized metadata query with gin indexes for postgres 2025-10-09 10:56:34 +02:00
yangdx
577b9e6882 Add project intelligence files for AI agent collaboration
- Add .clinerules with technical patterns
- Create Agments.md for Codex agent guidance
- Ensures consistent behavior across all team members
2025-10-09 16:35:38 +08:00
Daniel.y
0f15fdc3e2 Merge pull request #2181 from yrangana/feat/openai-embedding-token-tracking
feat: Add token tracking support to openai_embed function
2025-10-09 12:15:29 +08:00
Yasiru Rangana
ae9f4ae73f fix: Remove trailing whitespace for pre-commit linting 2025-10-09 15:01:53 +11:00
GGrassia
f4c2823c82 fix (postgres query): fixed metadata filtering for postgres pg queries 2025-10-08 18:23:01 +02:00
GGrassia
177ec23821 fix (metadata): added metadata as named parameter 2025-10-08 16:42:32 +02:00
GGrassia
c35f74b44c fix (operate): commented duplicate function 2025-10-08 16:33:03 +02:00
Yasiru Rangana
ec40b17eea feat: Add token tracking support to openai_embed function
- Add optional token_tracker parameter to openai_embed()
- Track prompt_tokens and total_tokens for embedding API calls
- Enables monitoring of embedding token usage alongside LLM calls
- Maintains backward compatibility with existing code
2025-10-08 14:36:08 +11:00
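The pattern described in the commit above (an optional tracker argument that records `prompt_tokens` and `total_tokens` without changing behavior for existing callers) might look roughly like the following. This is a minimal sketch, not the project's code: `TokenTracker`, `add_usage`, `fake_embedding_call`, and `embed` are hypothetical names, and the API response is simulated so the example runs offline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenTracker:
    """Hypothetical accumulator for token usage; not the project's actual class."""
    prompt_tokens: int = 0
    total_tokens: int = 0
    calls: int = 0

    def add_usage(self, usage: dict[str, int]) -> None:
        # Missing keys count as zero so partial usage blocks do not raise.
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)
        self.calls += 1


def fake_embedding_call(texts: list[str]) -> dict:
    """Stand-in for an embeddings API response: one vector per text plus a usage block."""
    tokens = sum(len(t.split()) for t in texts)  # crude whitespace token count
    return {
        "data": [[0.0, 0.0, 0.0] for _ in texts],
        "usage": {"prompt_tokens": tokens, "total_tokens": tokens},
    }


def embed(texts: list[str], token_tracker: Optional[TokenTracker] = None) -> list[list[float]]:
    """Return embeddings; record usage only when a tracker is supplied."""
    response = fake_embedding_call(texts)
    if token_tracker is not None:
        token_tracker.add_usage(response["usage"])
    return response["data"]
```

Because `token_tracker` defaults to `None`, callers that never pass it see no change, which is the backward-compatibility property the commit message mentions.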
yangdx
f1e0110716 Merge branch 'kevinnkansah/main' 2025-10-07 23:04:59 +08:00
yangdx
f2c0b41e78 Make PostgreSQL statement_cache_size configuration optional
• Remove forced int conversion
• Allow None values for cache size
• Add conditional parameter setting
2025-10-07 22:57:21 +08:00
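The three changes listed in the commit above can be sketched as follows. This is an illustration under assumptions, not the repository's implementation: the environment variable name `PG_STATEMENT_CACHE_SIZE` and both helper functions are hypothetical. The resulting dictionary is meant to be passed as keyword arguments to `asyncpg.create_pool()`, which forwards `statement_cache_size` to the underlying connections.

```python
import os
from typing import Any, Mapping, Optional


def read_statement_cache_size(env: Mapping[str, str]) -> Optional[int]:
    """Return the configured cache size, or None when the variable is unset or blank."""
    raw = env.get("PG_STATEMENT_CACHE_SIZE", "").strip()  # hypothetical variable name
    return int(raw) if raw else None


def build_pool_kwargs(dsn: str, env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Build keyword arguments intended for asyncpg.create_pool()."""
    kwargs: dict[str, Any] = {"dsn": dsn}
    cache_size = read_statement_cache_size(env)
    if cache_size is not None:
        # Pass the option only when the user set it; a value of 0 disables
        # prepared-statement caching, otherwise asyncpg's default applies.
        kwargs["statement_cache_size"] = cache_size
    return kwargs
```

Leaving the key out entirely, rather than passing `None`, lets the driver keep its own default when nothing is configured.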
Daniel.y
ea5e390bb4 Merge pull request #2178 from aleksvujic/patch-1
Fixed typo in log message when creating new graph file
2025-10-07 21:54:02 +08:00
Aleks Vujić
dd8f44e621 Fixed typo in log message when creating new graph file 2025-10-07 14:30:05 +02:00
GGrassia
4535af4e2a Merge remote-tracking branch 'upstream/main' 2025-10-07 12:23:40 +02:00
GGrassia
b5cc842708 feat (metadata) WIP: added metadata GIN index and modified sql queries for metadata filtering to optimize speed 2025-10-07 12:14:48 +02:00
GGrassia
e38387964f feat (metadata): added IN clause management 2025-10-07 11:24:59 +02:00
kevinnkansah
fdcb034da0 chore: distinguish settings 2025-10-06 12:01:40 +02:00
kevinnkansah
22a7b482c5 fix: renamed PostgreSQL options env variable and allowed LRU cache to be an optional env variable 2025-10-06 11:56:09 +02:00
kevinnkansah
d8a9617c0e fix: asyncpg bouncer connection pool error
Prepared statement caching is disabled by setting
`statement_cache_size=0` in the `asyncpg` connection pool parameters.
This is necessary to prevent
`asyncpg.exceptions.InvalidSQLStatementNameError` when using
transaction-level connection poolers like Supabase Supavisor or
pgbouncer, which do not support prepared statements.
2025-10-06 00:36:25 +02:00
kevinnkansah
108cdbe133 feat: add options for Postgres connection 2025-10-05 23:29:04 +02:00
yangdx
6190fa8985 Fix linting 2025-10-06 04:57:11 +08:00
yangdx
91387628ff Add test script for aquery_data endpoint validation 2025-10-06 03:59:50 +08:00
yangdx
4fe41f76f2 Merge branch 'doc-name-in-full-docs' 2025-10-05 14:03:02 +08:00
yangdx
d473f635d8 Update webui assets 2025-10-05 14:02:18 +08:00
yangdx
a31192dd5a Update i18n file for pipeline UI text across locales 2025-10-05 14:01:22 +08:00
yangdx
aac787bafb Clarify chunk tracking log message in _build_llm_context 2025-10-05 13:33:55 +08:00
Daniel.y
1b274706d8 Merge pull request #2171 from danielaskdd/doc-name-in-full-docs
Fix: Add file_path field to full_docs storage
2025-10-05 13:03:14 +08:00
yangdx
457d51952e Add doc_name field to full docs storage
- Store file_path in full_docs storage
- Update PostgreSQL implementation by mapping file_path to doc_name
- Other storage implementations automatically handle the new field
2025-10-05 11:44:27 +08:00
yangdx
d550f1c58c Fix linting 2025-10-05 10:42:15 +08:00
yangdx
1574fec7f0 Update webui assets 2025-10-05 10:41:46 +08:00
yangdx
0aef6a16b8 Add theme-aware edge highlighting colors for graph control 2025-10-05 10:40:25 +08:00
Daniel.y
dad90d25be Merge pull request #2170 from danielaskdd/tooltips-optimize
Refactor(webui): Improve document tooltip display with track ID and better formatting
2025-10-05 10:32:24 +08:00
yangdx
0c1cb7b731 Improve document tooltip display with track ID and better formatting
• Add track ID to tooltip display
• Remove JSON braces from metadata
• Reorder tooltip content layout
• Clean up metadata indentation
• Show track ID before metadata
2025-10-05 10:13:11 +08:00
yangdx
b5f8376756 Update webui assets 2025-10-05 09:27:38 +08:00
yangdx
dde728a32f Bump core version to 1.5.0 and API to 0236 2025-10-05 09:25:57 +08:00
yangdx
0d694962ff Merge branch 'feat/retry-failed-documents-upstream' 2025-10-05 09:24:40 +08:00
yangdx
7b1f8e0f6f Update scan tooltip to clarify it also reprocesses failed documents 2025-10-05 09:23:56 +08:00
yangdx
bf6ca9dd97 Add retry failed button translations and standardize button text
- Add missing AR/FR/TW translations
- Shorten EN/ZH button text to "Retry"
2025-10-05 09:20:33 +08:00
Jon
cf2a024e37 feat: Add endpoint and UI to retry failed documents
Add a new `/documents/reprocess_failed` API endpoint and corresponding
UI button to retry processing of failed and pending documents. This
addresses a common recovery scenario when document processing fails due
to server crashes, network errors, or LLM service outages.

Backend changes:
- Add ReprocessResponse model with status, message, and track_id fields
- Add POST /documents/reprocess_failed endpoint that triggers background
  reprocessing of FAILED, PENDING, and interrupted PROCESSING documents
- Reuses existing apipeline_process_enqueue_documents for consistency
- Includes comprehensive docstring and logging for observability

Frontend changes:
- Add TypeScript types and API function for the new endpoint
- Add retry handler with intelligent polling (fast refresh → normal)
- Add "Retry Failed" button in Documents page toolbar
- Button disabled when pipeline is busy to prevent duplicate operations
- Complete i18n support (English and Chinese translations)

This feature provides a convenient way to recover from processing
failures without requiring a full filesystem rescan.
2025-10-04 16:46:29 -04:00
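The backend selection logic described in the commit above could be sketched as below. Only the `ReprocessResponse` field names (`status`, `message`, `track_id`) and the three retryable statuses come from the commit message. Everything else is assumed for illustration: the `DocStatus` enum, the function names, the status string, and the track ID format. The sketch also treats every PROCESSING document as interrupted and does not start a real background job.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class DocStatus(str, Enum):
    """Hypothetical document states used only for this sketch."""
    PENDING = "pending"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


# Statuses the commit message says are picked up again on retry.
RETRYABLE = frozenset({DocStatus.FAILED, DocStatus.PENDING, DocStatus.PROCESSING})


@dataclass
class ReprocessResponse:
    status: str
    message: str
    track_id: str


def select_retryable(docs: dict[str, DocStatus]) -> list[str]:
    """Return the ids of documents eligible for reprocessing, in sorted order."""
    return sorted(doc_id for doc_id, status in docs.items() if status in RETRYABLE)


def reprocess_failed(docs: dict[str, DocStatus]) -> tuple[ReprocessResponse, list[str]]:
    """Pick retryable documents and describe the job that would be queued."""
    doc_ids = select_retryable(docs)
    track_id = f"retry_{uuid.uuid4().hex[:8]}"  # hypothetical id format
    response = ReprocessResponse(
        status="reprocessing_started",  # hypothetical status string
        message=f"{len(doc_ids)} document(s) queued for reprocessing",
        track_id=track_id,
    )
    return response, doc_ids
```

Returning a `track_id` right away lets a client poll for progress while the work itself runs in the background, which matches the polling behavior the frontend changes describe.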
yangdx
b9c37bd937 Fix linting 2025-10-03 02:10:02 +08:00
yangdx
112349ed5b Modernize type hints and remove Python 3.8 compatibility code
• Use collections.abc.AsyncIterator only
• Remove sys.version_info checks
• Use union syntax for None types
• Simplify string emptiness checks
2025-10-02 23:15:42 +08:00
yangdx
cec784f60e Update webui assets 2025-10-02 22:02:42 +08:00
yangdx
181525ffc2 Merge branch 'main' into zl7261/main 2025-10-02 22:01:16 +08:00