LightRAG/upload_pdfs.py
clssck da9070ecf7 refactor: remove legacy storage implementations and k8s deployment
Remove deprecated storage backends and Kubernetes deployment configuration:
- Delete unused storage implementations: FAISS, JSON, Memgraph, Milvus, MongoDB, Nano Vector DB, Neo4j, NetworkX, Qdrant, Redis
- Remove Kubernetes deployment manifests and installation scripts
- Delete legacy examples for deprecated backends
- Consolidate to PostgreSQL-only storage backend
Streamline dependencies and add new capabilities:
- Remove deprecated code documentation and migration guides
- Add full-text search caching layer with FTS cache module
- Implement metrics collection and monitoring pipeline
- Add explain and metrics API routes
- Simplify configuration with PostgreSQL-focused setup
Update documentation and configuration:
- Rewrite README to focus on supported features
- Update environment and configuration examples
- Remove Kubernetes-specific documentation
- Add new utility scripts for PDF uploads and pipeline monitoring
2025-12-09 14:02:00 +01:00

#!/usr/bin/env python3
"""Upload PDFs to LightRAG server."""

from pathlib import Path

import requests

PDF_DIR = Path("documents/questions/docs/pdf")
API_URL = "http://localhost:9621/documents/upload"


def upload_pdfs():
    # Sort so the upload order is deterministic across runs.
    pdf_files = sorted(PDF_DIR.glob("*.pdf"))
    print(f"Found {len(pdf_files)} PDFs to upload")

    for i, pdf_path in enumerate(pdf_files, 1):
        print(f"[{i}/{len(pdf_files)}] Uploading: {pdf_path.name}")
        try:
            # Keep the file open for the duration of the multipart upload.
            with open(pdf_path, 'rb') as f:
                files = {'file': (pdf_path.name, f, 'application/pdf')}
                response = requests.post(API_URL, files=files, timeout=120)
                result = response.json()
                # Coerce to str so a null or non-string message cannot break the slice.
                message = str(result.get('message') or 'No message')[:80]
                print(f"  -> {result.get('status', 'unknown')}: {message}")
        except Exception as e:
            # Report the failure and continue with the remaining files.
            print(f"  -> ERROR: {e}")

    print("\nDone! Checking processing status...")


if __name__ == "__main__":
    upload_pdfs()
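
The script ends by announcing a status check but does not perform one. The following is a minimal sketch of what such a check could look like, not part of the committed file. It assumes the server exposes a JSON status endpoint at the `STATUS_URL` shown and that the response carries a boolean `busy` field; both are assumptions to verify against the running server. The names `fetch_status` and `wait_until_idle` are hypothetical, and the sketch uses the standard library rather than `requests` so it has no third-party dependency.

```python
import json
import time
import urllib.request

# Assumption: the status endpoint path and the "busy" field are illustrative;
# neither is confirmed by the upload script itself.
STATUS_URL = "http://localhost:9621/documents/pipeline_status"


def fetch_status(url=STATUS_URL):
    """GET the status endpoint and return the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def wait_until_idle(fetch=fetch_status, interval=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the (assumed) "busy" flag is false and return the last status.

    Raises TimeoutError if the flag is still true after max_polls polls.
    """
    for _ in range(max_polls):
        status = fetch()
        if not status.get("busy", False):
            return status
        sleep(interval)
    raise TimeoutError(f"pipeline still busy after {max_polls} polls")
```

Calling `wait_until_idle()` after `upload_pdfs()` would block until the server reports that it is idle. The `fetch` and `sleep` parameters are injectable so the polling logic can be exercised without a running server.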