Files in lightrag/llm (last commit message and date in parentheses):

  • __init__.py: Separated llms from the main llm.py file and fixed some deprecation bugs (2025-01-25)
  • anthropic.py: Update webui assets (2025-03-22)
  • azure_openai.py: fix Azure deployment (2025-07-17)
  • bedrock.py: Eliminate tenacity from dynamic import (2025-05-14)
  • binding_options.py: options needs to be passed to ollama client embed() method (2025-07-28)
  • hf.py: Eliminate tenacity from dynamic import (2025-05-14)
  • jina.py: Fix linting (2025-07-24)
  • llama_index_impl.py: Fix linting (2025-05-22)
  • lmdeploy.py: Eliminate tenacity from dynamic import (2025-05-14)
  • lollms.py: Remove tenacity from dynamic import (2025-05-14)
  • nvidia_openai.py: clean comments and unused libs (2025-02-18)
  • ollama.py: options needs to be passed to ollama client embed() method (2025-07-28)
  • openai.py: Update openai.py (2025-07-15)
  • Readme.md: Update LlamaIndex README: improve documentation and example paths (2025-02-20)
  • siliconcloud.py: clean comments and unused libs (2025-02-18)
  • zhipu.py: clean comments and unused libs (2025-02-18)

  1. LlamaIndex (llm/llama_index_impl.py):
    • Provides integration with OpenAI and other providers through LlamaIndex
    • Supports both direct API access and proxy services such as LiteLLM
    • Handles embeddings and completions with consistent interfaces
    • See the example implementations below

Using LlamaIndex

LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage, or any provider routed through a LiteLLM proxy.

Setup

First, install the required dependencies:

pip install llama-index-llms-litellm llama-index-embeddings-litellm
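
The direct OpenAI example below also relies on the LlamaIndex OpenAI integrations, which the instructions above do not list. Assuming the standard llama-index integration packages, they can be installed with:

pip install llama-index-llms-openai llama-index-embeddings-openai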

Standard OpenAI Usage

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import (
    llama_index_complete_if_cache,
    llama_index_embed,
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.utils import EmbeddingFunc, logger

# Initialize with direct OpenAI access
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize OpenAI if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = OpenAI(
                model="gpt-4",
                api_key="your-openai-key",
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with OpenAI
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # must match the embedding model (text-embedding-3-large returns 3072-dim vectors)
        max_token_size=8192,
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=OpenAIEmbedding(
                model="text-embedding-3-large",
                api_key="your-openai-key"
            )
        ),
    ),
)
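
Once rag is constructed, documents can be inserted and queried. The snippet below is a minimal usage sketch rather than part of the original example: QueryParam and the mode names come from the lightrag package, and recent LightRAG releases may also require awaiting rag.initialize_storages() before the first insert, so check the version you are running.

from lightrag import QueryParam

# Ingest a document into the knowledge graph and vector store
rag.insert("LightRAG combines knowledge-graph and vector retrieval for RAG.")

# Query with one of the retrieval modes ("naive", "local", "global", "hybrid")
result = rag.query(
    "How does LightRAG retrieve context?",
    param=QueryParam(mode="hybrid"),
)
print(result)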

Using LiteLLM Proxy

Routing requests through a LiteLLM proxy lets you:

  1. Use any LLM provider supported by LiteLLM
  2. Leverage LlamaIndex's embedding and completion capabilities
  3. Maintain consistent configuration across services

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import (
    llama_index_complete_if_cache,
    llama_index_embed,
)
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
from lightrag.utils import EmbeddingFunc, logger

# Initialize with LiteLLM proxy.
# `settings` here is your application's configuration object; a minimal example of
# building it from environment variables is shown under "Environment Variables" below.
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize LiteLLM if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = LiteLLM(
                model=f"openai/{settings.LLM_MODEL}",  # Format: "provider/model_name"
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with LiteLLM
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # must match settings.EMBEDDING_MODEL (3072 for text-embedding-3-large)
        max_token_size=8192,
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=LiteLLMEmbedding(
                model_name=f"openai/{settings.EMBEDDING_MODEL}",
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
            )
        ),
    ),
)

Environment Variables

For OpenAI direct usage:

OPENAI_API_KEY=your-openai-key

For LiteLLM proxy:

# LiteLLM Configuration
LITELLM_URL=http://litellm:4000
LITELLM_KEY=your-litellm-key

# Model Configuration
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_MAX_TOKEN_SIZE=8192
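
The LiteLLM example above reads these values from a settings object that the snippet never defines. Below is a minimal sketch of such an object built from the environment variables listed here; the Settings class and its defaults are illustrative, and any configuration mechanism (for example pydantic-settings) works just as well.

import os
from dataclasses import dataclass

@dataclass
class Settings:
    # Illustrative config holder mirroring the names used in the LiteLLM example
    LITELLM_URL: str = os.getenv("LITELLM_URL", "http://litellm:4000")
    LITELLM_KEY: str = os.getenv("LITELLM_KEY", "")
    LLM_MODEL: str = os.getenv("LLM_MODEL", "gpt-4")
    EMBEDDING_MODEL: str = os.getenv("EMBEDDING_MODEL", "text-embedding-3-large")
    EMBEDDING_MAX_TOKEN_SIZE: int = int(os.getenv("EMBEDDING_MAX_TOKEN_SIZE", "8192"))

settings = Settings()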

Key Differences

  1. Direct OpenAI:
    • Simpler setup
    • Direct API access
    • Requires OpenAI API key
  2. LiteLLM Proxy:
    • Model provider agnostic
    • Centralized API key management
    • Support for multiple providers
    • Better cost control and monitoring