
  1. LlamaIndex (llm/llama_index_impl.py):
    • Provides integration with OpenAI and other providers through LlamaIndex
    • Supports both direct API access and proxy services like LiteLLM
    • Handles embeddings and completions with consistent interfaces
    • See the example implementations below:
## Using LlamaIndex

LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage or through a LiteLLM proxy.

### Setup

First, install the required dependencies:

```bash
pip install llama-index-llms-litellm llama-index-embeddings-litellm
pip install llama-index-llms-openai llama-index-embeddings-openai  # for direct OpenAI usage
```

### Standard OpenAI Usage

```python
from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.utils import EmbeddingFunc, logger

# Initialize with direct OpenAI access
async def llm_model_func(prompt, system_prompt=None, history_messages=None, **kwargs):
    try:
        # Create the OpenAI client once and reuse it via kwargs
        if "llm_instance" not in kwargs:
            kwargs["llm_instance"] = OpenAI(
                model="gpt-4",
                api_key="your-openai-key",
                temperature=0.7,
            )

        response = await llama_index_complete_if_cache(
            kwargs["llm_instance"],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages or [],
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {e}")
        raise

# Initialize LightRAG with OpenAI
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # text-embedding-3-large produces 3072-dim vectors
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=OpenAIEmbedding(
                model="text-embedding-3-large",
                api_key="your-openai-key",
            ),
        ),
    ),
)
```
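The `llm_instance` pattern above builds the client lazily on the first call and reuses it on subsequent calls via `kwargs`. A minimal, framework-free sketch of that caching idea (`FakeClient` and `complete` are illustrative stand-ins, not LightRAG or LlamaIndex APIs):

```python
class FakeClient:
    """Stand-in for an expensive LLM client; counts how often it is built."""
    instances_created = 0

    def __init__(self, model):
        self.model = model
        FakeClient.instances_created += 1

    def complete(self, prompt):
        return f"{self.model}: {prompt}"


def llm_model_func(prompt, **kwargs):
    # Build the client only once; later calls reuse the cached instance
    if "llm_instance" not in kwargs:
        kwargs["llm_instance"] = FakeClient(model="gpt-4")
    return kwargs["llm_instance"].complete(prompt), kwargs


# First call creates the client; passing the returned kwargs back reuses it
reply, state = llm_model_func("hello")
reply2, _ = llm_model_func("again", **state)
print(FakeClient.instances_created)  # 1
```

This avoids reconstructing the client (and re-reading credentials) on every completion request.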

### Using LiteLLM Proxy

With a LiteLLM proxy you can:

  1. Use any LLM provider through LiteLLM
  2. Leverage LlamaIndex's embedding and completion capabilities
  3. Maintain consistent configuration across services
```python
import os

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
from lightrag.utils import EmbeddingFunc, logger

# Read proxy and model settings from the environment (see Environment Variables below)
LITELLM_URL = os.environ["LITELLM_URL"]
LITELLM_KEY = os.environ["LITELLM_KEY"]
LLM_MODEL = os.environ.get("LLM_MODEL", "gpt-4")
EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL", "text-embedding-3-large")

# Initialize with LiteLLM proxy
async def llm_model_func(prompt, system_prompt=None, history_messages=None, **kwargs):
    try:
        # Create the LiteLLM client once and reuse it via kwargs
        if "llm_instance" not in kwargs:
            kwargs["llm_instance"] = LiteLLM(
                model=f"openai/{LLM_MODEL}",  # Format: "provider/model_name"
                api_base=LITELLM_URL,
                api_key=LITELLM_KEY,
                temperature=0.7,
            )

        response = await llama_index_complete_if_cache(
            kwargs["llm_instance"],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages or [],
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {e}")
        raise

# Initialize LightRAG with LiteLLM
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # text-embedding-3-large produces 3072-dim vectors
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=LiteLLMEmbedding(
                model_name=f"openai/{EMBEDDING_MODEL}",
                api_base=LITELLM_URL,
                api_key=LITELLM_KEY,
            ),
        ),
    ),
)
```
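LiteLLM routes requests by the `provider/model_name` prefix used in the snippet above. A small helper for working with such identifiers (an illustrative sketch, not a LiteLLM API):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a LiteLLM-style 'provider/model_name' identifier.

    Model names may themselves contain slashes, so only the first '/'
    separates the provider from the model.
    """
    provider, _, model = model_id.partition("/")
    if not model:
        raise ValueError(f"expected 'provider/model_name', got {model_id!r}")
    return provider, model


print(split_model_id("openai/gpt-4"))                    # ('openai', 'gpt-4')
print(split_model_id("openai/text-embedding-3-large"))   # ('openai', 'text-embedding-3-large')
```

Switching providers is then just a matter of changing the prefix (e.g. `anthropic/...`), with no change to the LightRAG wiring.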

### Environment Variables

For OpenAI direct usage:

```bash
OPENAI_API_KEY=your-openai-key
```

For LiteLLM proxy:

```bash
# LiteLLM Configuration
LITELLM_URL=http://litellm:4000
LITELLM_KEY=your-litellm-key

# Model Configuration
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-3-large
```
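One way to load this configuration in a single place is a small settings object; the variable names match those above, while the `ProxySettings` class itself is an illustrative sketch, not part of LightRAG:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxySettings:
    litellm_url: str
    litellm_key: str
    llm_model: str
    embedding_model: str

    @classmethod
    def from_env(cls) -> "ProxySettings":
        # Fail fast if the proxy variables are missing; fall back to the
        # models documented above otherwise.
        return cls(
            litellm_url=os.environ["LITELLM_URL"],
            litellm_key=os.environ["LITELLM_KEY"],
            llm_model=os.environ.get("LLM_MODEL", "gpt-4"),
            embedding_model=os.environ.get("EMBEDDING_MODEL", "text-embedding-3-large"),
        )


# Example: values would normally come from the shell or a .env file
os.environ.setdefault("LITELLM_URL", "http://litellm:4000")
os.environ.setdefault("LITELLM_KEY", "your-litellm-key")
settings = ProxySettings.from_env()
print(settings.llm_model)  # gpt-4 unless LLM_MODEL is set
```

A `KeyError` at startup for a missing `LITELLM_URL` is easier to diagnose than a connection failure deep inside a query.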

### Key Differences

  1. Direct OpenAI:
    • Simpler setup
    • Direct API access
    • Requires an OpenAI API key
  2. LiteLLM Proxy:
    • Model-provider agnostic
    • Centralized API key management
    • Support for multiple providers
    • Better cost control and monitoring