LightRAG/lightrag/llm

Latest commit: b7eae4d7c0 by Arjun Rao (2025-05-08 11:42:53 +10:00): Use the context manager for the openai client.
This avoids resource-cleanup issues (too many open files) when making massively parallel calls to the OpenAI API, since RAII-style cleanup in Python is highly unreliable in such contexts.
__init__.py Separated llms from the main llm.py file and fixed some deprecation bugs 2025-01-25 00:11:00 +01:00
anthropic.py Update webui assets 2025-03-22 00:36:38 +08:00
azure_openai.py Resolve the issue with making API calls to Azure OpenAI service 2025-03-11 11:57:41 +08:00
bedrock.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
hf.py fix hf_embed torch device use MPS or CPU when CUDA is not available -macos users 2025-03-20 09:40:56 +01:00
jina.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
llama_index_impl.py Moved back to llm dir as per 2025-02-20 10:23:01 +01:00
lmdeploy.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
lollms.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
nvidia_openai.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
ollama.py Fix ollama embedding func return data type bugs 2025-04-21 00:01:25 +08:00
openai.py Use the context manager for the openai client 2025-05-08 11:42:53 +10:00
Readme.md Update LlamaIndex README: improve documentation and example paths 2025-02-20 10:33:15 +01:00
siliconcloud.py clean comments and unused libs 2025-02-18 21:12:06 +01:00
zhipu.py clean comments and unused libs 2025-02-18 21:12:06 +01:00

  1. LlamaIndex (llm/llama_index_impl.py):
    • Provides integration with OpenAI and other providers through LlamaIndex
    • Supports both direct API access and proxy services like LiteLLM
    • Handles embeddings and completions with consistent interfaces
    • See the example implementations below:
Using LlamaIndex

LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage, or any provider routed through a LiteLLM proxy.

Setup

First, install the required dependencies. For direct OpenAI usage:

pip install llama-index-llms-openai llama-index-embeddings-openai

For the LiteLLM proxy:

pip install llama-index-llms-litellm llama-index-embeddings-litellm

Standard OpenAI Usage

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.utils import EmbeddingFunc, logger

# Initialize with direct OpenAI access
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize OpenAI if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = OpenAI(
                model="gpt-4",
                api_key="your-openai-key",
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with OpenAI
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=1536,
        max_token_size=8192,
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=OpenAIEmbedding(
                model="text-embedding-3-large",
                api_key="your-openai-key"
            )
        ),
    ),
)
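
Once the RAG instance is configured, it can be used for indexing and querying. A minimal usage sketch, assuming the synchronous insert/query helpers and QueryParam exported by lightrag (depending on the LightRAG version, storages may need to be initialized before the first insert):

from lightrag import QueryParam

# Index a document, then run a retrieval-augmented query against it
rag.insert("Your source text goes here.")

response = rag.query(
    "What does the source text say?",
    param=QueryParam(mode="hybrid"),  # other modes include "naive", "local", "global"
)
print(response)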

Using LiteLLM Proxy

Routing requests through a LiteLLM proxy lets you:

  1. Use any LLM provider through LiteLLM
  2. Leverage LlamaIndex's embedding and completion capabilities
  3. Maintain consistent configuration across services

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
from lightrag.utils import EmbeddingFunc, logger

# Initialize with LiteLLM proxy; `settings` is your own config object (see the sketch under Environment Variables)
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize LiteLLM if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = LiteLLM(
                model=f"openai/{settings.LLM_MODEL}",  # Format: "provider/model_name"
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with LiteLLM
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=1536,
        max_token_size=8192,
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=LiteLLMEmbedding(
                model_name=f"openai/{settings.EMBEDDING_MODEL}",
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
            )
        ),
    ),
)

Environment Variables

For OpenAI direct usage:

OPENAI_API_KEY=your-openai-key

For LiteLLM proxy:

# LiteLLM Configuration
LITELLM_URL=http://litellm:4000
LITELLM_KEY=your-litellm-key

# Model Configuration
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_MAX_TOKEN_SIZE=8192
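
The settings object referenced in the LiteLLM example is not provided by LightRAG; it only needs to expose the values above. A minimal, illustrative sketch that reads them from the environment (the Settings class name and defaults are assumptions, not part of LightRAG):

import os

class Settings:
    """Illustrative configuration holder for the LiteLLM example above."""
    LITELLM_URL = os.getenv("LITELLM_URL", "http://litellm:4000")
    LITELLM_KEY = os.getenv("LITELLM_KEY", "")
    LLM_MODEL = os.getenv("LLM_MODEL", "gpt-4")
    EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "text-embedding-3-large")
    EMBEDDING_MAX_TOKEN_SIZE = int(os.getenv("EMBEDDING_MAX_TOKEN_SIZE", "8192"))

settings = Settings()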

Key Differences

  1. Direct OpenAI:

    • Simpler setup
    • Direct API access
    • Requires OpenAI API key
  2. LiteLLM Proxy:

    • Model provider agnostic
    • Centralized API key management
    • Support for multiple providers
    • Better cost control and monitoring
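
Because both setups feed the same llm_model_func shape, the choice between them can be made at runtime from configuration. An illustrative sketch (not part of LightRAG) that uses the LiteLLM proxy when LITELLM_URL is set and falls back to direct OpenAI otherwise:

import os

from llama_index.llms.litellm import LiteLLM
from llama_index.llms.openai import OpenAI

def build_llm_instance():
    # Prefer the LiteLLM proxy when it is configured; otherwise call OpenAI directly
    litellm_url = os.getenv("LITELLM_URL")
    if litellm_url:
        return LiteLLM(
            model=f"openai/{os.getenv('LLM_MODEL', 'gpt-4')}",
            api_base=litellm_url,
            api_key=os.getenv("LITELLM_KEY"),
            temperature=0.7,
        )
    return OpenAI(
        model=os.getenv("LLM_MODEL", "gpt-4"),
        api_key=os.getenv("OPENAI_API_KEY"),
        temperature=0.7,
    )

# Pass the result through kwargs so llm_model_func reuses it:
# response = await llm_model_func(prompt, llm_instance=build_llm_instance())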