Fix double decoration in azure_openai_embed and document decorator usage

• Remove redundant @retry decorator
• Call openai_embed.func directly
• Add detailed decorator documentation
• Prevent double parameter injection
• Fix EmbeddingFunc wrapping issues
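The double-injection problem named above can be illustrated with a minimal sketch. It does not use LightRAG's real EmbeddingFunc. `SketchEmbeddingFunc`, `sketch_wrap`, `base_embed`, and both wrapper functions are hypothetical names, and the injection behavior (always overwriting `embedding_dim`) is an assumption made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class SketchEmbeddingFunc:
    """Hypothetical stand-in for an EmbeddingFunc-style wrapper (not LightRAG's class)."""

    embedding_dim: int
    func: Callable[..., Awaitable[list]]

    async def __call__(self, *args, **kwargs):
        # Assumed behavior: inject this wrapper's own dimension,
        # overwriting any value the caller supplied.
        kwargs["embedding_dim"] = self.embedding_dim
        return await self.func(*args, **kwargs)


def sketch_wrap(embedding_dim: int):
    """Hypothetical decorator factory that returns a SketchEmbeddingFunc."""

    def decorator(func):
        return SketchEmbeddingFunc(embedding_dim=embedding_dim, func=func)

    return decorator


@sketch_wrap(embedding_dim=4)
async def base_embed(texts, embedding_dim=None):
    # Toy embedding: one zero vector of the requested size per input text.
    return [[0.0] * embedding_dim for _ in texts]


@sketch_wrap(embedding_dim=8)
async def wrapper_via_instance(texts, embedding_dim=None):
    # Goes through base_embed.__call__, so the inner wrapper injects again.
    return await base_embed(texts, embedding_dim=embedding_dim)


@sketch_wrap(embedding_dim=8)
async def wrapper_via_func(texts, embedding_dim=None):
    # Bypasses the inner wrapper; only the outer injection applies.
    return await base_embed.func(texts, embedding_dim=embedding_dim)


async def main():
    via_instance = await wrapper_via_instance(["a"])
    via_func = await wrapper_via_func(["a"])
    return len(via_instance[0]), len(via_func[0])


dims = asyncio.run(main())
print(dims)  # (4, 8)
```

In this sketch the wrapper that calls the decorated instance silently loses its own dimension (8) to the inner wrapper's value (4), while the wrapper that calls `.func` keeps it.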
2025-11-21 18:03:53 +08:00
commit 0c4cba3860
parent b46c152306
2 changed files with 102 additions and 10 deletions
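As a rough illustration of why stacking a second retry layer on a function that already retries is redundant, the sketch below counts calls under nested retries. `retry_n` is a hypothetical hand-written decorator, not tenacity's `@retry`, and it omits backoff waits. How the real tenacity configuration behaves when nested depends on options not shown in this commit, so this models only the general nesting effect.

```python
import asyncio
import functools


def retry_n(attempts: int):
    """Hypothetical minimal retry decorator (stand-in for a tenacity-style @retry)."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(attempts):
                try:
                    return await func(*args, **kwargs)
                except ConnectionError as exc:
                    last_error = exc
            # All attempts failed: surface the last error to the caller.
            raise last_error

        return wrapper

    return decorator


calls = {"single": 0, "nested": 0}


@retry_n(3)
async def inner_single():
    calls["single"] += 1
    raise ConnectionError("simulated outage")


@retry_n(3)
async def inner_nested():
    calls["nested"] += 1
    raise ConnectionError("simulated outage")


@retry_n(3)
async def outer_nested():
    # The outer retry re-runs the inner function, which retries on its own.
    return await inner_nested()


async def main():
    for target in (inner_single, outer_nested):
        try:
            await target()
        except ConnectionError:
            pass
    return calls


result = asyncio.run(main())
print(result)  # {'single': 3, 'nested': 9}
```

With one layer the failing call runs 3 times; with two layers of 3 attempts each it runs 9 times, which is the kind of amplification a single retry layer avoids.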
--- a/lightrag/llm/openai.py
+++ b/lightrag/llm/openai.py
@@ -815,13 +815,6 @@ async def azure_openai_complete(
@wrap_embedding_func_with_attrs(embedding_dim=1536)
-@retry(
-    stop=stop_after_attempt(3),
-    wait=wait_exponential(multiplier=1, min=4, max=10),
-    retry=retry_if_exception_type(
-        (RateLimitError, APIConnectionError, APITimeoutError)
-    ),
-)
 async def azure_openai_embed(
    texts: list[str],
    model: str | None = None,
@@ -833,6 +826,35 @@ async def azure_openai_embed(
    This function provides backward compatibility by wrapping the unified
    openai_embed implementation with Azure-specific parameter handling.
    IMPORTANT - Decorator Usage:
    1. This function is decorated with @wrap_embedding_func_with_attrs to provide
       the EmbeddingFunc interface for users who need to access embedding_dim
       and other attributes.
    2. This function does NOT use @retry decorator to avoid double-wrapping,
       since the underlying openai_embed.func already has retry logic.
    3. This function calls openai_embed.func (the unwrapped function) instead of
       openai_embed (the EmbeddingFunc instance) to avoid double decoration issues:
       ✅ Correct: await openai_embed.func(...)  # Calls unwrapped function with retry
       ❌ Wrong:   await openai_embed(...)       # Would cause double EmbeddingFunc wrapping
    Double decoration causes:
    - Double injection of embedding_dim parameter
    - Incorrect parameter passing to the underlying implementation
    - Runtime errors due to parameter conflicts
    The call chain with correct implementation:
    azure_openai_embed(texts)
    → EmbeddingFunc.__call__(texts)              # azure's decorator
      → azure_openai_embed_impl(texts, embedding_dim=1536)
        → openai_embed.func(texts, ...)
          → @retry_wrapper(texts, ...)           # openai's retry (only one layer)
            → openai_embed_impl(texts, ...)
              → actual embedding computation
    """
    # Handle Azure-specific environment variables and parameters
    deployment = (
@@ -856,8 +878,9 @@ async def azure_openai_embed(
        or os.getenv("OPENAI_API_VERSION")
    )
-    # Call the unified implementation with Azure-specific parameters
-    return await openai_embed(
+    # CRITICAL: Call openai_embed.func (unwrapped) to avoid double decoration
+    # openai_embed is an EmbeddingFunc instance, .func accesses the underlying function
+    return await openai_embed.func(
        texts=texts,
        model=model or deployment,
        base_url=base_url,
--- a/lightrag/utils.py
+++ b/lightrag/utils.py
@@ -1005,7 +1005,76 @@ def priority_limit_async_func_call(
 def wrap_embedding_func_with_attrs(**kwargs):
-    """Wrap a function with attributes"""
+    """Decorator to add embedding dimension and token limit attributes to embedding functions.
    This decorator wraps an async embedding function and returns an EmbeddingFunc instance
    that automatically handles dimension parameter injection and attribute management.
    WARNING: DO NOT apply this decorator to a wrapper function that calls another
    decorated embedding function through its EmbeddingFunc instance. This causes
    double decoration and parameter injection conflicts; call .func instead.
    Correct usage patterns:
    1. Direct implementation (decorated):
        ```python
        @wrap_embedding_func_with_attrs(embedding_dim=1536)
        async def my_embed(texts, embedding_dim=None):
            # Direct implementation
            return embeddings
        ```
    2. Wrapper calling decorated function (DO NOT decorate wrapper):
        ```python
        # my_embed is already decorated above
        async def my_wrapper(texts, **kwargs):  # ❌ DO NOT decorate this!
            # Must call .func to access unwrapped implementation
            return await my_embed.func(texts, **kwargs)
        ```
    3. Wrapper calling decorated function (properly decorated):
        ```python
        @wrap_embedding_func_with_attrs(embedding_dim=1536)
        async def my_wrapper(texts, **kwargs):  # ✅ Can decorate if calling .func
            # Calling .func avoids double decoration
            return await my_embed.func(texts, **kwargs)
        ```
    The decorated function becomes an EmbeddingFunc instance with:
    - embedding_dim: The embedding dimension
    - max_token_size: Maximum token limit (optional)
    - func: The original unwrapped function (access via .func)
    - __call__: Wrapper that injects embedding_dim parameter
    Double decoration causes:
    - Double injection of embedding_dim parameter
    - Incorrect parameter passing to the underlying implementation
    - Runtime errors due to parameter conflicts
    Args:
        embedding_dim: The dimension of embedding vectors
        max_token_size: Maximum number of tokens (optional)
        send_dimensions: Whether to inject embedding_dim as a keyword argument (optional)
    Returns:
        A decorator that wraps the function as an EmbeddingFunc instance
    Example of correct wrapper implementation:
        ```python
        @wrap_embedding_func_with_attrs(embedding_dim=1536, max_token_size=8192)
        @retry(...)
        async def openai_embed(texts, ...):
            # Base implementation
            pass
        @wrap_embedding_func_with_attrs(embedding_dim=1536)  # Note: No @retry here!
        async def azure_openai_embed(texts, ...):
            # CRITICAL: Call .func to access unwrapped function
            return await openai_embed.func(texts, ...)  # ✅ Correct
            # return await openai_embed(texts, ...)     # ❌ Wrong - double decoration!
        ```
    """
    def final_decro(func) -> EmbeddingFunc:
        new_func = EmbeddingFunc(**kwargs, func=func)