diff --git a/README.md b/README.md
index 4deeaccf..16f36d7f 100644
--- a/README.md
+++ b/README.md
@@ -523,6 +523,8 @@ reranker, leveraging Gemini's log probabilities feature to rank passage relevanc
 
 Graphiti supports Ollama for running local LLMs and embedding models via Ollama's OpenAI-compatible API. This is ideal for privacy-focused applications or when you want to avoid API costs.
 
+**Note:** Use `OpenAIGenericClient` (not `OpenAIClient`) for Ollama and other OpenAI-compatible providers like LM Studio. The `OpenAIGenericClient` is optimized for local models with a higher default max token limit (16K vs 8K) and full support for structured outputs.
+
 Install the models:
 
 ```bash
diff --git a/graphiti_core/llm_client/openai_generic_client.py b/graphiti_core/llm_client/openai_generic_client.py
index 5493c55a..50ad68a3 100644
--- a/graphiti_core/llm_client/openai_generic_client.py
+++ b/graphiti_core/llm_client/openai_generic_client.py
@@ -77,6 +77,10 @@ class OpenAIGenericClient(LLMClient):
         if config is None:
             config = LLMConfig()
 
+        # Override max_tokens default to 16K for better compatibility with local models
+        if config.max_tokens == DEFAULT_MAX_TOKENS:
+            config.max_tokens = 16384
+
         super().__init__(config, cache)
 
         if client is None: