openclaw - 💡(How to fix) Fix [Feature Request] Multi-Model Support for Memory Search Embeddings with Fallback [1 comment, 2 participants]

openclaw/openclaw#55590 · Fetched 2026-04-08 01:37:36
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 1 · Timeline (top): commented ×1, subscribed ×1


Problem Statement

Currently, memorySearch only supports a single embedding model configuration without fallback options:

"memorySearch": {
  "provider": "openai",
  "model": "BAAI/bge-m3",
  "remote": {
    "apiKey": "sk-xxx",
    "baseUrl": "https://api.siliconflow.cn/v1"
  }
}

Unlike the main LLM, which supports a primary + fallbacks configuration, memory search lacks:

  • Fallback models when primary embedding service fails
  • Runtime model switching based on query complexity
  • Per-agent embedding model customization

Use Cases

1. Cost Optimization

Use a lightweight model (e.g., BAAI/bge-m3) for routine queries and fall back to a more powerful model (e.g., Qwen/Qwen2.5-7B-Instruct) for complex semantic understanding.

2. Multi-Agent Scenarios

Different agents need different embedding capabilities:

  • General agent: Short context (bge-m3, 8K)
  • Research agent: Long documents (Qwen2.5, 32K)
  • Multimodal agent: Image+text (Qwen2-VL)

3. Provider Redundancy

Switch to a backup provider when the primary embedding service is down or rate-limited.
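One way to decide when a failure should trigger a switch to the backup provider is to classify the error first. The sketch below assumes the embedding endpoint is called through Python's standard `urllib`; `should_fail_over` and `FAILOVER_STATUSES` are hypothetical names for illustration, not part of OpenClaw.

```python
import urllib.error

# HTTP statuses treated as "provider unavailable": rate limiting plus server-side errors.
# The exact set is an assumption; adjust it to what the provider actually returns.
FAILOVER_STATUSES = {429, 500, 502, 503, 504}


def should_fail_over(exc: BaseException) -> bool:
    """Return True when an error suggests the provider is down or rate-limited."""
    # HTTPError must be checked before URLError because it is a subclass of it.
    if isinstance(exc, urllib.error.HTTPError):
        return exc.code in FAILOVER_STATUSES
    # URLError covers DNS failures and refused connections; the other two cover
    # socket-level timeouts and connection resets.
    return isinstance(exc, (urllib.error.URLError, TimeoutError, ConnectionError))
```

Under this rule an authentication error such as HTTP 401 does not trigger a switch, since it usually points to a configuration problem rather than an outage.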

Proposed Solutions

Option A: Primary + Fallbacks (Preferred)

Similar to main model configuration:

"memorySearch": {
  "primary": {
    "provider": "openai",
    "model": "BAAI/bge-m3",
    "remote": { "apiKey": "${KEY1}", "baseUrl": "..." }
  },
  "fallbacks": [
    {
      "provider": "openai", 
      "model": "Qwen/Qwen2.5-7B-Instruct",
      "remote": { "apiKey": "${KEY2}", "baseUrl": "..." }
    }
  ]
}
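As a rough illustration of how Option A might be consumed, the sketch below flattens the proposed `primary`/`fallbacks` shape into an ordered list and tries each entry in turn. The names `expand_env`, `provider_chain`, `embed_with_fallback`, and the `embed` callable are hypothetical; nothing here reflects OpenClaw's actual internals.

```python
import os
import re
from typing import Callable, Sequence

_ENV_REF = re.compile(r"\$\{(\w+)\}")


def expand_env(value: str) -> str:
    """Replace ${NAME} with the environment variable NAME (empty string if unset)."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)


def provider_chain(memory_search: dict) -> list[dict]:
    """Return the primary entry followed by the fallbacks, in config order."""
    return [memory_search["primary"], *memory_search.get("fallbacks", [])]


def embed_with_fallback(
    text: str,
    chain: Sequence[dict],
    embed: Callable[[dict, str], list[float]],
) -> tuple[str, list[float]]:
    """Try each entry in order; return (model name, vector) from the first success."""
    errors = []
    for entry in chain:
        try:
            return entry["model"], embed(entry, text)
        except Exception as exc:  # assumption: any error moves on to the next entry
            errors.append(f"{entry['model']}: {exc}")
    raise RuntimeError("all embedding providers failed: " + "; ".join(errors))
```

Returning the model name alongside the vector is deliberate: vectors produced by different embedding models are generally not comparable, so the caller needs to know which model answered in order to query an index built with that same model.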

Option B: Per-Agent Configuration

"agents": {
  "list": [
    {
      "id": "main",
      "memorySearch": {
        "model": "BAAI/bge-m3"
      }
    },
    {
      "id": "research",
      "memorySearch": {
        "model": "Qwen/Qwen2.5-7B-Instruct"
      }
    }
  ]
}
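Option B leaves open how an agent's partial `memorySearch` block combines with the global one. A minimal sketch, assuming the agent block overrides a top-level `memorySearch` key by key and inherits everything it does not set (`resolve_memory_search` is a hypothetical helper, and this inheritance rule is an assumption, not something the issue specifies):

```python
def resolve_memory_search(config: dict, agent_id: str) -> dict:
    """Merge an agent's memorySearch overrides on top of the global settings."""
    merged = dict(config.get("memorySearch", {}))
    for agent in config.get("agents", {}).get("list", []):
        if agent.get("id") == agent_id:
            # Shallow merge: a nested object such as "remote" is replaced wholesale.
            merged.update(agent.get("memorySearch", {}))
            break
    return merged


config = {
    "memorySearch": {"provider": "openai", "model": "BAAI/bge-m3"},
    "agents": {
        "list": [
            {"id": "main", "memorySearch": {"model": "BAAI/bge-m3"}},
            {"id": "research", "memorySearch": {"model": "Qwen/Qwen2.5-7B-Instruct"}},
        ]
    },
}
research_settings = resolve_memory_search(config, "research")
```

With this rule the research agent keeps the global provider and changes only the model, and an unknown agent id falls back to the global settings unchanged.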

Option C: Dynamic Model Selection (Advanced)

"memorySearch": {
  "strategy": "adaptive",
  "models": [
    { "name": "bge-m3", "threshold": 4096 },
    { "name": "Qwen2.5", "threshold": 32768 }
  ]
}

Current Workaround

Manually edit openclaw.json and restart the Gateway to switch models.
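The manual edit can be scripted to reduce typos. The sketch below assumes openclaw.json is strict JSON with `memorySearch` at the top level, as in the snippets in this issue; `switch_embedding_model` is a hypothetical helper, and the Gateway still has to be restarted separately.

```python
import json
from pathlib import Path


def switch_embedding_model(config_path: str, model: str) -> str:
    """Set memorySearch.model in the config file and return the previous value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: memorySearch is a top-level key; create it if it is missing.
    section = config.setdefault("memorySearch", {})
    previous = section.get("model", "")
    section["model"] = model
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

Keeping the returned previous value makes it easy to switch back if the new model misbehaves.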

Environment

  • OpenClaw Version: 2026.03.24
  • Current Config: Using Siliconflow BAAI/bge-m3 via OpenAI-compatible API

Additional Context

With providers like Siliconflow offering multiple embedding models (bge-m3, Qwen series, etc.), users need flexibility to optimize cost vs quality trade-offs and handle different context length requirements.

extent analysis

Fix Plan

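The steps below sketch how the request could be implemented. Fallback models follow Option A (Primary + Fallbacks); runtime model switching follows Option C (Dynamic Model Selection). Both configuration shapes are proposals from the issue, so treat the keys as hypothetical until OpenClaw supports them: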

  • Update the memorySearch configuration in openclaw.json to include primary and fallbacks:
"memorySearch": {
  "primary": {
    "provider": "openai",
    "model": "BAAI/bge-m3",
    "remote": { "apiKey": "${KEY1}", "baseUrl": "..." }
  },
  "fallbacks": [
    {
      "provider": "openai", 
      "model": "Qwen/Qwen2.5-7B-Instruct",
      "remote": { "apiKey": "${KEY2}", "baseUrl": "..." }
    }
  ]
}
  • Implement a fallback mechanism in the code to switch to the fallback model when the primary model fails:
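import logging

logger = logging.getLogger(__name__)

def memory_search(query, primary_model, fallback_model):
    """Search with the primary model; retry once on the fallback if it fails."""
    try:
        return primary_model.search(query)
    except Exception as exc:  # narrow to the provider's error types in practice
        # Fall back to the secondary model and record why
        logger.warning("Primary model failed: %s", exc)
        return fallback_model.search(query)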
  • To implement runtime model switching (Option C), use query length as a proxy for complexity and give each model a threshold, the largest query length it should handle, in the memorySearch configuration:
"memorySearch": {
  "strategy": "adaptive",
  "models": [
    { "name": "bge-m3", "threshold": 4096 },
    { "name": "Qwen2.5", "threshold": 32768 }
  ]
}
  • Update the code to select the model based on the query length:
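def memory_search(query, models, clients):
    """Route the query to the first model whose threshold covers its length.

    models: the "models" list from the config, sorted by ascending threshold.
    clients: maps each model name to an object with a search(query) method.
    """
    query_length = len(query)  # characters, used as a rough proxy for tokens
    for model in models:
        if query_length <= model["threshold"]:
            return clients[model["name"]].search(query)
    # Default to the last model if query length exceeds all thresholds
    return clients[models[-1]["name"]].search(query)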

Verification

To verify the fix, run memory searches with queries of different lengths and confirm which model served each one. Then force a primary failure, for example by pointing the primary at an unreachable base URL, and check that the fallback model answers.
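These checks can be automated without calling a real provider. The following sketch uses stub objects in place of embedding clients; `StubModel`, `search_with_fallback`, and `pick_model_name` are hypothetical stand-ins for whatever the real implementation exposes.

```python
class StubModel:
    """Stand-in for an embedding client; counts calls and can be forced to fail."""

    def __init__(self, name: str, fail: bool = False):
        self.name = name
        self.fail = fail
        self.calls = 0

    def search(self, query: str) -> str:
        self.calls += 1
        if self.fail:
            raise ConnectionError(f"{self.name} unavailable")
        return f"{self.name}:{query}"


def search_with_fallback(query, primary, fallback):
    """Minimal version of the fallback logic under test."""
    try:
        return primary.search(query)
    except Exception:
        return fallback.search(query)


def pick_model_name(query: str, models: list[dict]) -> str:
    """Return the first model whose threshold covers the query length."""
    for model in models:
        if len(query) <= model["threshold"]:
            return model["name"]
    return models[-1]["name"]


def check_fallback_used_when_primary_fails() -> None:
    primary, fallback = StubModel("bge-m3", fail=True), StubModel("Qwen2.5")
    assert search_with_fallback("hello", primary, fallback) == "Qwen2.5:hello"
    assert (primary.calls, fallback.calls) == (1, 1)


def check_fallback_idle_when_primary_works() -> None:
    primary, fallback = StubModel("bge-m3"), StubModel("Qwen2.5")
    assert search_with_fallback("hello", primary, fallback) == "bge-m3:hello"
    assert fallback.calls == 0


def check_model_selected_by_query_length() -> None:
    models = [
        {"name": "bge-m3", "threshold": 4096},
        {"name": "Qwen2.5", "threshold": 32768},
    ]
    assert pick_model_name("x" * 4096, models) == "bge-m3"
    assert pick_model_name("x" * 4097, models) == "Qwen2.5"
    assert pick_model_name("x" * 40000, models) == "Qwen2.5"


if __name__ == "__main__":
    check_fallback_used_when_primary_fails()
    check_fallback_idle_when_primary_works()
    check_model_selected_by_query_length()
```

The boundary cases (exactly 4096 characters versus 4097) are the ones most likely to expose an off-by-one error in the threshold comparison.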

Extra Tips

  • Make sure to update the openclaw.json configuration file and restart the Gateway for the changes to take effect.
  • Consider implementing a more sophisticated model selection strategy, such as using a machine learning model to predict the best model for a given query.
  • Monitor the performance and cost of the different models to optimize the configuration and ensure the best trade-off between cost and quality.
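For the monitoring tip, a small wrapper can collect per-model call counts, failures, and latency. This is a sketch under the assumption that each model call can be wrapped in a zero-argument callable; `ModelStats` is a hypothetical name, and cost tracking would additionally need the provider's pricing, which is not assumed here.

```python
import time
from collections import defaultdict
from typing import Callable


class ModelStats:
    """Accumulates call counts, failures, and total latency per model name."""

    def __init__(self) -> None:
        self.calls = defaultdict(int)
        self.failures = defaultdict(int)
        self.seconds = defaultdict(float)

    def timed(self, model: str, fn: Callable[[], object]) -> object:
        """Run fn(), attributing its duration and outcome to `model`."""
        start = time.perf_counter()
        self.calls[model] += 1
        try:
            return fn()
        except Exception:
            self.failures[model] += 1
            raise  # let the caller's fallback logic see the error
        finally:
            self.seconds[model] += time.perf_counter() - start

    def mean_latency(self, model: str) -> float:
        """Average seconds per call, or 0.0 if the model was never called."""
        calls = self.calls[model]
        return self.seconds[model] / calls if calls else 0.0
```

A call would then look like `stats.timed("bge-m3", lambda: client.search(query))`, where `client` is whatever object performs the search.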
