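Blizzard 84d1a1dd3a feat: RAG core chain: embedding (provider) + live Milvus connection + ingest/retrieval
mcp-go now wires up vector RAG: embedding (behind an OpenAI-compatible provider abstraction) plus a real Milvus connection,
with kb_ingest for ingestion and wiki_search doing real retrieval. The retriever node goes from stub to real without a single line changed.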

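- mcp-go internal/rag: embed.go (OpenAI-compatible /embeddings client) + milvus.go (real connection via
  milvus-sdk-go; the collection is created lazily from the dimension of the first embedding, given an
  AUTOINDEX/COSINE index, and loaded; insert and vector search) +
  rag.go (Engine: chunk → embed → insert / embed query → search; degrades if embedding or Milvus is missing)
- mcp-go gateway: new tool kb_ingest; wiki_search switched to the real implementation (RAG vector retrieval, filtered by kb, topK)
- mcp-go main: rag.Open reads the MILVUS_ADDR/EMBED_BASE_URL/EMBED_API_KEY/EMBED_MODEL environment variables
- gateway: POST /api/v1/kb/ingest → kb_ingest (for the knowledge-base page and scripts)
- scripts/mock_embeddings.py: deterministic lexical vectors (character + bigram hashing), to verify retrieval without a real key
- During development, embedding goes to an online API (the mock is used when no real key is available); see llm-provider-strategy
- Verification: all modules build ✓ + e2e PASS; live: 5 entries ingested → Milvus; the retriever node queries '向量数据库' (vector database)
  → recalls the Milvus entry → DeepSeek answers 'Milvus'; querying '知识图谱' (knowledge graph) → Neo4j (vector retrieval tells them apart correctly)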

Note: retrieval is currently vector-only; Bleve/Neo4j fusion + rerank + real semantic embeddings are follow-up work.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-10 17:07:36 +08:00
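
The chunk → embed → insert / embed query → search flow that the commit message attributes to rag.go can be illustrated with a toy, in-memory Python sketch. This is not the Go implementation: `ToyEngine`, `chunk`, `lexical_embed`, the 200-character chunk size, and the sample sentences are hypothetical names and values chosen for illustration. A plain list stands in for Milvus, and the embedding reuses the character/bigram hashing idea from scripts/mock_embeddings.py.

```python
import hashlib
import math

DIM = 256  # assumption: same dimension as the mock server, otherwise arbitrary


def lexical_embed(text: str) -> list:
    """Hash characters and character bigrams into a unit-length vector."""
    vec = [0.0] * DIM
    t = text.strip()
    for tok in list(t) + [t[i:i + 2] for i in range(len(t) - 1)]:
        h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def chunk(text: str, size: int = 200) -> list:
    """Naive fixed-width chunking (hypothetical; the real splitter is not shown)."""
    text = text.strip()
    return [text[i:i + size] for i in range(0, len(text), size)]


class ToyEngine:
    """In-memory stand-in for the ingest/search engine (illustration only)."""

    def __init__(self, embed=lexical_embed):
        self.embed = embed  # pass None to simulate a missing embedding provider
        self.rows = []      # list of (kb, chunk_text, vector)

    def ingest(self, kb: str, text: str) -> int:
        """Chunk, embed, and store; returns the number of chunks stored."""
        if self.embed is None:
            return 0  # degraded mode: nothing is stored
        pieces = chunk(text)
        for piece in pieces:
            self.rows.append((kb, piece, self.embed(piece)))
        return len(pieces)

    def search(self, kb: str, query: str, top_k: int = 3) -> list:
        """Embed the query and return the top_k (score, text) pairs within one kb."""
        if self.embed is None:
            return []  # degraded mode: no results
        q = self.embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)  # dot product of unit vectors = cosine
            for row_kb, text, vec in self.rows
            if row_kb == kb
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

For example, after `ToyEngine().ingest("wiki", ...)` with one sentence about Milvus and one about Neo4j, `search("wiki", "vector database", top_k=1)` returns the Milvus sentence, mirroring the live check described in the commit message. Constructing `ToyEngine(embed=None)` shows the degraded path: ingest stores nothing and search returns an empty list.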

66 lines
2.4 KiB
Python

#!/usr/bin/env python3
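"""Minimal OpenAI-compatible embeddings mock, for verifying the RAG pipeline without a real embedding key.

Deterministic "lexical vectors": hash the text's characters and character bigrams into a
fixed-dimension vector and normalize it. The more surface overlap two texts share, the higher
their cosine similarity (enough to demonstrate "a query recalls related documents").
Real semantics require plugging in a real embedding API.

  GET  /models     → 200
  POST /embeddings → {data:[{embedding:[...]}]}

Usage: python3 scripts/mock_embeddings.py 11888
"""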
import hashlib
import json
import math
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

DIM = 256  # fixed embedding dimension


def embed(text: str):
    vec = [0.0] * DIM
    t = text.strip()
    toks = list(t) + [t[i:i + 2] for i in range(len(t) - 1)]  # characters + character bigrams
    for tok in toks:
        # Hash each token into one of DIM buckets and count it there.
        h = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16)
        vec[h % DIM] += 1.0
    # L2-normalize; fall back to 1.0 so empty input does not divide by zero.
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class H(BaseHTTPRequestHandler):
    def log_message(self, *a):
        # Silence the default per-request access log.
        pass

    def do_GET(self):
        if self.path.endswith("/models"):
            body = json.dumps({"object": "list", "data": [{"id": "mock-embed"}]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def do_POST(self):
        n = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(n) or b"{}")
        inp = req.get("input", [])
        if isinstance(inp, str):
            inp = [inp]  # accept a single string as well as a list of strings
        data = [{"object": "embedding", "index": i, "embedding": embed(t)} for i, t in enumerate(inp)]
        body = json.dumps({"object": "list", "data": data, "model": req.get("model", "mock-embed")}).encode()
        sys.stderr.write(f"[mock-embed] /embeddings n={len(inp)} dim={DIM}\n")
        sys.stderr.flush()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 11888
    print(f"[mock-embed] listening on :{port} (dim={DIM})")
    HTTPServer(("127.0.0.1", port), H).serve_forever()
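
One way to exercise the mock from a script is a small standard-library client, sketched below. The helper names `build_request`, `fetch_embeddings`, `cosine`, and `demo` are hypothetical, and the `/v1` prefix in the example base URL is an assumption: the handler above answers any POST path, so any base URL pointing at the mock's host and port should behave the same.

```python
import json
import math
import urllib.request


def build_request(base_url, texts, model="mock-embed"):
    """Build a POST {base_url}/embeddings request in the OpenAI-compatible shape."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(base_url, texts, model="mock-embed"):
    """Send the request and return one vector per input text, in input order."""
    with urllib.request.urlopen(build_request(base_url, texts, model), timeout=5) as resp:
        body = json.load(resp)
    rows = sorted(body["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def demo(base_url="http://127.0.0.1:11888/v1"):
    """Needs the mock server running; prints query-to-document similarities."""
    query, doc_a, doc_b = fetch_embeddings(
        base_url,
        ["vector database", "Milvus is a vector database", "Neo4j stores a knowledge graph"],
    )
    print("query vs doc_a:", cosine(query, doc_a))
    print("query vs doc_b:", cosine(query, doc_b))
```

With the mock started via `python3 scripts/mock_embeddings.py 11888`, calling `demo()` prints the two similarities; the document with more character overlap with the query should score higher.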