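Semantic Search

The query side of the RAG pipeline provides hybrid search: approximate nearest neighbor (ANN) vector similarity search captures semantic closeness, BM25 keyword search captures exact terms, and Reciprocal Rank Fusion (RRF) merges the two rankings to produce the most relevant passages.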


Endpoint

POST /v1/rag/query

Runs a semantic search and returns the most relevant document passages together with their citation locations.

Request

POST /v1/rag/query
Authorization: Bearer {jwt_token}
X-Tenant-ID: acme-corp-001
Content-Type: application/json

{
  "query": "What are the termination conditions for the service agreement?",
  "top_k": 5,
  "search_mode": "hybrid",
  "filters": {
    "document_ids": ["contract-2025-001", "contract-2025-002"],
    "tags": ["legal", "contract"],
    "metadata": {
      "document_type": "legal_contract"
    }
  },
  "hybrid_config": {
    "vector_weight": 0.6,
    "bm25_weight": 0.4,
    "rrf_k": 60
  },
  "include_context": true,
  "context_window_tokens": 128
}

Response

{
  "query": "What are the termination conditions for the service agreement?",
  "passages": [
    {
      "chunk_id": "c042",
      "document_id": "contract-2025-001",
      "text": "Either party may terminate this Agreement upon 30 days written notice...",
      "context_before": "...as defined in Section 8 of this Agreement.",
      "context_after": "Termination for cause may occur immediately upon written notice...",
      "score": 0.9234,
      "vector_score": 0.8912,
      "bm25_score": 0.7654,
      "citation": {
        "document_id": "contract-2025-001",
        "document_title": "Service Agreement 2025-001",
        "page_number": 8,
        "section": "9. Termination",
        "headings": ["9. Termination", "9.1 Termination for Convenience"]
      }
    }
  ],
  "total_found": 12,
  "returned": 5,
  "query_vector_ms": 3,
  "search_ms": 18,
  "total_ms": 21
}
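
For callers that bypass the PHP client, the exchange above can be handled with PHP's built-in JSON and curl functions. The following is a minimal sketch, not the library's own method: buildQueryBody, postQuery, and summarizePassages are hypothetical helper names, the base URL is supplied by the caller, and only the field names come from the request and response shown above.

```php
<?php
declare(strict_types=1);

/** Build the JSON body for POST /v1/rag/query (field names from the request above). */
function buildQueryBody(string $query, int $topK = 5): string
{
    return json_encode([
        'query' => $query,
        'top_k' => $topK,
        'search_mode' => 'hybrid',
        'include_context' => true,
        'context_window_tokens' => 128,
    ], JSON_THROW_ON_ERROR);
}

/** Send the body to the endpoint; $baseUrl, $jwt and $tenantId come from the caller. */
function postQuery(string $baseUrl, string $jwt, string $tenantId, string $body): string
{
    $ch = curl_init($baseUrl . '/v1/rag/query');
    if ($ch === false) {
        throw new RuntimeException('Could not initialise curl');
    }
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => $body,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FAILONERROR => true,    // treat HTTP status >= 400 as a failure
        CURLOPT_HTTPHEADER => [
            'Authorization: Bearer ' . $jwt,
            'X-Tenant-ID: ' . $tenantId,
            'Content-Type: application/json',
        ],
    ]);
    $response = curl_exec($ch);
    if ($response === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $response;
}

/** Decode a response body and return one summary line per passage. */
function summarizePassages(string $responseJson): array
{
    $data = json_decode($responseJson, true, 512, JSON_THROW_ON_ERROR);
    $lines = [];
    foreach ($data['passages'] ?? [] as $passage) {
        $citation = $passage['citation'];
        $lines[] = sprintf(
            '%s, p.%d, %s (score %.4f)',
            $citation['document_title'],
            $citation['page_number'],
            $citation['section'],
            $passage['score'],
        );
    }
    return $lines;
}
```

Applied to the sample response above, summarizePassages returns a single line: "Service Agreement 2025-001, p.8, 9. Termination (score 0.9234)".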

Search Modes

Pure vector search (vector)

ANN search over a CAGRA index, suited to conceptual queries:

{
  "query": "renewable energy investment strategy",
  "search_mode": "vector",
  "top_k": 10
}

Pure BM25 search (bm25)

Traditional keyword search, suited to queries for exact terms:

{
  "query": "21 CFR 11.10(d) audit trail requirements",
  "search_mode": "bm25",
  "top_k": 10
}

Hybrid search (hybrid, recommended)

RRF-fused ranking that combines the strengths of both:

{
  "query": "What retention period is required for audit trails?",
  "search_mode": "hybrid",
  "hybrid_config": {
    "vector_weight": 0.6,
    "bm25_weight": 0.4,
    "rrf_k": 60
  }
}

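RRF Rank Fusion

RRF (Reciprocal Rank Fusion) fuses rankings using rank positions alone, so it needs no score normalization across retrievers; its only parameter is the constant k:

RRF_Score(d) = Σ_i [ 1 / (k + rank_i(d)) ]
  • k = 60 (the standard value; it keeps top-ranked results from dominating the fused score)
  • rank_i(d): the position of document d in the i-th ranked list (counting from 1)
  • A higher score means the document ranks near the top of several lists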
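
The formula above can be sketched in a few lines of PHP. This is an illustration under assumptions rather than the service's implementation: rrfFuse is a hypothetical helper, the chunk IDs are made up, and the optional per-list weights are only a guess at how vector_weight and bm25_weight might enter the sum, since the page does not say.

```php
<?php
declare(strict_types=1);

/**
 * Fuse ranked lists of IDs with Reciprocal Rank Fusion.
 *
 * @param array<string, list<string>> $rankings list name => IDs, best first
 * @param array<string, float>        $weights  optional per-list weights (assumption; default 1.0)
 * @return array<string, float> ID => fused score, highest first
 */
function rrfFuse(array $rankings, int $k = 60, array $weights = []): array
{
    $scores = [];
    foreach ($rankings as $name => $ids) {
        $weight = $weights[$name] ?? 1.0;
        foreach (array_values($ids) as $index => $id) {
            $rank = $index + 1; // ranks count from 1, as in the formula
            $scores[$id] = ($scores[$id] ?? 0.0) + $weight / ($k + $rank);
        }
    }
    arsort($scores); // highest score first, keys preserved
    return $scores;
}

// Two illustrative rankings: c042 leads both lists, c017 appears in both.
$fused = rrfFuse([
    'vector' => ['c042', 'c017', 'c108'],
    'bm25'   => ['c042', 'c311', 'c017'],
]);

foreach ($fused as $id => $score) {
    printf("%s %.5f\n", $id, $score);
}
```

The fused order is c042, c017, c311, c108: the two chunks found by both retrievers outrank the chunks found by only one. With k = 60 and unit weights a raw RRF score cannot exceed 1/61 per list, so the 0.9234 in the sample response is evidently on a different scale; the page does not say how the service rescales the fused score.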

PHP Client

use NextPDF\Enterprise\AiRag\RagClient;
use NextPDF\Enterprise\AiRag\QueryRequest;
use NextPDF\Enterprise\AiRag\SearchMode;
use NextPDF\Enterprise\AiRag\HybridConfig;
use NextPDF\Enterprise\AiRag\QueryFilter;

$client = RagClient::fromEnvironment();

$results = $client->query(
    QueryRequest::create(query: 'What are the termination conditions?')
        ->withTopK(5)
        ->withSearchMode(SearchMode::Hybrid)
        ->withHybridConfig(
            HybridConfig::create(
                vectorWeight: 0.6,
                bm25Weight: 0.4,
                rrfK: 60,
            )
        )
        ->withFilter(
            QueryFilter::create()
                ->documentIds(['contract-2025-001'])
                ->tags(['legal'])
        )
        ->withContextWindow(tokens: 128)
);

foreach ($results->passages() as $passage) {
    echo $passage->text();
    echo "\nSource: " . $passage->citation()->documentTitle();
    echo ' p.' . $passage->citation()->pageNumber();
    echo "\nScore: " . number_format($passage->score(), 4);
    echo "\n\n";
}

echo 'Query time: ' . $results->totalMs() . 'ms';

PHP Compatibility

This example uses PHP 8.5 syntax. If your environment runs PHP 8.1 or 7.4, use NextPDF Backport for a backward-compatible build.


Citation Format

Each search result carries complete citation information, which makes it suitable for direct use in LLM prompts:

use NextPDF\Enterprise\AiRag\CitationFormat; // assumed to sit in the same namespace as CitationFormatter
use NextPDF\Enterprise\AiRag\CitationFormatter;

$formatter = CitationFormatter::create();

// Format the passages as a context block for an LLM
$contextBlock = $formatter->formatForLlm(
    passages: $results->passages(),
    format: CitationFormat::AcademicInline, // [Source: document-title, p.8]
);

// Or as Markdown
$markdown = $formatter->formatAsMarkdown($results->passages());

// Or as JSON-LD citations
$jsonLd = $formatter->formatAsJsonLd($results->passages());
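
When working from the raw JSON response instead of the client objects, a context block in the inline style noted in the comment above ([Source: document-title, p.8]) could be assembled by hand. This is a sketch under assumptions: buildContextBlock is a hypothetical helper, and the actual output of formatForLlm may differ in layout.

```php
<?php
declare(strict_types=1);

/**
 * Join passages into one prompt-ready block, each followed by an inline source tag.
 *
 * @param list<array{text: string, citation: array{document_title: string, page_number: int}}> $passages
 */
function buildContextBlock(array $passages): string
{
    $blocks = [];
    foreach ($passages as $passage) {
        $citation = $passage['citation'];
        $blocks[] = sprintf(
            "%s\n[Source: %s, p.%d]",
            trim($passage['text']),
            $citation['document_title'],
            $citation['page_number'],
        );
    }
    // A blank line between passages keeps them visually separate in the prompt.
    return implode("\n\n", $blocks);
}
```

The input is the passages array of a decoded query response, for example json_decode($body, true)['passages'].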

Index Statistics

GET /v1/rag/index/stats

Request

GET /v1/rag/index/stats
Authorization: Bearer {jwt_token}
X-Tenant-ID: acme-corp-001

Response

{
  "tenant_id": "acme-corp-001",
  "total_documents": 1247,
  "total_chunks": 89423,
  "total_vectors": 89423,
  "index_size_bytes": 734003200,
  "vector_dimension": 1024,
  "index_type": "CAGRA",
  "bm25_index_size_bytes": 45678901,
  "last_updated": "2025-01-15T09:30:00Z",
  "query_count_24h": 3421,
  "avg_query_latency_ms": 22
}
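
The raw counters in a stats response are easier to track over time once reduced to a few ratios. A minimal sketch, assuming the response has been decoded with json_decode($body, true); summarizeIndexStats and its output keys are hypothetical and not part of the API.

```php
<?php
declare(strict_types=1);

/**
 * Derive a few ratios from a decoded /v1/rag/index/stats response.
 *
 * @param array<string, mixed> $stats decoded JSON, field names as in the sample above
 * @return array<string, float|int|bool>
 */
function summarizeIndexStats(array $stats): array
{
    $documents = max(1, $stats['total_documents']); // guard against an empty index
    $vectors = max(1, $stats['total_vectors']);

    return [
        'chunks_per_document' => round($stats['total_chunks'] / $documents, 1),
        'vector_index_mib' => round($stats['index_size_bytes'] / (1024 ** 2), 1),
        'bytes_per_vector' => intdiv($stats['index_size_bytes'], $vectors),
        // Every chunk should have exactly one vector once indexing has caught up.
        'chunks_match_vectors' => $stats['total_chunks'] === $stats['total_vectors'],
    ];
}
```

For the sample response above this gives about 71.7 chunks per document and a 700.0 MiB vector index, with chunk and vector counts in agreement.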

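Performance Specifications

Scenario  Metric
Hybrid search p50 (100K-vector index)
Hybrid search p99 (100K-vector index)
Hybrid search p50 (1M-vector index)
Query embedding latency (single query)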

Further Reading