AI Semantic Search
AI semantic search is an intelligent search method provided by Cloudreve based on Meilisearch. Unlike traditional full-text search, semantic search returns results based on the meaning and context of the query, rather than relying solely on keyword matching.
How It Works
Semantic search generates vector embeddings through LLM providers (such as OpenAI, Hugging Face, etc.), converting document content and search terms into semantic vectors. It then finds semantically relevant search results by comparing vector similarity.
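The comparison step can be illustrated with a minimal sketch. It assumes cosine similarity as the measure; the text above does not say which metric Meilisearch uses internally. The three-dimensional vectors and the helper names `cosine_similarity` and `rank` are made up for illustration. Real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank(query_vec, docs):
    """Sort documents by similarity to the query, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, invented for this example only.
query = [0.9, 0.1, 0.0]  # stands in for the query "travel plan"
documents = {
    "vacation itinerary": [0.8, 0.2, 0.1],
    "server backup log": [0.0, 0.1, 0.9],
}

ranked = rank(query, documents)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Under these made-up vectors, the document whose vector points in nearly the same direction as the query ranks first, even though it shares no keyword with it.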
When semantic search is enabled, Cloudreve returns both full-text search and semantic search results by default (i.e., hybrid search), complementing each other to provide a more accurate search experience.
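For reference, a hybrid query against Meilisearch's search endpoint carries a `hybrid` object alongside the keyword query. The sketch below only builds such a request body and sends nothing. How Cloudreve actually issues its queries is not described here, so the embedder name `default`, the ratio of `0.5`, and the helper name `build_hybrid_query` are assumptions for illustration.

```python
import json


def build_hybrid_query(query, embedder="default", semantic_ratio=0.5, limit=20):
    """Build a Meilisearch search body that mixes keyword and semantic results.

    semantic_ratio=0.0 means keyword-only results and 1.0 means
    semantic-only results; values in between blend the two.
    """
    if not 0.0 <= semantic_ratio <= 1.0:
        raise ValueError("semantic_ratio must be between 0.0 and 1.0")
    return {
        "q": query,
        "limit": limit,
        "hybrid": {
            "embedder": embedder,  # assumed embedder name
            "semanticRatio": semantic_ratio,
        },
    }


payload = build_hybrid_query("travel plan")
print(json.dumps(payload, indent=2))
```

A body like this would be posted to the index's search route in Meilisearch; the index name depends on the deployment and is not given in this document.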
Full-text Search vs AI Semantic Search
| | Full-text Search | AI Semantic Search |
|---|---|---|
| Matching | Exact keyword matching | Semantic similarity matching |
| Use Case | User knows the exact keywords, needs precise matching | Query is vague or lengthy, needs semantic understanding |
| Resource Usage | Low | May incur third-party model costs |
| Search Example | Searching "travel plan" only matches documents containing that keyword | Searching "travel plan" also matches documents containing "trip arrangement", "vacation itinerary", and other semantically related terms |
Prerequisites
- Full-text search is already enabled in Cloudreve.
- Choose a suitable embedding model and provider, such as OpenAI, Hugging Face, etc. For guidance on choosing a model, refer to the Meilisearch documentation.
Enable AI Hybrid Search
In the Cloudreve admin panel, navigate to Filesystem -> Full-text Search -> AI Semantic Search, enable AI semantic search, and fill in the embedding configuration for your model provider. The following example uses OpenAI.
Currently, OpenAI offers the following three mainstream embedding models:
- text-embedding-3-large: 3,072 dimensions
- text-embedding-3-small: 1,536 dimensions
- text-embedding-ada-002: 1,536 dimensions
Fill in the following embedding configuration in Cloudreve:
{
  "source": "openAi",
  "apiKey": "<OpenAI API Key>",
  "dimensions": 1536,
  "model": "text-embedding-3-small"
}
Where:
- apiKey is your OpenAI API key.
- dimensions is the dimension of the embedding model, e.g., 1536.
- model is the name of the embedding model, e.g., text-embedding-3-small.
Reference: Meilisearch OpenAI Embedding Configuration
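Before pasting a configuration, it can help to check that `dimensions` agrees with the chosen model. The following is a rough sketch that uses only the model sizes listed above. The helper name `check_embedder_config` is hypothetical and is not part of Cloudreve or Meilisearch. The check is deliberately strict, so a configuration that intentionally requests a smaller output size from a model that supports it would also be reported as a mismatch.

```python
import json

# Dimensions as listed in this document for OpenAI's embedding models.
OPENAI_MODEL_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "text-embedding-ada-002": 1536,
}

REQUIRED_FIELDS = ("source", "apiKey", "dimensions", "model")


def check_embedder_config(raw):
    """Return a list of problems found in a JSON embedding configuration."""
    config = json.loads(raw)
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in config]

    model = config.get("model")
    if model is not None:
        expected = OPENAI_MODEL_DIMENSIONS.get(model)
        if expected is None:
            problems.append(f"model not in the list above: {model}")
        elif config.get("dimensions") != expected:
            problems.append(
                f"dimensions is {config.get('dimensions')}, listed size for {model} is {expected}"
            )
    return problems


example = """
{
  "source": "openAi",
  "apiKey": "<OpenAI API Key>",
  "dimensions": 1536,
  "model": "text-embedding-3-small"
}
"""

problems = check_embedder_config(example)
print(problems or "configuration looks consistent")
```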
After saving the settings, there is no need to rebuild the index. Meilisearch automatically generates embedding vectors for the documents already in the index, using the configured embedding model.