Vector Search
What is Vector Search?
Vector Search is an XAP extension that enables semantic similarity searches on high-dimensional vector embeddings. Instead of exact keyword matching, you can find conceptually similar items based on their vector representations.
Core Concepts
Vector Embeddings
Vector embeddings are numerical representations of data (text, images, etc.) converted into high-dimensional floating-point arrays. Similar concepts have vectors that are geometrically close to each other.
Similarity Metrics
Vector Search uses Cosine Similarity to measure how alike two vectors are:
- Score = 1.0: Perfect match (identical vectors)
- Score = 0.5: Moderately similar
- Score = 0.0: Completely unrelated
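For reference, the standard definition of cosine similarity between two d-dimensional vectors is given below. This is the textbook formula, not a statement about how XAP computes or rescales its reported score; the symbols a, b, and d are introduced here for illustration only.

```latex
\[
\operatorname{cos\_sim}(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{d} a_i b_i}{\sqrt{\sum_{i=1}^{d} a_i^2} \, \sqrt{\sum_{i=1}^{d} b_i^2}}
\]
```

As a small worked case, a = (1, 0) and b = (1, 1) give a dot product of 1 and norms of 1 and √2, so the similarity is 1/√2 ≈ 0.707. The raw formula can also produce negative values for vectors that point in opposing directions; how such values are reported is not covered by the list above.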
k-Nearest Neighbors (kNN)
The algorithm finds the k most similar vectors to your query vector. For example, "find the 10 most similar documents" means searching for k=10 nearest neighbors. XAP uses efficient HNSW (Hierarchical Navigable Small World) indexing powered by Apache Lucene.
Distributed Search
When your data is partitioned across multiple XAP nodes, Vector Search automatically:
- Executes kNN search in parallel on each partition
- Merges results from all partitions
- Returns globally ranked top-k results
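The merge step can be pictured with a minimal sketch like the one below. It is not the XAP implementation: `ScoredResult` here is a hypothetical stand-in holding a UID and a score, and `TopKMerger.mergeTopK` is an illustrative name. The sketch assumes that a higher score means a closer match.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical (uid, score) pair used only for this illustration. */
final class ScoredResult {
    final String uid;
    final float score;

    ScoredResult(String uid, float score) {
        this.uid = uid;
        this.score = score;
    }
}

final class TopKMerger {
    /** Merges per-partition result lists and keeps the k highest scores. */
    static List<ScoredResult> mergeTopK(List<List<ScoredResult>> partitions, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        Comparator<ScoredResult> byScore = Comparator.comparingDouble(r -> r.score);
        // Min-heap on score: the head is the weakest result kept so far.
        PriorityQueue<ScoredResult> heap = new PriorityQueue<>(byScore);
        for (List<ScoredResult> partition : partitions) {
            for (ScoredResult candidate : partition) {
                if (heap.size() < k) {
                    heap.add(candidate);
                } else if (candidate.score > heap.peek().score) {
                    heap.poll();          // drop the current weakest
                    heap.add(candidate);  // keep the stronger candidate
                }
            }
        }
        List<ScoredResult> merged = new ArrayList<>(heap);
        merged.sort(byScore.reversed()); // highest score first
        return merged;
    }
}
```

Because each partition contributes at most k candidates, the heap never has to examine more than k × (number of partitions) entries.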
Setup & Configuration
Step 1: Add Dependency
Add the Vector Search module to your pom.xml:
<dependency>
    <groupId>org.openspaces</groupId>
    <artifactId>xap-vector-search</artifactId>
    <version>17.2.2</version>
</dependency>
Step 2: Annotate Vector Fields
Mark vector fields with the @SpaceVectorIndex annotation:
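import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceId;
import org.openspaces.vectorsearch.SpaceVectorIndex;

@SpaceClass
public class Article {
    private String id;
    private String title;
    private float[] embedding;

    @SpaceId
    public String getId() {
        return id;
    }

    // The declared dimension must equal the length of every stored vector
    @SpaceVectorIndex(dimension = 384)
    public float[] getEmbedding() {
        return embedding;
    }

    // Setters and remaining getters omitted for brevity
}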
The dimension must match your embedding model:
- 384 for all-MiniLM-L6-v2
- 768 for all-mpnet-base-v2
- 1536 for OpenAI text-embedding-3-small
- 3072 for OpenAI text-embedding-3-large
Step 3: Configure Space
Register the Vector Search Query Extension with your space:
import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.EmbeddedSpaceConfigurer;
import org.openspaces.vectorsearch.VectorSearchQueryExtensionProvider;

EmbeddedSpaceConfigurer spaceConfigurer =
    new EmbeddedSpaceConfigurer("mySpace");
// Register the vector search extension
spaceConfigurer.addQueryExtensionProvider(
    new VectorSearchQueryExtensionProvider());
GigaSpace gigaSpace = new GigaSpaceConfigurer(spaceConfigurer).gigaSpace();
Alternatively, register the provider in a Spring XML configuration:
<bean id="space" class="org.openspaces.core.space.EmbeddedSpaceFactoryBean">
    <property name="spaceName" value="vector-space"/>
    <property name="queryExtensionProviders">
        <list>
            <bean class="org.openspaces.vectorsearch.VectorSearchQueryExtensionProvider"/>
        </list>
    </property>
</bean>
Basic Usage
Simple Vector Search
Find the 10 most similar articles:
import org.openspaces.vectorsearch.VectorQuery;
import com.j_spaces.core.client.SQLQuery;
// Create a query vector (e.g., from an embedding model)
float[] queryVector = new float[] {0.1f, 0.2f, ..., -0.3f}; // 384 dimensions
// Create vector query with k=10
VectorQuery vq = new VectorQuery(queryVector, 10);
// Execute SQL query with vector:match predicate
SQLQuery<Article> query =
new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);
// Get results
Article[] results = gigaSpace.readMultiple(query);
Search with Filters
Combine vector similarity with traditional predicates:
float[] queryVector = ...;
SQLQuery<Article> query = new SQLQuery<>(
Article.class,
"category = ? AND publishDate > ? AND embedding vector:match ?");
query.setParameter(1, "Technology");
query.setParameter(2, yesterday);
query.setParameter(3, new VectorQuery(queryVector, 10));
Article[] results = gigaSpace.readMultiple(query);
Distributed Search
For partitioned spaces, use DistributedVectorSearch for global ranking:
import org.openspaces.vectorsearch.DistributedVectorSearch;
SQLQuery<Article> query =
new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, new VectorQuery(queryVector, 10));
// Executes on all partitions, merges results, returns top 10
List<Article> results = DistributedVectorSearch
.search(gigaSpace, Article.class, query);
Always use DistributedVectorSearch for clustered or partitioned spaces to ensure globally correct ranking.
Advanced Features
Understanding k (Number of Results)
The k parameter in VectorQuery controls the maximum number of results returned:
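// k = 10: the vector query returns at most 10 results
VectorQuery vq = new VectorQuery(queryVector, 10);
SQLQuery<Article> query =
    new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);
// The readMultiple limit (50) is larger than k, so it has no effect
Article[] results = gigaSpace.readMultiple(query, 50);
// results contains at most 10 entries, not 50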
The k parameter takes precedence over readMultiple() maxResults. This ensures the HNSW algorithm knows exactly how many neighbors to retrieve.
Handling Deleted Objects (Race Conditions)
Between index search and object fetch, entries may be deleted. DistributedVectorSearch automatically handles this with a backfill strategy:
- Requests k × 2 results from the index
- Fetches objects by UID
- If an object is deleted (null), fetches the next candidate
- Returns up to k available objects
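The backfill steps above can be sketched as follows. This is an illustration of the strategy as described, not the code inside DistributedVectorSearch: `BackfillFetcher`, `fetchTopK`, and the `fetchByUid` lookup are hypothetical names, and the lookup is assumed to return null for an entry that was deleted after the index search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class BackfillFetcher {
    /**
     * Walks the ranked candidate UIDs (typically k × 2 of them) and keeps
     * the first k that still resolve to an object.
     */
    static <T> List<T> fetchTopK(List<String> rankedUids, int k,
                                 Function<String, T> fetchByUid) {
        List<T> found = new ArrayList<>();
        for (String uid : rankedUids) {
            if (found.size() >= k) {
                break; // already have k live objects
            }
            T object = fetchByUid.apply(uid);
            if (object != null) {
                found.add(object); // still present in the space
            }
            // A null means the entry was deleted: fall through to the next candidate.
        }
        return found;
    }
}
```

Over-fetching by a factor of two means up to k deleted candidates can be skipped before the result falls short of k.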
Vector Dimensionality
Common embedding dimensions:
| Model | Dimension | Use Case |
|---|---|---|
| all-MiniLM-L6-v2 | 384 | Lightweight, fast embeddings |
| all-mpnet-base-v2 | 768 | Balanced quality and performance |
| OpenAI text-embedding-3-small | 1536 | High-quality semantic embeddings |
| OpenAI text-embedding-3-large | 3072 | Top-tier semantic understanding |
Index Storage
Vector indexes are stored using Apache Lucene with memory-mapped I/O (MMapDirectory):
- Off-heap storage (does not consume Java heap)
- Efficient disk caching
- Scalability to very large indexes
Performance & Scalability
Performance Characteristics
| Metric | Value | Notes |
|---|---|---|
| Search Complexity | O(log n) average | HNSW algorithm |
| Index Size | ~4 bytes per dimension per entry | 384-dim vectors ≈ 1.5 KB each |
| Network Transfer | ~50–100 bytes per result | UID + score only, not full object |
| Insertion | O(log n) operations | Batched auto-commits every 1000 changes |
Scalability Tips
- Use Partitioning: Split vectors across nodes for parallel search.
- Optimize k: Return only what you need (k=10 is much faster than k=1000).
- Batch Operations: Write vectors in batches to reduce index commits.
- Monitor Index: Track uncommitted changes and trigger commits when needed.
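As an illustration of the batching tip, the sketch below splits a list of objects into fixed-size chunks. `Batches.partition` is a hypothetical helper, not part of XAP. A batch size of 1000 is used in the usage note only because it matches the auto-commit interval listed in the table above; whether that alignment helps in practice is an assumption to verify with your own measurements.

```java
import java.util.ArrayList;
import java.util.List;

final class Batches {
    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch could then be written in a single call, for example `gigaSpace.writeMultiple(batch.toArray(new Article[0]))`, instead of one `write` per object.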
Network Efficiency
For distributed search with k=10 and 3 partitions:
- Network transfer: ~3 KB (30 ScoredResults × ~100 bytes)
- Sending full objects with vectors instead: ~45 KB
- 93% reduction in network bandwidth
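The figures above can be reproduced with the arithmetic below. The ~1.5 KB full-object size is an assumption taken from the 384-dimension estimate in the performance table (384 × 4 bytes); real objects also carry their other fields, so the saving may be larger.

```latex
\begin{align*}
\text{scored results} &= 3 \times 10 = 30 \\
\text{UID + score transfer} &\approx 30 \times 100\,\text{B} = 3000\,\text{B} \approx 3\,\text{KB} \\
\text{full-object transfer} &\approx 30 \times 1.5\,\text{KB} = 45\,\text{KB} \\
\text{reduction} &= 1 - \frac{3}{45} \approx 0.933 \approx 93\%
\end{align*}
```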
Common Use Cases
Recommendation Engine
Find similar products or articles based on user preferences:
// User liked this article; find similar ones
Article liked = gigaSpace.readById(Article.class, likedArticleId);
VectorQuery vq = new VectorQuery(liked.getEmbedding(), 5);
SQLQuery<Article> query =
new SQLQuery<>(Article.class,
"category = ? AND embedding vector:match ?");
query.setParameter(1, liked.getCategory());
query.setParameter(2, vq);
List<Article> recommendations =
DistributedVectorSearch.search(gigaSpace, Article.class, query);
Semantic Search
Search by meaning, not keywords:
// Get embedding for user query (using LLM or embedding model)
String userQuery = "best machine learning frameworks";
float[] queryEmbedding = embeddingModel.embed(userQuery);
VectorQuery vq = new VectorQuery(queryEmbedding, 20);
SQLQuery<Article> query =
new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);
List<Article> results =
DistributedVectorSearch.search(gigaSpace, Article.class, query);
Duplicate Detection
Find near-duplicate documents:
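// Flag candidates whose cosine similarity to the original exceeds 0.95.
// `original` is the article being checked, and `query` is a vector:match
// query built from original.getEmbedding(), as in the earlier examples.
// computeCosineSimilarity is assumed to be an application-side helper.
List<Article> candidates =
    DistributedVectorSearch.search(gigaSpace, Article.class, query);
for (Article candidate : candidates) {
    if (candidate.getId().equals(original.getId())) {
        continue; // the original always matches itself
    }
    float similarity = computeCosineSimilarity(
        original.getEmbedding(), candidate.getEmbedding());
    if (similarity > 0.95f) {
        // Mark as potential duplicate
    }
}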
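The `computeCosineSimilarity` call in the example above is not defined on this page. One possible application-side version is sketched below; the `VectorMath` class name is hypothetical, and returning 0 for a zero-length vector is a choice made for this sketch, since cosine similarity is undefined in that case.

```java
final class VectorMath {
    /** Returns the cosine of the angle between a and b, in the range [-1, 1]. */
    static float computeCosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "dimension mismatch: " + a.length + " vs " + b.length);
        }
        // Accumulate in double to limit rounding error on long vectors.
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0f; // undefined for zero vectors; treated as unrelated here
        }
        return (float) (dot / (Math.sqrt(normA) * Math.sqrt(normB)));
    }
}
```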
Frequently Asked Questions
Do I need to normalize vectors?
Cosine similarity depends only on the direction of the vectors, not their magnitude, so scaling a vector does not change its score. While XAP works with non-normalized vectors, it is best practice to normalize them before indexing for better performance and consistency with embedding model outputs.
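A minimal sketch of L2 normalization before writing an object is shown below. `VectorNormalizer.l2Normalize` is a hypothetical helper, not an XAP API, and leaving a zero vector as all zeros is a choice made for this sketch.

```java
final class VectorNormalizer {
    /** Returns a copy of v scaled to unit length (L2 norm of 1). */
    static float[] l2Normalize(float[] v) {
        double sumOfSquares = 0.0;
        for (float x : v) {
            sumOfSquares += (double) x * x;
        }
        double norm = Math.sqrt(sumOfSquares);
        float[] normalized = new float[v.length];
        if (norm == 0.0) {
            return normalized; // a zero vector has no direction; left as zeros
        }
        for (int i = 0; i < v.length; i++) {
            normalized[i] = (float) (v[i] / norm);
        }
        return normalized;
    }
}
```

For example, the vector (3, 4) has length 5 and normalizes to (0.6, 0.8).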
What happens if I index vectors with wrong dimensions?
XAP validates dimensions at index time. If you try to index a vector with 768 dimensions into a field declared as dimension=384, you will get an IllegalArgumentException. Ensure your embedding model output matches the annotated dimension.
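If you prefer to catch a mismatch before the write reaches the space, a small application-side guard along these lines could be used. `EmbeddingChecks.requireDimension` is a hypothetical helper; it only mirrors the exception type described above and is not XAP's own validation code.

```java
final class EmbeddingChecks {
    /** Returns the embedding unchanged if its length equals expectedDimension. */
    static float[] requireDimension(float[] embedding, int expectedDimension) {
        if (embedding == null) {
            throw new IllegalArgumentException("embedding is null");
        }
        if (embedding.length != expectedDimension) {
            throw new IllegalArgumentException(
                "expected " + expectedDimension + " dimensions but got " + embedding.length);
        }
        return embedding;
    }
}
```

A call such as `article.setEmbedding(EmbeddingChecks.requireDimension(vector, 384))` would then fail fast in application code when the embedding model and the annotation disagree.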
Can I search multiple vector fields?
XAP currently supports one vector field per query. For multiple semantic dimensions, either combine embeddings into a single vector or execute separate queries.
How do I update vectors?
Vectors are indexed automatically on write operations. Simply write a new version of the object with an updated vector:
Article updated = gigaSpace.readById(Article.class, "article-1");
updated.setEmbedding(newVector);
gigaSpace.write(updated); // Vector index updated automatically
What about memory usage?
Vectors are stored in Lucene indexes on disk (with memory mapping), not in XAP's heap. One million 384-dimensional vectors requires roughly 1.5 GB on disk, with the working set in memory determined by the OS page cache.
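The 1.5 GB figure follows from 4 bytes per dimension per entry. The sketch below reproduces that estimate; `IndexSizeEstimate.rawVectorBytes` is a hypothetical helper, and it counts raw vector data only, so the HNSW graph links and other index structures add to the total.

```java
final class IndexSizeEstimate {
    /** Raw float storage for vectorCount vectors of the given dimension, in bytes. */
    static long rawVectorBytes(long vectorCount, int dimension) {
        // Float.BYTES is 4: one 32-bit float per dimension.
        return vectorCount * dimension * Float.BYTES;
    }
}
```

For one million 384-dimensional vectors this gives 1,536,000,000 bytes, which is roughly 1.5 GB.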
Can I combine vector search with transactions?
Vector indexes are updated atomically with the main data. Reading from a vector index respects space snapshots, ensuring consistent results.
What is the difference between VectorQuery and DistributedVectorSearch?
- VectorQuery: Low-level building block; searches the local partition's index.
- DistributedVectorSearch: High-level API; executes on all partitions, merges results, and returns global top-k.
Use DistributedVectorSearch for partitioned spaces.
Can I filter results after vector search?
Yes. Use SQL predicates in the query itself for efficient pre-filtering:
"category = 'tech' AND publishDate > ? AND embedding vector:match ?"
Filtering in the query is more efficient than fetching all results and filtering in application code.