XAP

Vector Search

What is Vector Search?

Vector Search is an XAP extension that enables semantic similarity searches on high-dimensional vector embeddings. Instead of exact keyword matching, you can find conceptually similar items based on their vector representations.

Core Concepts

Vector Embeddings

Vector embeddings are numerical representations of data (text, images, etc.) converted into high-dimensional floating-point arrays. Similar concepts have vectors that are geometrically close to each other.

Similarity Metrics

Vector Search uses Cosine Similarity to measure how alike two vectors are:

  • Score = 1.0: Perfect match (identical vectors)
  • Score = 0.5: Moderately similar
  • Score = 0.0: Completely unrelated
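For intuition, here is a minimal sketch of how cosine similarity can be computed for two float arrays. The class and method names are illustrative and are not part of the XAP API. Raw cosine similarity ranges from -1 to 1; how XAP maps it onto the scores listed above is not covered by this sketch.

```java
final class CosineSimilarityExample {

    /** Returns the cosine of the angle between a and b (range -1.0 to 1.0). */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v = {1f, 2f, 3f};
        // Same direction: the result is (up to rounding) the maximum score
        System.out.println(cosine(v, v));
        // Orthogonal vectors: the dot product is zero
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f}));
    }
}
```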

k-Nearest Neighbors (kNN)

The algorithm finds the k most similar vectors to your query vector. For example, "find the 10 most similar documents" means searching for k=10 nearest neighbors. XAP uses efficient HNSW (Hierarchical Navigable Small World) indexing powered by Apache Lucene.

Distributed Search

When your data is partitioned across multiple XAP nodes, Vector Search automatically:

  1. Executes kNN search in parallel on each partition
  2. Merges results from all partitions
  3. Returns globally ranked top-k results
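The merge step above can be sketched in plain Java as follows. This is only an illustration of the idea, not XAP's internal implementation; ScoredHit and mergeTopK are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class TopKMergeExample {

    /** Hypothetical per-partition hit: an entry identifier plus its similarity score. */
    static final class ScoredHit {
        final String uid;
        final float score;

        ScoredHit(String uid, float score) {
            this.uid = uid;
            this.score = score;
        }
    }

    /** Merges per-partition result lists and keeps the k highest-scoring hits. */
    static List<ScoredHit> mergeTopK(List<List<ScoredHit>> perPartition, int k) {
        List<ScoredHit> all = new ArrayList<>();
        for (List<ScoredHit> hits : perPartition) {
            all.addAll(hits);
        }
        // Sort by score, highest first
        all.sort((x, y) -> Float.compare(y.score, x.score));
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        List<List<ScoredHit>> partitions = List.of(
            List.of(new ScoredHit("a", 0.9f), new ScoredHit("b", 0.4f)),
            List.of(new ScoredHit("c", 0.7f)));
        for (ScoredHit hit : mergeTopK(partitions, 2)) {
            System.out.println(hit.uid + " " + hit.score);
        }
    }
}
```

Each partition only needs to return its own top k hits, because no entry outside a partition's local top k can appear in the global top k.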

Setup & Configuration

Step 1: Add Dependency

Add the Vector Search module to your pom.xml:

<dependency>
    <groupId>org.openspaces</groupId>
    <artifactId>xap-vector-search</artifactId>
    <version>17.2.2</version>
</dependency>

Step 2: Annotate Vector Fields

Mark vector fields with the @SpaceVectorIndex annotation:

import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceId;
import org.openspaces.vectorsearch.SpaceVectorIndex;

@SpaceClass
public class Article {
    private String id;
    private String title;
    private float[] embedding;

    @SpaceId
    public String getId() {
        return id;
    }

    @SpaceVectorIndex(dimension = 384)
    public float[] getEmbedding() {
        return embedding;
    }
}

The dimension must match your embedding model:

  • 384 for all-MiniLM-L6-v2
  • 768 for all-mpnet-base-v2
  • 1536 for OpenAI text-embedding-3-small
  • 3072 for OpenAI text-embedding-3-large

Step 3: Configure Space

Register the Vector Search Query Extension with your space:

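import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.EmbeddedSpaceConfigurer;
import org.openspaces.vectorsearch.VectorSearchQueryExtensionProvider;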

EmbeddedSpaceConfigurer spaceConfigurer =
    new EmbeddedSpaceConfigurer("mySpace");

// Register vector search extension
spaceConfigurer.addQueryExtensionProvider(
    new VectorSearchQueryExtensionProvider());

GigaSpace gigaSpace = new GigaSpaceConfigurer(spaceConfigurer).gigaSpace();

Using a Processing Unit (PU):

<bean id="space" class="org.openspaces.core.space.EmbeddedSpaceFactoryBean">
    <property name="spaceName" value="vector-space"/>
    <property name="queryExtensionProviders">
        <list>
            <bean class="org.openspaces.vectorsearch.VectorSearchQueryExtensionProvider"/>
        </list>
    </property>
</bean>

Basic Usage

Simple Vector Search

Find the 10 most similar articles:

import org.openspaces.vectorsearch.VectorQuery;
import com.j_spaces.core.client.SQLQuery;

// Create a query vector (e.g., from an embedding model)
float[] queryVector = new float[] {0.1f, 0.2f, ..., -0.3f};  // 384 dimensions

// Create vector query with k=10
VectorQuery vq = new VectorQuery(queryVector, 10);

// Execute SQL query with vector:match predicate
SQLQuery<Article> query =
    new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);

// Get results
Article[] results = gigaSpace.readMultiple(query);

Search with Filters

Combine vector similarity with traditional predicates:

float[] queryVector = ...;

SQLQuery<Article> query = new SQLQuery<>(
    Article.class,
    "category = ? AND publishDate > ? AND embedding vector:match ?");

query.setParameter(1, "Technology");
query.setParameter(2, yesterday);
query.setParameter(3, new VectorQuery(queryVector, 10));

Article[] results = gigaSpace.readMultiple(query);

Distributed Search

For partitioned spaces, use DistributedVectorSearch for global ranking:

import java.util.List;
import org.openspaces.vectorsearch.DistributedVectorSearch;

SQLQuery<Article> query =
    new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, new VectorQuery(queryVector, 10));

// Executes on all partitions, merges results, returns top 10
List<Article> results = DistributedVectorSearch
    .search(gigaSpace, Article.class, query);

Always use DistributedVectorSearch for clustered or partitioned spaces to ensure globally correct ranking.

Advanced Features

Understanding k (Number of Results)

The k parameter in VectorQuery controls the maximum number of results returned:

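// Returns at most 10 results
VectorQuery vq = new VectorQuery(queryVector, 10);
SQLQuery<Article> query =
    new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);

// The readMultiple limit (50) is ignored if k is smaller
Article[] results = gigaSpace.readMultiple(query, 50);
// Returns at most 10 results, not 50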

The k parameter takes precedence over readMultiple() maxResults. This ensures the HNSW algorithm knows exactly how many neighbors to retrieve.

Handling Deleted Objects (Race Conditions)

Between index search and object fetch, entries may be deleted. DistributedVectorSearch automatically handles this with a backfill strategy:

  1. Requests k × 2 results from the index
  2. Fetches objects by UID
  3. If an object is deleted (null), fetches the next candidate
  4. Returns up to k available objects
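The backfill steps above can be sketched as a simple loop. This is an illustration of the strategy only, not the actual DistributedVectorSearch code; fetchWithBackfill and the fetchByUid lookup function are hypothetical names, and the ranked list stands in for the k × 2 candidates returned by the index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class BackfillExample {

    /**
     * Walks candidate UIDs in ranked order, skipping any whose object is gone,
     * until k live objects are collected or the candidates run out.
     */
    static <T> List<T> fetchWithBackfill(List<String> rankedUids,
                                         Function<String, T> fetchByUid,
                                         int k) {
        List<T> results = new ArrayList<>();
        for (String uid : rankedUids) {
            if (results.size() == k) {
                break;
            }
            T obj = fetchByUid.apply(uid);
            if (obj != null) {
                results.add(obj);   // still present, keep it
            }
            // null means the entry was deleted; fall through to the next candidate
        }
        return results;
    }

    public static void main(String[] args) {
        // "u2" was deleted between the index search and the fetch
        Map<String, String> store = new HashMap<>();
        store.put("u1", "Article 1");
        store.put("u3", "Article 3");
        store.put("u4", "Article 4");

        List<String> candidates = List.of("u1", "u2", "u3", "u4"); // k x 2 candidates for k = 2
        System.out.println(fetchWithBackfill(candidates, store::get, 2));
    }
}
```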

Vector Dimensionality

Common embedding dimensions:

Model | Dimension | Use Case
all-MiniLM-L6-v2 | 384 | Lightweight, fast embeddings
all-mpnet-base-v2 | 768 | Balanced quality and performance
OpenAI text-embedding-3-small | 1536 | High-quality semantic embeddings
OpenAI text-embedding-3-large | 3072 | Top-tier semantic understanding

Index Storage

Vector indexes are stored using Apache Lucene with memory-mapped I/O (MMapDirectory).

Performance & Scalability

Performance Characteristics

Metric | Value | Notes
Search Complexity | O(log n) average | HNSW algorithm
Index Size | ~4 bytes per dimension per entry | 384-dim vectors ≈ 1.5 KB each
Network Transfer | ~50–100 bytes per result | UID + score only, not full object
Insertion | O(log n) operations | Batched auto-commits every 1000 changes

Scalability Tips

  • Use Partitioning: Split vectors across nodes for parallel search.
  • Optimize k: Return only what you need (k=10 is much faster than k=1000).
  • Batch Operations: Write vectors in batches to reduce index commits.
  • Monitor Index: Track uncommitted changes and trigger commits when needed.

Network Efficiency

For distributed search with k=10 and 3 partitions:

  • Network transfer: ~3 KB (30 ScoredResults × ~100 bytes)
  • vs. Sending full objects with vectors: ~45 KB
  • 93% reduction in network bandwidth
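The figures above can be reproduced with a little arithmetic. The sketch below assumes 384-dimensional float vectors and ~100 bytes per scored result, and counts only the vector payload of a full object; the class and method names are illustrative.

```java
final class NetworkEstimateExample {

    /** Bytes sent when each partition returns k (UID, score) pairs. */
    static long scoredResultBytes(int k, int partitions, int bytesPerResult) {
        return (long) k * partitions * bytesPerResult;
    }

    /** Bytes sent when each partition returns k full vectors of the given dimension. */
    static long fullVectorBytes(int k, int partitions, int dimension) {
        return (long) k * partitions * dimension * Float.BYTES;
    }

    /** Percentage saved by sending the smaller payload instead of the larger one. */
    static double reductionPercent(long smaller, long larger) {
        return 100.0 * (larger - smaller) / larger;
    }

    public static void main(String[] args) {
        long scored = scoredResultBytes(10, 3, 100);   // 3000 bytes, about 3 KB
        long full = fullVectorBytes(10, 3, 384);       // 46080 bytes, about 45 KB
        System.out.printf("scored=%d full=%d reduction=%.1f%%%n",
            scored, full, reductionPercent(scored, full));
    }
}
```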

Common Use Cases

Recommendation Engine

Find similar products or articles based on user preferences:

// User liked this article; find similar ones
// (likedArticleId is the ID of the article the user liked)
Article liked = gigaSpace.readById(Article.class, likedArticleId);
VectorQuery vq = new VectorQuery(liked.getEmbedding(), 5);
SQLQuery<Article> query =
    new SQLQuery<>(Article.class,
        "category = ? AND embedding vector:match ?");
query.setParameter(1, liked.getCategory());
query.setParameter(2, vq);

List<Article> recommendations =
    DistributedVectorSearch.search(gigaSpace, Article.class, query);

Semantic Search

Search by meaning, not keywords:

// Get embedding for user query (using LLM or embedding model)
String userQuery = "best machine learning frameworks";
float[] queryEmbedding = embeddingModel.embed(userQuery);

VectorQuery vq = new VectorQuery(queryEmbedding, 20);
SQLQuery<Article> query =
    new SQLQuery<>(Article.class, "embedding vector:match ?");
query.setParameter(1, vq);

List<Article> results =
    DistributedVectorSearch.search(gigaSpace, Article.class, query);

Duplicate Detection

Find near-duplicate documents:

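// Find documents with >0.95 similarity.
// Assumes `original` is the article being checked and `query` is a
// vector:match query built from original.getEmbedding().
List<Article> candidates =
    DistributedVectorSearch.search(gigaSpace, Article.class, query);

for (Article candidate : candidates) {
    // The original matches itself, so skip it
    if (candidate.getId().equals(original.getId())) {
        continue;
    }
    // computeCosineSimilarity is an application-side helper, not an XAP API
    float similarity = computeCosineSimilarity(original.getEmbedding(),
                                              candidate.getEmbedding());
    if (similarity > 0.95f) {
        // Mark as potential duplicate
    }
}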

Frequently Asked Questions

Do I need to normalize vectors?

Cosine similarity depends only on the angle between vectors, not on their magnitudes, so XAP works with non-normalized vectors. It is still best practice to normalize them before indexing for better performance and consistency with embedding model outputs.
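If your embedding model does not already return unit-length vectors, you can normalize them in application code before writing. A minimal sketch of L2 normalization, with illustrative names that are not part of the XAP API:

```java
final class NormalizeExample {

    /** Returns a copy of v scaled to unit (L2) length. */
    static float[] l2Normalize(float[] v) {
        double sumOfSquares = 0.0;
        for (float x : v) {
            sumOfSquares += (double) x * x;
        }
        if (sumOfSquares == 0.0) {
            throw new IllegalArgumentException("Cannot normalize a zero vector");
        }
        float inverseNorm = (float) (1.0 / Math.sqrt(sumOfSquares));
        float[] normalized = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            normalized[i] = v[i] * inverseNorm;
        }
        return normalized;
    }

    public static void main(String[] args) {
        float[] unit = l2Normalize(new float[] {3f, 4f});
        System.out.println(unit[0] + ", " + unit[1]);
    }
}
```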

What happens if I index vectors with wrong dimensions?

XAP validates dimensions at index time. If you try to index a vector with 768 dimensions into a field declared as dimension=384, you will get an IllegalArgumentException. Ensure your embedding model output matches the annotated dimension.
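To catch a mismatch earlier, with a clearer message, you can also check the vector length in application code before writing. This is an optional client-side guard, sketched with hypothetical names; it does not replace XAP's own validation.

```java
final class DimensionGuardExample {

    /** Must match the value used in @SpaceVectorIndex(dimension = ...). */
    static final int EXPECTED_DIMENSION = 384;

    /** Returns the embedding unchanged, or throws if its length is not the expected one. */
    static float[] requireDimension(float[] embedding, int expected) {
        if (embedding == null || embedding.length != expected) {
            int actual = (embedding == null) ? 0 : embedding.length;
            throw new IllegalArgumentException(
                "Expected " + expected + " dimensions but got " + actual);
        }
        return embedding;
    }

    public static void main(String[] args) {
        requireDimension(new float[384], EXPECTED_DIMENSION);     // passes
        try {
            requireDimension(new float[768], EXPECTED_DIMENSION); // wrong model output
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```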

Can I search multiple vector fields?

XAP currently supports one vector field per query. For multiple semantic dimensions, either combine embeddings into a single vector or execute separate queries.
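One possible way to combine embeddings is simple concatenation, sketched below with illustrative names. This assumes the annotated dimension is set to the sum of the two lengths and that queries are concatenated the same way; whether the resulting ranking is useful depends on your models and data.

```java
final class ConcatEmbeddingsExample {

    /** Returns a new array holding all values of first followed by all values of second. */
    static float[] concat(float[] first, float[] second) {
        float[] combined = new float[first.length + second.length];
        System.arraycopy(first, 0, combined, 0, first.length);
        System.arraycopy(second, 0, combined, first.length, second.length);
        return combined;
    }

    public static void main(String[] args) {
        float[] titleEmbedding = new float[384];   // placeholder values
        float[] bodyEmbedding = new float[384];    // placeholder values
        float[] combined = concat(titleEmbedding, bodyEmbedding);
        System.out.println(combined.length);       // 768
    }
}
```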

How do I update vectors?

Vectors are indexed automatically on write operations. Simply write a new version of the object with an updated vector:

Article updated = gigaSpace.readById(Article.class, "article-1");
updated.setEmbedding(newVector);
gigaSpace.write(updated);  // Vector index updated automatically

What about memory usage?

Vectors are stored in Lucene indexes on disk (with memory mapping), not in XAP's heap. One million 384-dimensional vectors require roughly 1.5 GB on disk, with the working set in memory determined by the OS page cache.
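The disk estimate follows from 4 bytes per float component. A rough sizing sketch with illustrative names; it counts raw vector data only and ignores any additional index overhead, such as the HNSW graph:

```java
final class IndexSizeEstimateExample {

    /** Raw vector storage in bytes: entries x dimension x 4 bytes per float. */
    static long rawVectorBytes(long entries, int dimension) {
        return entries * dimension * Float.BYTES;
    }

    public static void main(String[] args) {
        long bytes = rawVectorBytes(1_000_000L, 384);
        // 1,536,000,000 bytes, roughly 1.5 GB
        System.out.printf("%d bytes (%.2f GB)%n", bytes, bytes / 1e9);
    }
}
```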

Can I combine vector search with transactions?

Vector indexes are updated atomically with the main data. Reading from a vector index respects space snapshots, ensuring consistent results.

What is the difference between VectorQuery and DistributedVectorSearch?

  • VectorQuery — Low-level building block; searches the local partition's index.
  • DistributedVectorSearch — High-level API; executes on all partitions, merges results, and returns global top-k.

Use DistributedVectorSearch for partitioned spaces.

Can I filter results after vector search?

Yes, you can filter the returned results in application code. For efficient pre-filtering, however, use SQL predicates in the query itself:

"category = 'tech' AND publishDate > ? AND embedding vector:match ?"

Filtering in the query is more efficient than fetching all results and filtering in application code.