Indexing
When a Space is looking for a match for a read or take operation, it iterates over non-null values in the template, looking for matches in the Space. This process can be time consuming, especially when there are many potential matches. To improve performance, it is possible to index one or more properties. The Space maintains additional data for indexed properties, which shortens the time required to determine a match, thus improving performance.
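The effect of an index can be pictured with a small, self-contained Java sketch. This is a toy model and not the GigaSpaces implementation: the class IndexedLookupSketch, the Person entry, and the city property are hypothetical names chosen for illustration. The sketch keeps an extra map for the indexed property, so a lookup goes directly to the matching entries instead of examining every entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names); it does not reflect GigaSpaces internals.
final class IndexedLookupSketch {

    // A minimal entry with two properties.
    static final class Person {
        final String name;
        final String city;

        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    private final List<Person> entries = new ArrayList<>();

    // The "additional data" kept for the indexed property: city -> entries.
    private final Map<String, List<Person>> cityIndex = new HashMap<>();

    void write(Person p) {
        entries.add(p);
        // Each write also maintains the index, which is the extra write cost.
        cityIndex.computeIfAbsent(p.city, k -> new ArrayList<>()).add(p);
    }

    // Without an index: every entry is examined.
    List<Person> readByCityScan(String city) {
        List<Person> result = new ArrayList<>();
        for (Person p : entries) {
            if (p.city.equals(city)) {
                result.add(p);
            }
        }
        return result;
    }

    // With an index: a single map lookup returns the matching entries.
    List<Person> readByCityIndexed(String city) {
        List<Person> bucket = cityIndex.get(city);
        return bucket != null ? bucket : Collections.<Person>emptyList();
    }
}
```

Both read methods return the same entries; the indexed variant avoids the full iteration at the price of the extra map that every write has to update.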
Choosing which Properties to Index
One might wonder why every property in every class is not indexed by default. The reason is that indexing has its downsides as well:
- An indexed property can speed up read/take operations, but might also slow down write/update operations.
- An indexed property consumes more resources, specifically a larger memory footprint per entry.
Optimizing Index Selection for Multiple Condition Queries
In-memory query execution will use a single index to go over the data, and will then filter the results by all the other conditions.
- If there are several equal indexes available, the one with the smallest size will be used.
- In the absence of an equal index, the ordered index will be used.
Since the size of an ordered index cannot be determined beforehand, its size will appear as unknown in the Explain Plan.
A compound index is counted as one index and will be used when all its components appear in the condition.
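The selection rules above can be expressed as a short Java sketch. It is an illustration under stated assumptions, not the product's query optimizer: IndexChoiceSketch, Candidate, and Kind are hypothetical names, and each candidate is assumed to already be usable for the query (a compound index would be passed in as a single candidate only when all its components appear in the condition).

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of the selection rules; not the GigaSpaces optimizer.
final class IndexChoiceSketch {

    enum Kind { EQUAL, ORDERED }

    // One index that is usable for the current query.
    static final class Candidate {
        final String property;
        final Kind kind;
        final int size; // only meaningful for EQUAL; ORDERED size is unknown

        Candidate(String property, Kind kind, int size) {
            this.property = property;
            this.kind = kind;
            this.size = size;
        }
    }

    // Picks the single index used to traverse the data.
    static Optional<Candidate> choose(List<Candidate> usable) {
        Candidate best = null;
        // Rule 1: among equal indexes, the smallest one wins.
        for (Candidate c : usable) {
            if (c.kind == Kind.EQUAL && (best == null || c.size < best.size)) {
                best = c;
            }
        }
        if (best != null) {
            return Optional.of(best);
        }
        // Rule 2: with no equal index, fall back to an ordered index.
        for (Candidate c : usable) {
            if (c.kind == Kind.ORDERED) {
                return Optional.of(c);
            }
        }
        // No usable index: the caller would have to scan all entries.
        return Optional.empty();
    }
}
```

The remaining conditions of the query are then applied as filters on whatever the chosen index returns.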
When to Use Indexing
A natural question is when to use indexing. It is usually recommended to index properties that are used in common queries. However, in some scenarios a smaller memory footprint, or faster performance for a specific query, matters more, and adding or removing an index should be considered.
Keep in mind that "premature optimization is the root of all evil". It is always recommended to benchmark your code to get better results.
Dynamic Indexing
Indexes can be added dynamically during runtime with the GigaSpaceTypeManager interface. This doesn't lock any of the CRUD operations, so system performance is not affected.
Refer to Data Type Metadata for more information.
Performance Tips
Properties that are not indexed and not used for queries can be grouped within a user-defined class (also known as payload class). This improves the read/write performance because these properties aren't introduced to the Space class model.
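As a rough illustration of the payload-class idea, the following Java sketch keeps the queried properties at the top level and groups the rest in a single nested object. Order, OrderDetails, and all property names are hypothetical, and Space-specific annotations are omitted so that the sketch compiles without the GigaSpaces libraries.

```java
import java.io.Serializable;

// Hypothetical payload class: properties that are never queried or indexed.
final class OrderDetails implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String giftMessage;
    private final String deliveryNotes;

    OrderDetails(String giftMessage, String deliveryNotes) {
        this.giftMessage = giftMessage;
        this.deliveryNotes = deliveryNotes;
    }

    String getGiftMessage() { return giftMessage; }
    String getDeliveryNotes() { return deliveryNotes; }
}

// Hypothetical Space class: only queried properties are top-level.
final class Order {
    private Integer id;
    private String customerId;    // used in queries, so a candidate for indexing
    private OrderDetails details; // carried as one property, never matched on

    Order() { }

    Integer getId() { return id; }
    void setId(Integer id) { this.id = id; }

    String getCustomerId() { return customerId; }
    void setCustomerId(String customerId) { this.customerId = customerId; }

    OrderDetails getDetails() { return details; }
    void setDetails(OrderDetails details) { this.details = details; }
}
```

With this layout the Space class model contains three properties instead of four, and the details object travels with the entry as a single value.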
Deprecated Indexing Options
Implicit Indexing
If no properties are indexed explicitly, the Space implicitly indexes the first n properties (in alphabetical order), where n is determined by the number-implicit-indexes property in the Space schema.
Using this feature is not recommended, because adding/removing properties can have unexpected side effects. It is deprecated, and might be removed in future versions.
Query Execution Flow
When a read, take, readMultiple, or takeMultiple call is performed, a template is used to locate matching Space objects. The template may have multiple fields - some may include values and some may not (i.e. null field values acting as wildcards). The fields that do not include values are ignored during the matching process. In addition, some class fields may be indexed and some may not be indexed.
When multiple class fields are indexed, the Space checks each indexed field's index, using the corresponding template field value as the index key, and picks the index entry that holds the smallest number of matching Space objects.
The smallest set of Space objects is the list of objects to perform the matching against (matching candidates). After the candidate Space object list has been constructed, it is scanned to locate Space objects that fully match the given template - i.e. all non-null template fields match the corresponding Space object fields.
Class fields that are not indexed are not used to construct the candidate list.
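The flow described above can be sketched in plain Java. This is a simplified model under stated assumptions, not the actual Space engine: TemplateMatchSketch is a hypothetical class, entries and templates are represented as maps from field name to value, and falling back to all entries when no indexed field is set in the template is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of template matching with indexes.
final class TemplateMatchSketch {
    private final List<Map<String, Object>> entries = new ArrayList<>();

    // field name -> (field value -> entries holding that value)
    private final Map<String, Map<Object, List<Map<String, Object>>>> indexes = new HashMap<>();

    TemplateMatchSketch(String... indexedFields) {
        for (String field : indexedFields) {
            indexes.put(field, new HashMap<Object, List<Map<String, Object>>>());
        }
    }

    void write(Map<String, Object> entry) {
        entries.add(entry);
        for (Map.Entry<String, Map<Object, List<Map<String, Object>>>> idx : indexes.entrySet()) {
            Object value = entry.get(idx.getKey());
            if (value != null) {
                idx.getValue().computeIfAbsent(value, k -> new ArrayList<>()).add(entry);
            }
        }
    }

    // Builds the candidate list from the smallest matching index bucket.
    List<Map<String, Object>> candidates(Map<String, Object> template) {
        List<Map<String, Object>> best = null;
        for (Map.Entry<String, Object> field : template.entrySet()) {
            if (field.getValue() == null) {
                continue; // null acts as a wildcard and is ignored
            }
            Map<Object, List<Map<String, Object>>> index = indexes.get(field.getKey());
            if (index == null) {
                continue; // non-indexed fields do not build the candidate list
            }
            List<Map<String, Object>> bucket = index.get(field.getValue());
            if (bucket == null) {
                bucket = new ArrayList<>(); // no entry holds this value
            }
            if (best == null || bucket.size() < best.size()) {
                best = bucket;
            }
        }
        // Assumption of this sketch: with no usable index, scan everything.
        return best != null ? best : entries;
    }

    // Scans the candidates and keeps those matching all non-null fields.
    List<Map<String, Object>> readMultiple(Map<String, Object> template) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> candidate : candidates(template)) {
            boolean matches = true;
            for (Map.Entry<String, Object> field : template.entrySet()) {
                Object wanted = field.getValue();
                if (wanted != null && !wanted.equals(candidate.get(field.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Non-indexed fields still take part in the final scan; they only play no role in choosing the candidate list.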
Limitations
Indexing doesn't work in combination with SQL functions. If you apply a SQL function on an indexed field, the query will disregard the indexing (the results will return at the same speed as querying an unindexed field).