When a Space looks for a match for a read or take operation, it iterates over the non-null values in the template, looking for matches in the Space. This process can be time-consuming, especially when there are many potential matches. To improve performance, it is possible to index one or more properties. The Space maintains additional data for indexed properties, which shortens the time required to determine a match.
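The difference between scanning and using an index can be sketched in plain Java. This is only an illustration of the general idea, not the Space's internal implementation: the `Person` class, the method names, and the use of a `HashMap` as the "additional data" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class IndexLookupSketch {
    // Hypothetical entry type used only for this illustration.
    static final class Person {
        final String name;
        final String city;

        Person(String name, String city) {
            this.name = name;
            this.city = city;
        }
    }

    // Without an index: every stored entry is compared against the requested value.
    static List<Person> scanByCity(List<Person> entries, String city) {
        List<Person> result = new ArrayList<>();
        for (Person p : entries) {
            if (Objects.equals(p.city, city)) {
                result.add(p);
            }
        }
        return result;
    }

    // With an index: extra data (value -> entries) is built and kept alongside the entries.
    static Map<String, List<Person>> buildCityIndex(List<Person> entries) {
        Map<String, List<Person>> index = new HashMap<>();
        for (Person p : entries) {
            index.computeIfAbsent(p.city, k -> new ArrayList<>()).add(p);
        }
        return index;
    }

    // A lookup then touches only the entries stored under the requested value.
    static List<Person> lookupByCity(Map<String, List<Person>> index, String city) {
        return index.getOrDefault(city, List.of());
    }

    public static void main(String[] args) {
        List<Person> entries = List.of(
                new Person("Ann", "Paris"),
                new Person("Bob", "London"),
                new Person("Cid", "Paris"));
        Map<String, List<Person>> index = buildCityIndex(entries);
        System.out.println(scanByCity(entries, "Paris").size());  // prints 2
        System.out.println(lookupByCity(index, "Paris").size());  // prints 2
    }
}
```

Both methods return the same entries; the indexed lookup simply avoids visiting entries whose value cannot match.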
Choosing which Properties to Index
One might wonder why properties are not always indexed, or why all the properties in all the classes are not always indexed. The reason is that indexing has its downsides as well:
- An indexed property can speed up read/take operations, but might also slow down write/update operations.
- An indexed property consumes more resources, specifically a larger memory footprint per entry.
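The two costs listed above can be made concrete with a toy store written in plain Java. It is a sketch under stated assumptions rather than a model of the real Space: entries are plain maps, each index is a `HashMap`, and the class name `IndexCostSketch` and its counter are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexCostSketch {
    private final List<String> indexedProperties;
    private final List<Map<String, Object>> entries = new ArrayList<>();
    // One map per indexed property: property value -> entries holding that value.
    private final Map<String, Map<Object, List<Map<String, Object>>>> indexes = new HashMap<>();
    // Counts index maintenance steps; each step also stores one extra reference in memory.
    private long indexUpdates = 0;

    IndexCostSketch(List<String> indexedProperties) {
        this.indexedProperties = indexedProperties;
        for (String property : indexedProperties) {
            indexes.put(property, new HashMap<>());
        }
    }

    // Every write stores the entry and then updates each index in turn.
    void write(Map<String, Object> entry) {
        entries.add(entry);
        for (String property : indexedProperties) {
            indexes.get(property)
                   .computeIfAbsent(entry.get(property), k -> new ArrayList<>())
                   .add(entry);
            indexUpdates++;
        }
    }

    long indexUpdates() {
        return indexUpdates;
    }

    public static void main(String[] args) {
        IndexCostSketch indexed = new IndexCostSketch(List.of("city", "status"));
        IndexCostSketch plain = new IndexCostSketch(List.of());
        for (int i = 0; i < 3; i++) {
            Map<String, Object> entry = Map.of("city", "Paris", "status", "active", "seq", i);
            indexed.write(entry);
            plain.write(entry);
        }
        System.out.println(indexed.indexUpdates()); // prints 6 (3 writes x 2 indexes)
        System.out.println(plain.indexUpdates());   // prints 0
    }
}
```

In this sketch the extra work and the extra references grow with the number of indexed properties, which is why indexing every property is rarely a good default.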
When to Use Indexing
As a rule, index the properties that are used in common queries. In some scenarios, however, a smaller memory footprint or faster performance for a specific query matters more, and adding or removing an index should be considered.
Keep in mind that “premature optimization is the root of all evil”. Benchmark your code to verify that an index actually improves results.
Indexes can be added dynamically at runtime with the GigaSpaceTypeManager interface. This doesn’t lock any of the CRUD operations, so system performance is not affected.
Refer to Data Type Metadata for more information.
Properties that are not indexed and not used for queries can be grouped within a user-defined class (also known as payload class). This improves the read/write performance because these properties aren’t introduced to the Space class model.
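A payload class might look like the following sketch. The `Order` and `OrderDetails` classes and their properties are hypothetical, and making the payload `Serializable` is an assumption. The annotations that a real Space class would carry (for the ID and for indexes) are omitted so that the example compiles without the product libraries.

```java
import java.io.Serializable;

public class OrderSketch {
    // Hypothetical payload class: groups properties that are never queried or indexed.
    public static class OrderDetails implements Serializable {
        private static final long serialVersionUID = 1L;
        private String notes;
        private String giftMessage;

        public OrderDetails() {
        }

        public OrderDetails(String notes, String giftMessage) {
            this.notes = notes;
            this.giftMessage = giftMessage;
        }

        public String getNotes() { return notes; }
        public String getGiftMessage() { return giftMessage; }
    }

    // Hypothetical Space class: only the properties used in queries stay at the top level.
    public static class Order {
        private String id;
        private String customerId;    // used in queries, so a candidate for an index
        private OrderDetails details; // a single property, whatever it contains

        public Order() {
        }

        public Order(String id, String customerId, OrderDetails details) {
            this.id = id;
            this.customerId = customerId;
            this.details = details;
        }

        public String getId() { return id; }
        public String getCustomerId() { return customerId; }
        public OrderDetails getDetails() { return details; }
    }

    public static void main(String[] args) {
        Order order = new Order("o-1", "c-42", new OrderDetails("leave at door", "Happy birthday"));
        System.out.println(order.getDetails().getNotes()); // prints leave at door
    }
}
```

With this layout the class model contains three properties instead of four, and adding more fields to `OrderDetails` later does not add properties to `Order`.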
Deprecated Indexing Options
If no properties are indexed explicitly, the Space implicitly indexes the first n properties (in alphabetical order), where n is determined by the number-implicit-indexes property in the Space schema.
Using this feature is not recommended, because adding/removing properties can have unexpected side effects. It is deprecated, and might be removed in future versions.
Query Execution Flow
When a read or take operation (such as a takeMultiple call) is performed, a template is used to locate matching Space objects. The template may have multiple fields: some may hold values and some may not (i.e. null field values acting as wildcards). Fields that do not hold values are ignored during the matching process. In addition, some class fields may be indexed and some may not.
When multiple class fields are indexed, the Space looks for the field value index that contains the smallest number of matching Space objects, using the corresponding template field value as the index key.
The smallest set of Space objects is the list of objects to perform the matching against (matching candidates). After the candidate Space object list has been constructed, it is scanned to locate Space objects that fully match the given template - i.e. all non-null template fields match the corresponding Space object fields.
Class fields that are not indexed are not used to construct the candidate list.
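The flow described above can be approximated with a short plain-Java sketch. It is not the Space's actual implementation: entries and templates are modeled as maps, the class and method names are hypothetical, and falling back to a scan of all entries when no indexed field has a template value is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CandidateSelectionSketch {
    private final Set<String> indexedFields;
    private final List<Map<String, Object>> entries = new ArrayList<>();
    // One index per indexed field: field value -> entries holding that value.
    private final Map<String, Map<Object, List<Map<String, Object>>>> indexes = new HashMap<>();

    CandidateSelectionSketch(Set<String> indexedFields) {
        this.indexedFields = indexedFields;
        for (String field : indexedFields) {
            indexes.put(field, new HashMap<>());
        }
    }

    void write(Map<String, Object> entry) {
        entries.add(entry);
        for (String field : indexedFields) {
            Object value = entry.get(field);
            if (value != null) {
                indexes.get(field).computeIfAbsent(value, k -> new ArrayList<>()).add(entry);
            }
        }
    }

    // Step 1: among indexed fields with a template value, pick the smallest index bucket.
    List<Map<String, Object>> candidates(Map<String, Object> template) {
        List<Map<String, Object>> smallest = null;
        for (String field : indexedFields) {
            Object value = template.get(field);
            if (value == null) {
                continue; // null (or absent) template field acts as a wildcard
            }
            List<Map<String, Object>> bucket =
                    indexes.get(field).getOrDefault(value, Collections.emptyList());
            if (smallest == null || bucket.size() < smallest.size()) {
                smallest = bucket;
            }
        }
        // Assumption of this sketch: with no usable index, every entry is a candidate.
        return smallest != null ? smallest : entries;
    }

    // Step 2: scan the candidates and keep those that match all non-null template fields.
    List<Map<String, Object>> readMultiple(Map<String, Object> template) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> entry : candidates(template)) {
            if (matches(entry, template)) {
                result.add(entry);
            }
        }
        return result;
    }

    static boolean matches(Map<String, Object> entry, Map<String, Object> template) {
        for (Map.Entry<String, Object> field : template.entrySet()) {
            Object wanted = field.getValue();
            if (wanted != null && !wanted.equals(entry.get(field.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CandidateSelectionSketch space = new CandidateSelectionSketch(Set.of("city", "status"));
        space.write(Map.of("name", "Ann", "city", "Paris", "status", "active"));
        space.write(Map.of("name", "Bob", "city", "Paris", "status", "inactive"));
        space.write(Map.of("name", "Cid", "city", "Paris", "status", "active"));
        space.write(Map.of("name", "Dee", "city", "Rome", "status", "inactive"));

        Map<String, Object> template = Map.of("city", "Paris", "status", "inactive");
        System.out.println(space.candidates(template).size());   // prints 2 (status bucket)
        System.out.println(space.readMultiple(template).size()); // prints 1 (Bob)
    }
}
```

In the usage example the city index holds three entries for "Paris" and the status index holds two for "inactive", so the status bucket becomes the candidate list. A template that sets only the non-indexed name field gets no help from the indexes, and in this sketch all four entries are scanned.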