GigaSpaces Data Modeling
What Kinds of Data Stores are Available in GigaSpaces?
POJOs are available, but Smart DIH uses only Space Documents by default.
A Data Store object in GigaSpaces can be defined as either a Space Object or a Space Document.
Within each object type, the data schema can be defined with:
- Fixed or static properties (columns), sometimes referred to as schema-on-write,
- Dynamic properties, sometimes referred to as schema-on-read, or
- Hybrid, a combination of both fixed and dynamic properties.
The choice of object type and schema definition can have a profound impact on an application's memory footprint and processing speed.
Deciding on the Type of Data Store
The choice of Space Object or Space Document depends on the planned use and variability of the data in the object.
A Space Object is generally used when most of the data in the data store has the same format and optimum read/write speed is required. A Space Object generally has a simpler schema that is streamlined for the fastest possible access.
A Space Document is appropriate when different object instances (rows) within the data store can have varying types of data. A Space Document provides maximum flexibility, at the cost of some additional data access time.
Deciding on the Type of Schema
The type of data schema is a fundamental property of the Data Store and affects both the speed of data access and the flexibility of the data layout.
A schema definition can be changed. See Data Types, Schema Types and Schema Evolution for a discussion of the various options to change a schema.
Fixed Schema
As the name implies, a fixed schema has fields with compile-time data definitions. Fixed schema data objects are also referred to as schema-on-write objects. Although the data definitions themselves may offer some flexibility (such as a variable-length string), the overall schema is fixed when the object is created.
A fixed schema may be appropriate when the data in each instance of an object (like each row in a table) will have the same layout.
A fixed schema provides the fastest possible data access and the smallest memory footprint.
Dynamic Schema
A dynamic schema allows the greatest amount of flexibility for field definitions. Dynamic schema data objects are also referred to as schema-on-read objects.
Each object instance (row) of data with a dynamic schema can have properties (columns) with different field definitions.
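The difference between the two schema types can be sketched in plain Java. This is only an illustrative model, not GigaSpaces API code: the class names `FixedTrade` and `DynamicEntry` and their properties are hypothetical. The fixed type declares its fields at compile time, while the dynamic type stores per-instance properties in a map and checks the type only when a value is read.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Fixed schema (hypothetical type): every instance has the same typed fields,
// and the compiler checks each access.
final class FixedTrade {
    final long id;
    final String symbol;
    final double price;

    FixedTrade(long id, String symbol, double price) {
        this.id = id;
        this.symbol = symbol;
        this.price = price;
    }
}

// Dynamic schema (hypothetical type): each instance carries its own
// property names and value types.
final class DynamicEntry {
    final String typeName;
    final Map<String, Object> properties = new LinkedHashMap<>();

    DynamicEntry(String typeName) {
        this.typeName = typeName;
    }

    DynamicEntry set(String name, Object value) {
        properties.put(name, value);
        return this;
    }

    // Schema on read: the caller states the expected type when reading.
    // Returns null if the property is missing or has a different type.
    <T> T get(String name, Class<T> expected) {
        Object value = properties.get(name);
        return expected.isInstance(value) ? expected.cast(value) : null;
    }
}

class SchemaSketch {
    public static void main(String[] args) {
        FixedTrade fixed = new FixedTrade(1L, "ACME", 10.5);
        // Two instances of the same type with different property sets.
        DynamicEntry a = new DynamicEntry("Trade").set("symbol", "ACME").set("price", 10.5);
        DynamicEntry b = new DynamicEntry("Trade").set("symbol", "INIT").set("venue", "XNAS");

        System.out.println(fixed.symbol);                 // field access, checked at compile time
        System.out.println(a.get("price", Double.class)); // present in this instance
        System.out.println(b.get("price", Double.class)); // absent in this instance, so null
    }
}
```

The map lookup and runtime type check in `get` hint at why dynamic properties usually cost more per access than fixed fields.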
Hybrid Schema
A hybrid schema, as the name implies, is either a Space Object with some dynamic properties (for example, JSON-based) or a Space Document with some fixed properties. In both cases, the object has a mix of fixed and dynamic properties.
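As a rough illustration of the hybrid idea, the following plain-Java sketch combines two fixed properties with a map of dynamic ones. It does not use the GigaSpaces API: the `HybridProduct` type, its property names, and the rule that dynamic properties may not shadow fixed ones are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical hybrid type: two fixed properties plus a bag of dynamic ones.
final class HybridProduct {
    private final String sku;   // fixed, present in every instance
    private final double price; // fixed, present in every instance
    private final Map<String, Object> dynamicProperties = new LinkedHashMap<>();

    HybridProduct(String sku, double price) {
        this.sku = sku;
        this.price = price;
    }

    String getSku() { return sku; }
    double getPrice() { return price; }

    // Dynamic properties may differ per instance.
    // Assumption for this sketch: they must not shadow the fixed ones.
    HybridProduct setDynamic(String name, Object value) {
        if (name.equals("sku") || name.equals("price")) {
            throw new IllegalArgumentException(name + " is a fixed property");
        }
        dynamicProperties.put(name, value);
        return this;
    }

    // Uniform read path: fixed properties first, then the dynamic bag.
    Object getProperty(String name) {
        switch (name) {
            case "sku":   return sku;
            case "price": return price;
            default:      return dynamicProperties.get(name);
        }
    }
}

class HybridSketch {
    public static void main(String[] args) {
        HybridProduct shirt = new HybridProduct("SKU-1", 19.99).setDynamic("size", "M");
        HybridProduct book = new HybridProduct("SKU-2", 7.50).setDynamic("pages", 320);

        System.out.println(shirt.getProperty("sku") + " " + shirt.getProperty("size"));
        System.out.println(book.getProperty("sku") + " " + book.getProperty("pages"));
    }
}
```

Properties that every instance shares stay in typed fields, and only the variable part pays the cost of a map lookup.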
GigaSpaces data modeling provides the following features for objects that interact with the Space:
- Space Object ID - When a new object is inserted into the Space, it embeds a unique ID (called the UID). The UID can be generated explicitly by the client, from a unique value produced by the application business logic, or by a sequencer running within the Space.
- Annotation-based metadata - The GigaSpaces API supports class and property decorations with POJOs. These can be specified via annotations on the Space class source itself. You can define common behavior for all class instances, and specific behavior for class fields.
- XML-based metadata - Class and property decorations for POJOs can be specified in an external XML file that accompanies the class bytecode files in the JAR/WAR. You can define common behavior for all class instances, and specific behavior for class fields.
- Storage type - To reduce the memory footprint of the objects stored in a Space, different storage types can be defined for individual properties of a Space class. Each property can be assigned a storage type decoration, which determines how that property is serialized and stored in the Space.
- Custom Serialization - You can control the serialization of embedded properties.
- Routing property - A partitioned Space enables performing Space operations against multiple Spaces transparently from a single proxy. The primary goals of the partitioned Space are to provide unlimited in-memory Space storage size and to group objects into the same partition to speed up performance. Data is written into the partitioned Space, and query operations are routed based on the template data. To enable this, a routing property can be defined on the entry type.
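The routing idea in the last item can be sketched in plain Java. The text does not state the exact routing function GigaSpaces applies, so this sketch assumes a simple hash of the routing value modulo the partition count. The `Order` type, its `customerId` routing property, and the `partitionFor` helper are hypothetical names used only for illustration.

```java
import java.util.Objects;

// Hypothetical entry type; customerId plays the role of the routing property.
final class Order {
    final String orderId;
    final String customerId;

    Order(String orderId, String customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }
}

final class RoutingSketch {
    // Maps a routing value to a partition index in [0, partitions).
    // Assumption: hash modulo partition count. floorMod keeps the result
    // non-negative even when hashCode() is negative.
    static int partitionFor(Object routingValue, int partitions) {
        Objects.requireNonNull(routingValue, "routing value must not be null");
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        return Math.floorMod(routingValue.hashCode(), partitions);
    }

    public static void main(String[] args) {
        int partitions = 4;
        Order first = new Order("o-1", "cust-42");
        Order second = new Order("o-2", "cust-42");

        // Same routing value, so both orders land in the same partition.
        System.out.println(partitionFor(first.customerId, partitions)
                == partitionFor(second.customerId, partitions)); // prints "true"
    }
}
```

Under this assumption, a query template that carries the same `customerId` can be sent to a single partition instead of being broadcast to all of them.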