Hot Deploy Utility

Download
Github link

This tool allows business logic running as a processing unit (PU) to be refreshed (rolling PU upgrade) without any system downtime or data loss. The tool uses the hot deploy approach: it places the new PU code in the PU deploy folder of the GSM (Grid Service Manager) and then restarts each PU instance.

PU Restart

To refresh the PU code, the tool restarts all instances of a given PU in these steps:

  1. Old deployment files for the specified PU are moved into a temporary folder to be used in case the upgrade fails.
  2. New PU files are copied to the deploy folder prior to the restart phase.
  3. The tool identifies all running PU instances and restarts them one by one, in either stateful or stateless mode.

Once the process is completed, both the primary and backup PU instances will run with the updated logic.
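
The first two steps (moving the old files aside and copying in the new ones) can be illustrated with a minimal sketch. It is not the tool's actual code: the function names `stage_new_pu` and `restore_pu` are hypothetical, and it assumes the PU is deployed as a directory named after the PU inside the deploy folder.

```python
import shutil
import tempfile
from pathlib import Path


def stage_new_pu(deploy_dir: Path, pu_name: str, new_pu: Path) -> Path:
    """Move the current PU deployment aside, then copy the new files in.

    Returns the temporary backup folder so a failed upgrade can be undone.
    """
    backup_root = Path(tempfile.mkdtemp(prefix=f"{pu_name}-backup-"))
    current = deploy_dir / pu_name
    if current.exists():
        # Step 1: keep the old deployment in case the upgrade fails.
        shutil.move(str(current), str(backup_root / pu_name))
    # Step 2: place the new PU files before any instance is restarted.
    shutil.copytree(new_pu, current)
    return backup_root


def restore_pu(deploy_dir: Path, pu_name: str, backup_root: Path) -> None:
    """Put the old deployment back (hypothetical helper for a failed upgrade)."""
    current = deploy_dir / pu_name
    saved = backup_root / pu_name
    if not saved.exists():
        return  # nothing was deployed before, so there is nothing to restore
    if current.exists():
        shutil.rmtree(current)
    shutil.move(str(saved), str(current))
```

Keeping the old files in a separate temporary folder means the restart phase only ever sees one version of the PU in the deploy folder.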

 Stateful Restart

The tool performs these steps for a stateful restart:

  1. Discovers all PU instances and identifies their Space mode.
  2. Restarts all backups (each instance in a separate thread).
  3. Restarts all primaries. If the double_restart option is enabled, primaries are restarted twice, one by one, to return to the original state. Without this option, primary partitions are restarted once (each instance in a separate thread). Use double_restart if all instances should end up in their "original" VM.
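
The ordering above can be sketched with a toy model. This is only an illustration under stated assumptions, not the tool's implementation: `Cluster`, `Instance`, and `stateful_restart` are hypothetical names, restarting a primary is assumed to promote its backup, and the second restart under double_restart is assumed to target whichever instance is primary at that point.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Instance:
    partition: int
    vm: str    # container the instance runs in
    mode: str  # "primary" or "backup"


class Cluster:
    """Toy model: restarting a primary promotes its backup (assumed failover)."""

    def __init__(self, partitions: int):
        self.lock = threading.Lock()
        self.log = []  # (partition, vm) for every restart performed
        self.instances = []
        for p in range(1, partitions + 1):
            self.instances.append(Instance(p, f"vm-{p}a", "primary"))
            self.instances.append(Instance(p, f"vm-{p}b", "backup"))

    def restart(self, inst: Instance) -> None:
        with self.lock:
            if inst.mode == "primary":
                peer = next(i for i in self.instances
                            if i.partition == inst.partition and i is not inst)
                peer.mode, inst.mode = "primary", "backup"
            self.log.append((inst.partition, inst.vm))


def by_mode(cluster: Cluster, mode: str) -> list:
    return [i for i in cluster.instances if i.mode == mode]


def stateful_restart(cluster: Cluster, double_restart: bool = False) -> None:
    # Backups first, each in its own thread.
    with ThreadPoolExecutor() as pool:
        list(pool.map(cluster.restart, by_mode(cluster, "backup")))
    if double_restart:
        # Two sequential passes over the current primaries, so the
        # primary role returns to the VM it started in.
        for _ in range(2):
            for inst in by_mode(cluster, "primary"):
                cluster.restart(inst)
    else:
        # Single pass, each primary in its own thread.
        with ThreadPoolExecutor() as pool:
            list(pool.map(cluster.restart, by_mode(cluster, "primary")))
```

In this model, a single pass leaves the primary role on the former backup's VM, while the double pass moves it back to the VM it started in.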

Stateless Restart

For a stateless restart, the tool discovers all PU instances and restarts them (each instance in a separate thread).

 Build

To build the tool:

  1. Download the source files (xap-hot-deploy-master folder) from the repository. They can be downloaded to any location on your machine.
  2. Enter the command:
mvn clean install

Note that tests are skipped in this case. To build with tests, see the Build with Tests section.

 Run Hot Deploy

To run hot deploy:

  1. Copy the new JAR (or WAR) files containing the new classes to the xap-hot-redeploy folder.
  2. Configure options in the xap-hot-deploy-master/config.properties file.
  3. Run the following script from the xap-hot-deploy-master folder.

 

 run
 run.sh

Runtime configuration

The following options can be configured:

| Option | Optional/required | Default value | Description |
|---|---|---|---|
| GSM_HOSTS | required | - | Hosts on which the GSMs are located. |
| PU | required | space=space.jar, web=web.war | Map of key-value pairs, where the key is the processing unit name and the value is the name of the file with the new classes. |
| SSH_USER | required | user | Name of the user on the remote machine. |
| GS_HOME_DIR | required | - | Path to the GigaSpaces directory. |
| LOOKUPLOCATORS | optional | localhost | Jini lookup service locators used for unicast discovery. |
| LOOKUPGROUPS | optional | GigaSpaces default lookup group | Jini lookup service group. |
| IDENT_PU_TIMEOUT | required | 60 | Timeout to identify the processing unit (in seconds). |
| IDENT_SPACE_MODE_TIMEOUT | required | 60 | Timeout to identify the Space mode (in seconds). |
| IDENT_INSTANCES_TIMEOUT | required | 60 | Timeout to identify instances (in seconds). |
| RESTART_TIMEOUT | required | 60 | Timeout for restarting the PU (in seconds). |
| IS_SECURED | optional | false | Set to "true" if the Space is secured. |
| DOUBLE_RESTART | optional | false | Set to "true" if all instances should be placed in their "original" VM after redeploy. When set to "true", primary instances are restarted twice. |
| LOCAL_CLUSTER_MODE | optional | false | Set to "true" for local cluster mode (testing mode). |
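
As an illustration of how these options fit together, here is a minimal sketch that reads a properties-style text, applies the default values listed above, and rejects a configuration that lacks the options without a default. The `KEY=VALUE` line syntax, the comma-separated form of GSM_HOSTS and PU, and the `load_config` function are assumptions made for this sketch; check the config.properties file shipped with the tool for the actual format.

```python
# Defaults taken from the options list; the file syntax itself is assumed.
DEFAULTS = {
    "PU": "space=space.jar, web=web.war",
    "SSH_USER": "user",
    "LOOKUPLOCATORS": "localhost",
    "IDENT_PU_TIMEOUT": "60",
    "IDENT_SPACE_MODE_TIMEOUT": "60",
    "IDENT_INSTANCES_TIMEOUT": "60",
    "RESTART_TIMEOUT": "60",
    "IS_SECURED": "false",
    "DOUBLE_RESTART": "false",
    "LOCAL_CLUSTER_MODE": "false",
}
NO_DEFAULT = ("GSM_HOSTS", "GS_HOME_DIR")  # required, no default value
TIMEOUTS = ("IDENT_PU_TIMEOUT", "IDENT_SPACE_MODE_TIMEOUT",
            "IDENT_INSTANCES_TIMEOUT", "RESTART_TIMEOUT")
FLAGS = ("IS_SECURED", "DOUBLE_RESTART", "LOCAL_CLUSTER_MODE")


def load_config(text: str) -> dict:
    """Parse KEY=VALUE lines (assumed syntax) into typed option values."""
    values = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, because the PU value contains "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()

    missing = [k for k in NO_DEFAULT if not values.get(k)]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))

    config = dict(values)
    config["GSM_HOSTS"] = [h.strip() for h in values["GSM_HOSTS"].split(",")]
    config["PU"] = {
        name.strip(): file.strip()
        for name, file in (item.split("=", 1) for item in values["PU"].split(","))
    }
    for key in TIMEOUTS:
        config[key] = int(values[key])
    for key in FLAGS:
        config[key] = values[key].lower() == "true"
    return config
```

Failing fast on a missing GSM_HOSTS or GS_HOME_DIR avoids starting a redeploy that cannot locate the deploy folder.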

Results

If the hot redeploy runs successfully, a success message and details of the restarted PU instances are displayed, as in this sample:

14:51:44,392  INFO main ConfigInitializer:init:28 - Gigaspaces location: /home/user/gigaspaces-xap-premium-10.0.0-ga
14:51:44,393  INFO main ConfigInitializer:init:29 - Pu to restart: [space, cinema, mirror]
14:51:44,393  INFO main ConfigInitializer:init:30 - Locator: null
14:51:44,393  INFO main ConfigInitializer:init:31 - Lookup group: null
14:51:44,394  INFO main ConfigInitializer:init:32 - Timeout for identify pu: 60
14:51:44,394  INFO main ConfigInitializer:init:33 - Timeout for identify instances: 60
14:51:44,394  INFO main ConfigInitializer:init:34 - Timeout for identify space mode: 60
14:51:44,395  INFO main ConfigInitializer:init:35 - Timeout for restart 60
14:51:44,395  INFO main ConfigInitializer:init:36 - Secured: false
14:51:44,395  INFO main ConfigInitializer:init:37 - Double restart: false
14:51:44,395  INFO main ConfigInitializer:init:38 - GSM Hosts: [127.0.0.1]
14:51:44,395  INFO main ConfigInitializer:init:39 - User: user
14:51:44,395  INFO main ConfigInitializer:init:40 - Is local cluster: false
14:51:52,044  INFO main StatefulPuRestarter:restartAllInstances:105 - Restarting pu space with type STATEFUL
14:51:52,045  INFO pool-6-thread-1 PuInstanceRestarter:restartPUInstance:36 - restarting instance 1 on 127.0.0.1[127.0.0.1] GSCClosed Grid Service Container.
This provides an isolated runtime for one (or more) processing unit (PU) instance and exposes its state to the GSM. PID:9214 mode:backup...
14:52:05,085  INFO pool-6-thread-1 PuInstanceRestarter:restartPUInstance:43 - done
14:52:06,233  INFO pool-7-thread-1 PuInstanceRestarter:restartPUInstance:36 - restarting instance 1 on 127.0.0.1[127.0.0.1] GSC PID:9213 mode:primary...
14:52:21,367  INFO pool-7-thread-1 PuInstanceRestarter:restartPUInstance:43 - done
14:52:22,433  INFO main StatelessPuRestarter:restart:23 - Restarting pu cinema with type WEB
14:52:31,107  INFO main StatelessPuRestarter:restart:25 - done
14:52:32,116  INFO main StatelessPuRestarter:restart:23 - Restarting pu mirror with type MIRROR
14:52:38,929  INFO main StatelessPuRestarter:restart:25 - done
14:52:28,945  INFO main HotRedeployMain:main:17 - Hot redeploy completed successfully

If a problem occurs during the hot redeploy, an error message and a description of the problem are displayed:


20:11:27,861  INFO main HotRedeployMain:checkFiles:76 - Please place new files on all GSM machines and try again.
20:11:27,864  INFO main HotRedeployMain:checkFiles:77 - Hot redeploy failed

You can see all details about the hot-redeploy process in the hot-redeploy.log file.
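
When the tool is run from a script, the outcome can be read back from the log. The following is a minimal sketch that relies only on the message texts shown in the samples on this page; the `summarize` function and its status labels are hypothetical, and the line pattern is inferred from those samples rather than from a documented log format.

```python
import re

# Pattern inferred from the sample lines: time, level, thread, source, message.
LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2},\d{3})\s+(?P<level>[A-Z]+)\s+"
    r"(?P<thread>\S+)\s+(?P<source>\S+)\s+-\s+(?P<message>.*)$"
)


def summarize(log_text: str):
    """Return (status, restarted PU names) for one hot redeploy run."""
    status = "unknown"
    restarted = []
    for line in log_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        message = match["message"]
        if message.startswith("Restarting pu "):
            restarted.append(message.split()[2])  # third word is the PU name
        elif message.startswith("Hot redeploy completed successfully"):
            status = "success"
        elif message.startswith("Hot redeploy failed"):
            # The rollback sample appends a note to the failure message.
            if "Rollback successfully completed" in message:
                status = "rolled_back"
            else:
                status = "failed"
    return status, restarted
```

For example, passing the contents of hot-redeploy.log to `summarize` yields the final status together with the PUs whose restart was started.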

 Build with Tests

To build the tool and run the tests, enter the command:

mvn clean install -DskipTests=false

 Prerequisites for running tests:

  • Run gs-agent.sh/bat.
  • Set the lookup group and locator to their default values.
  • Set the properties in the /tool/src/test/resources/config.properties file.
  • Make sure that no PU named space is already deployed.

Rollback

The rollback functionality helps to avoid data loss if errors occur during the redeploy (such as a broken PU file). When errors occur, the tool searches for backup GSMs. If there is more than one GSM in the system, they are restarted one by one. If there is only one GSM in the system, the tool looks for an empty GSC and restarts it. If the rollback finishes successfully, all processing units selected for redeploy return to their original version, and you will see messages as in this sample:


17:03:48,679  INFO main StatefulPuRestarter:restartAllInstances:105 - Restarting pu space with type STATEFUL
17:03:48,681  INFO pool-6-thread-1 PuInstanceRestarter:restartPUInstance:36 - restarting instance 1 on 127.0.0.1[127.0.0.1] GSC PID:7612 mode:backup...
17:04:49,294  INFO pool-6-thread-1 PuInstanceRestarter:restartPUInstance:43 - done
17:10:35,739  INFO main RollbackChecker:doRollback:100 - Do rollback..
17:10:35,739  INFO main RollbackChecker:doRollback:106 - There is one GSM in system. Try to find empty GSC
17:10:35,740  INFO main RollbackChecker:doRollback:109 - Restarting GSC with id 2
17:10:53,683  INFO main RollbackChecker:doRollback:119 - Rollback completed successfully
17:10:53,684  WARN main HotRedeployMain:redeploy:44 - Hot redeploy failed. Rollback successfully completed

 Minimal configuration for rollback:

In order for the rollback to work, the following minimal topology needs to be available:

  • At least one backup GSM must be deployed.
  • If there are n primary PU instances, n + 1 GSCs must be deployed.

If neither a backup GSM nor an empty container is found, the rollback fails and the system is left in an unstable state.
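
The decision described in this section can be summarized in a short sketch. It only mirrors the rules stated above and is not the tool's code; `rollback_plan` and `topology_ok` are hypothetical names, and the inputs (number of GSMs, IDs of empty GSCs, number of GSCs and primaries) are assumed to be known to the caller.

```python
def rollback_plan(gsm_count: int, empty_gsc_ids: list) -> str:
    """Pick the rollback action described in this section."""
    if gsm_count > 1:
        # A backup GSM exists: the GSMs are restarted one by one.
        return "restart GSMs one by one"
    if empty_gsc_ids:
        # Single GSM: fall back to restarting an empty container.
        return f"restart GSC {empty_gsc_ids[0]}"
    return "rollback not possible"


def topology_ok(gsc_count: int, primary_count: int) -> bool:
    """Check the n + 1 container rule for n primary PU instances."""
    return gsc_count >= primary_count + 1
```

With one GSM and an empty GSC whose ID is 2, `rollback_plan(1, [2])` selects the same action as the sample log above, restarting the GSC with ID 2.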