Monitoring

Implementing reliable monitoring to track GigaSpaces and the environment where it is deployed is an important task that should be completed before moving into production. Correctly monitoring the GigaSpaces environment lets you create a proactive action plan, triggered manually or automatically, that runs before any system failure occurs. This helps to avoid a bad user experience, data loss, and abrupt system shutdowns. For example, effective monitoring can identify increased use of system resources so that you can allocate additional CPU or memory, or detect malfunctioning components that must be corrected before they affect system health or correct system behavior.

Monitoring functionality should track the following:

Service Grid statistics. The Service Grid is a built-in orchestration tool that contains a set of Grid Service Containers (GSCs) managed by a Grid Service Manager. The containers host various deployments of Processing Units and data grids. Each container can run on a separate physical machine. This orchestration is available for XAP only.
Data Grid statistics
Event Containers (Polling/Notify/Archiver) statistics
Remote Service statistics
Remote communication statistics
Client Local cache/view statistics
Web application statistics
Mirror Service statistics. Mirroring performs the replication of changes to the target table, or the accumulation of source table changes that are replicated to the target table at a later time. If you have implemented bidirectional replication in your environment, mirroring can occur to and from both the source and target tables.
Replication statistics
Admin Alerts (CPU Utilization, Garbage Collection, Replication Channel Disconnection, etc.)
Log files
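
As an illustration of how a threshold-based alert such as CPU Utilization might be evaluated, the following is a minimal sketch and not the GigaSpaces alert implementation. The class name ThresholdAlert, its parameters, and the moving-average approach are all assumptions made for this example. The alert is raised when the average over a window of samples rises above a high threshold, and is resolved only when the average drops below a lower threshold, so that a single spike or a value hovering near one threshold does not cause repeated alerts.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch (not a GigaSpaces class): a threshold alert with
// separate raise/resolve thresholds, evaluated over a moving average.
final class ThresholdAlert {
    private final double highThreshold;
    private final double lowThreshold;
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double windowSum = 0.0;
    private boolean raised = false;

    ThresholdAlert(double highThreshold, double lowThreshold, int windowSize) {
        if (lowThreshold > highThreshold || windowSize < 1) {
            throw new IllegalArgumentException("invalid thresholds or window size");
        }
        this.highThreshold = highThreshold;
        this.lowThreshold = lowThreshold;
        this.windowSize = windowSize;
    }

    /** Records one sample and returns whether the alert is raised afterwards. */
    boolean record(double sample) {
        window.addLast(sample);
        windowSum += sample;
        if (window.size() > windowSize) {
            windowSum -= window.removeFirst(); // drop the oldest sample
        }
        // Evaluate only once the window is full, so one spike cannot raise the alert.
        if (window.size() == windowSize) {
            double average = windowSum / windowSize;
            if (!raised && average > highThreshold) {
                raised = true;
            } else if (raised && average < lowThreshold) {
                raised = false;
            }
        }
        return raised;
    }
}
```

For example, new ThresholdAlert(80, 60, 3) fed CPU percentages would raise after three consecutive samples averaging above 80, and resolve once the three-sample average falls below 60.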

The service grid consists of the following components:

GSA (Grid Service Agent): A process manager that can spawn and manage Service Grid processes (operating system level processes) such as the Grid Service Manager, the Grid Service Container, and the Lookup Service. Typically, the GSA is started when the hosting machine starts up. Using the agent, you can bootstrap the entire cluster very easily, and start and stop additional GSCs, GSMs, and lookup services at will.
LUS (Lookup Service): A service that provides a mechanism for services to discover each other. Each service can query the lookup service for other services, and register itself in the lookup service so that other services can find it.
GSM (Grid Service Manager): A service grid component that manages a set of Grid Service Containers (GSCs). A GSM has an API for deploying and undeploying Processing Units. When a GSM is instructed to deploy a Processing Unit, it finds an appropriate, available GSC and tells that GSC to run an instance of that Processing Unit. It then continuously monitors that Processing Unit instance to verify that it is alive and that the SLA is not breached.
GSC (Grid Service Container): A container that provides an isolated runtime for one or more Processing Unit (PU) instances and exposes its state to the GSM.

For all of these service grid components, you should monitor thread count, CPU utilization, file descriptor count, memory utilization, and network utilization.
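
Because the service grid components run as JVM processes, several of these metrics can be read through the standard JDK management beans. The following is a minimal sketch that samples the JVM it runs in; it uses only the java.lang.management API and does not use any GigaSpaces API. The class name JvmHealthSampler and the Snapshot type are hypothetical names chosen for this example. The open file descriptor count is read only when the JVM exposes com.sun.management.UnixOperatingSystemMXBean, and network utilization is not covered because the standard management beans do not report it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not a GigaSpaces class): samples health metrics
// of the JVM it runs in, using only standard JDK management beans.
final class JvmHealthSampler {

    /** Immutable result of one sampling pass. */
    static final class Snapshot {
        final int threadCount;           // live threads, including daemon threads
        final long heapUsedBytes;        // currently used heap
        final long heapMaxBytes;         // -1 if the maximum is undefined
        final double systemLoadAverage;  // negative if not available on this platform
        final long openFileDescriptors;  // -1 if not available on this platform

        Snapshot(int threadCount, long heapUsedBytes, long heapMaxBytes,
                 double systemLoadAverage, long openFileDescriptors) {
            this.threadCount = threadCount;
            this.heapUsedBytes = heapUsedBytes;
            this.heapMaxBytes = heapMaxBytes;
            this.systemLoadAverage = systemLoadAverage;
            this.openFileDescriptors = openFileDescriptors;
        }
    }

    static Snapshot sample() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();

        // File descriptor counts are exposed only through the Unix-specific bean.
        long fds = -1L;
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return new Snapshot(threads, heap.getUsed(), heap.getMax(),
                os.getSystemLoadAverage(), fds);
    }
}
```

Calling JvmHealthSampler.sample() on a schedule gives a series of snapshots for the local process. Sampling a remote GSA, LUS, GSM, or GSC would require a remote management connection or the product's own administration tooling, which this sketch does not attempt.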

You can publish this information in real time to any enterprise monitoring system (CA Wily Introscope, HP Operations Manager, IBM Tivoli, etc.) so that it can be correlated with your existing application monitoring. You can also generate hourly or daily reports of this information to share with the relevant teams in your organization. These reports can be processed offline to estimate the system capacity and sizing required as the system grows.
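
As a rough illustration of the hourly reports mentioned above, the following sketch groups timestamped metric samples into hourly buckets and writes the sample count, average, and peak for each hour as CSV. The class name HourlyReport and the CSV column names are assumptions made for this example and are not part of any GigaSpaces tool. The peak and average per hour are the kind of figures typically used for offline capacity estimation.

```java
import java.time.Instant;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: aggregates metric samples into hourly statistics.
final class HourlyReport {

    /** Running statistics for one hour. */
    private static final class Stats {
        long count;
        double sum;
        double peak = Double.NEGATIVE_INFINITY;

        void add(double value) {
            count++;
            sum += value;
            peak = Math.max(peak, value);
        }

        double average() {
            return sum / count;
        }
    }

    private static final long HOUR_MILLIS = 3_600_000L;

    // Keyed by the start of the hour (epoch milliseconds, UTC), kept sorted.
    private final TreeMap<Long, Stats> byHour = new TreeMap<>();

    /** Adds one sample taken at the given time (epoch milliseconds). */
    void add(long epochMillis, double value) {
        long hourStart = Math.floorDiv(epochMillis, HOUR_MILLIS) * HOUR_MILLIS;
        byHour.computeIfAbsent(hourStart, k -> new Stats()).add(value);
    }

    /** Renders one CSV row per hour, in chronological order. */
    String toCsv() {
        StringBuilder sb = new StringBuilder("hour_start_utc,samples,average,peak\n");
        for (Map.Entry<Long, Stats> e : byHour.entrySet()) {
            Stats s = e.getValue();
            sb.append(Instant.ofEpochMilli(e.getKey())).append(',')
              .append(s.count).append(',')
              .append(String.format(Locale.ROOT, "%.2f", s.average())).append(',')
              .append(String.format(Locale.ROOT, "%.2f", s.peak)).append('\n');
        }
        return sb.toString();
    }
}
```

One HourlyReport instance per metric (for example, CPU utilization of a single container) keeps the output simple enough to load into a spreadsheet or a reporting tool.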

Consult the GigaSpaces support team for information about monitoring tools that are provided as part of GigaSpaces' professional services. These can be adapted to fit your exact requirements.