XAP

Directory Structure

The Processing Unit JAR File

Much like a JEE web application or an OSGi bundle, the Processing Unit is packaged as a .jar file and follows a specific directory structure, which enables the GigaSpaces runtime environment to locate the deployment descriptor and load the Processing Unit's classes and the libraries it depends on. A typical Processing Unit looks as follows:

|----META-INF
|--------spring
|------------pu.xml
|------------pu.properties
|------------sla.xml
|--------MANIFEST.MF
|----com
|--------mycompany
|------------myproject
|----------------MyClass1.class
|----------------MyClass2.class
|----lib
|--------hibernate3.jar
|--------....
|--------commons-math.jar

The Processing Unit JAR file is composed of several key elements:

  • META-INF/spring/pu.xml: The Processing Unit's deployment descriptor.

  • META-INF/spring/sla.xml (not mandatory): SLA definitions for the Processing Unit. These are only enforced when deploying the Processing Unit to the Service Grid, since this environment actively manages and controls the deployment using the GSM. When running within your IDE or in standalone mode, these definitions are ignored.

  • META-INF/spring/pu.properties (not mandatory): Enables you to externalize properties included in the pu.xml file (e.g. database connection username and password), and to set system-level deployment properties and overrides, such as JEE-related deployment properties (see this page for more details) or space properties (when defining a space inside your Processing Unit). Note that pu.properties can also be placed at the root of the Processing Unit.

  • User class files: Your Processing Unit's classes (here under the com.mycompany.myproject package).

  • lib: Other JARs on which your Processing Unit depends, e.g. commons-math.jar or JARs that contain common classes across many Processing Units.

  • META-INF/MANIFEST.MF (not mandatory): This file can be used to add additional JARs to the Processing Unit classpath, using the standard MANIFEST.MF Class-Path property (see Manifest Based Classpath for more details).
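The layout above can be sanity-checked before deployment. The following is a minimal sketch, not part of the GigaSpaces API: the PuJarLayout class and its method names are hypothetical, and it only assumes the entry locations shown in the directory tree (META-INF/spring/pu.xml for the descriptor, lib/ for bundled JARs).

```java
import java.io.IOException;
import java.util.Collection;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

/** Hypothetical helper (not a GigaSpaces class): inspects the entry names of a Processing Unit JAR. */
final class PuJarLayout {
    static final String DESCRIPTOR = "META-INF/spring/pu.xml";
    private static final String LIB_PREFIX = "lib/";

    /** True when the deployment descriptor sits at the location shown in the tree above. */
    static boolean hasDescriptor(Collection<String> entryNames) {
        return entryNames.contains(DESCRIPTOR);
    }

    /** Names of the JARs packaged directly under lib/, sorted alphabetically. */
    static List<String> bundledLibraries(Collection<String> entryNames) {
        return entryNames.stream()
                .filter(n -> n.startsWith(LIB_PREFIX) && n.endsWith(".jar"))
                .filter(n -> n.indexOf('/', LIB_PREFIX.length()) < 0) // skip nested directories
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PuJarLayout.java <processing-unit.jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            List<String> names = jar.stream().map(JarEntry::getName).collect(Collectors.toList());
            System.out.println("descriptor present: " + hasDescriptor(names));
            System.out.println("bundled libraries:  " + bundledLibraries(names));
        }
    }
}
```

Working on entry names rather than on the JarFile itself keeps the checks easy to unit test without building an archive.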

You may add your own JARs to the classpath of the runtime, the Grid Service Container (GSC), by using the PRE_CLASSPATH and POST_CLASSPATH variables. These should point to your application JARs.

Sharing Libraries Between Multiple Processing Units

In some cases, multiple Processing Units use the same JAR files. In such cases it makes sense to place these JAR files in a central location accessible by all the Processing Units, rather than packaging them individually with each Processing Unit. This also decreases deployment time when your Processing Units contain many third-party JAR files, since it saves much of the network overhead associated with downloading these JARs to each of the GSCs.

There are three options to achieve this:

lib/optional/pu-common Directory

JAR files placed in the $GS_HOME/lib/optional/pu-common directory will be loaded by each Processing Unit instance in its own separate classloader (called the Service Classloader, see the Class Loaders section below).

This means they are not shared between Processing Units on the same JVM, which provides the isolation often required for JARs containing the application's proprietary business logic. On the other hand, this option consumes more PermGen memory (due to potentially multiple instances per JVM).

You can place these JARs in each GigaSpaces installation in your network, but it is more common to share this folder on your network and point the pu-common directory to the shared location by specifying this location in the com.gs.pu-common system property in each of the GSCs on your network.

When a new JAR needs to be loaded, just place the new JAR in the pu-common directory and restart the Processing Unit.

If different Processing Units use different versions of the same JAR (under the same JAR file name), then pu-common should not be used.

This is a suggested place to add any third-party dependencies that are used by your POJOs, as this directory is added to the classpath of the UI tools.

META-INF/MANIFEST.MF Descriptor

JAR files specified in the Processing Unit's META-INF/MANIFEST.MF descriptor file will be loaded by each Processing Unit instance in its own separate classloader (called the Service Classloader, see the Class Loaders section below).

This option achieves similar behavior to the lib/optional/pu-common option above, but allows more fine-grained control: you can specify individual JAR files, each in its own location, rather than a single folder containing all of them.

For more information see the Manifest Based Classpath section below.

lib/platform/ext Directory

JAR files placed in the $GS_HOME/lib/platform/ext directory will be loaded once by the GSC-wide classloader and not separately by each Processing Unit instance (this classloader is called the Common Classloader, see the Class Loaders section below).

This means they are shared between Processing Units on the same JVM and thereby offer no isolation. On the other hand, this option consumes less PermGen memory (one instance per JVM).

This method is recommended for third-party libraries that have no requirement for isolation or for different versions in different Processing Units, and that are upgraded rather infrequently, such as a JDBC driver.

You can place these JARs in each GigaSpaces installation in your network, but it is more common to share this folder on your network and point the lib/platform/ext directory to the shared location by specifying this location in the com.gigaspaces.lib.platform.ext system property in each of the GSCs on your network.

When a new JAR needs to be loaded, place the new JAR in the lib/platform/ext directory and restart the relevant GSCs (on which an instance of the Processing Unit was running).
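The default-versus-override lookup described for the two shared directories can be sketched as follows. This is an illustration only: the SharedLibraryDirs class is hypothetical and does not show how GigaSpaces reads these properties internally. The only names taken from the text are the two system properties and the two default locations under $GS_HOME.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical helper (not a GigaSpaces class): works out which shared directories would be in effect. */
final class SharedLibraryDirs {

    /** pu-common: loaded separately by every Processing Unit instance. */
    static Path puCommon(Path gsHome, Properties props) {
        return resolve(props, "com.gs.pu-common", gsHome.resolve("lib/optional/pu-common"));
    }

    /** lib/platform/ext: loaded once per GSC by the common class loader. */
    static Path platformExt(Path gsHome, Properties props) {
        return resolve(props, "com.gigaspaces.lib.platform.ext", gsHome.resolve("lib/platform/ext"));
    }

    /** Uses the system property when it is set, otherwise the default under GS_HOME. */
    private static Path resolve(Properties props, String key, Path fallback) {
        String override = props.getProperty(key);
        return (override == null || override.isEmpty()) ? fallback : Paths.get(override);
    }

    public static void main(String[] args) {
        // Assumption: GS_HOME is exported in the environment; "." is used otherwise.
        Path gsHome = Paths.get(System.getenv().getOrDefault("GS_HOME", "."));
        System.out.println("pu-common:    " + puCommon(gsHome, System.getProperties()));
        System.out.println("platform/ext: " + platformExt(gsHome, System.getProperties()));
    }
}
```

A system property is normally passed to a JVM with the standard -D syntax, for example -Dcom.gs.pu-common=/mnt/shared/pu-common, where the path is a made-up shared location.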

Considerations

When it comes to choosing the right option for your system, the following should be considered:

  • Size of loaded classes in memory (PermGen)
  • Size of Processing Unit JAR file and Processing Unit deployment time
  • Isolation (sharing classes between Processing Units)
  • Frequency of updating the library JAR
  • XML parsing JARs: special attention is required for XML parsing related JARs that have parallels in the JDK itself. If your Processing Unit requires one of those JARs, you should place all the related JARs in lib/platform/ext. Starting with version 10.1, the product does not include XML parsing JARs under lib/platform/xml and relies on the JDK defaults instead.

Runtime Modes

The Processing Unit can run in multiple modes.

When deployed onto the GigaSpaces runtime environment or when running in standalone mode, all the JARs under the lib directory of your Processing Unit JAR are automatically added to the Processing Unit's classpath.

When running within your IDE, the Processing Unit behaves like any other Java application, i.e. you should make sure all the dependent JARs are part of your project classpath.

Deploying the Processing Unit to the Service Grid

When deploying the Processing Unit to the Service Grid, the Processing Unit JAR file is uploaded to the Manager (GSM) and extracted to the deploy directory of the local GigaSpaces installation (located by default under $GS_HOME/deploy).

Once extracted, the GSM processes the deployment descriptor and, based on it, provisions Processing Unit instances to the running Containers (GSCs).

Each GSC to which an instance was provisioned downloads the Processing Unit JAR file from the GSM, extracts it to its local work directory (located by default under $GS_HOME/work/deployed-processing-units), and starts the Processing Unit instance.

Deploying Data-Only Processing Units

In some cases, your Processing Unit contains only a Space and no custom code.

One way to package such a Processing Unit is to use the standard Processing Unit packaging described above, and create a Processing Unit JAR file which only includes a deployment descriptor with the required space definitions and SLA.

GigaSpaces also provides a simpler option via its built-in data-only Processing Unit templates (located under $GS_HOME/deploy/templates/datagrid). Using these templates you can deploy and run data-only Processing Units without creating a dedicated JAR for them.

For more information please refer to Deploying and running the Processing Unit.

Class Loaders

In general, classloaders are created dynamically when deploying a Processing Unit into a GSC. You should not add your classes into the GSC CLASSPATH. Classes are loaded dynamically into the generated classloader in the following cases:

  • When the GSM sends classes to the GSC, either when the application is deployed or when the GSC is restarted.
  • When the GSM sends classes to the GSC because the application scales.
  • When a Task or Distributed Task class and its dependencies are executed (space execute operation).
  • When space domain classes (the data model) and their dependencies are used (space write/read operations).

Here is the structure of the class loaders when several Processing Units are deployed on the Service Grid (GSC):

             Bootstrap (Java)
                   |
              System (Java)
                   |
          Common (Service Grid)
             /           \
    Service CL1       Service CL2

The following table shows which user-controlled locations end up in which class loader, and the important JAR files that exist within each one:

Class Loader | User Locations | Built-in JAR Files
Common | [GSRoot]/lib/platform/ext/*.jar | xap-datagrid.jar
Processing Unit Instance (Service Class Loader) | [PU], [PU]/lib/*.jar, [PU]/META-INF/MANIFEST.MF Class-Path entry, [GSRoot]/lib/optional/pu-common/*.jar | xap-openspaces.jar, org.springframework*.jar

In terms of the class loader delegation model, the service (Processing Unit instance) class loader uses parent-last delegation. This means that the Processing Unit instance class loader will first try to load classes from its own locations, and only if they are not found will it delegate up to the parent class loader.
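To make the delegation order concrete, here is a minimal sketch of a parent-last class loader built on the JDK's URLClassLoader. It is not GigaSpaces' actual Service Class Loader; the ParentLastClassLoader name and the tryLoad helper are hypothetical, and the empty URL arrays stand in for the real locations listed in the table.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of parent-last delegation; not the GigaSpaces implementation. */
final class ParentLastClassLoader extends URLClassLoader {

    ParentLastClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    c = super.loadClass(name, false);     // core classes always come from the parent chain
                } else {
                    try {
                        c = findClass(name);              // 1. this loader's own URLs first
                    } catch (ClassNotFoundException e) {
                        c = super.loadClass(name, false); // 2. only then delegate to the parent
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Convenience for demos: returns null instead of throwing when the class is missing. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        // One shared "common" loader with two isolated "service" loaders beneath it.
        try (URLClassLoader common = new URLClassLoader(new URL[0], ClassLoader.getSystemClassLoader());
             ParentLastClassLoader service1 = new ParentLastClassLoader(new URL[0], common);
             ParentLastClassLoader service2 = new ParentLastClassLoader(new URL[0], common)) {
            System.out.println("same parent: " + (service1.getParent() == service2.getParent()));
            System.out.println("falls back to parent: " + tryLoad(service1, "java.util.ArrayList"));
        }
    }
}
```

With real JAR URLs passed to each service loader, the same class name would be defined once per loader, which is the isolation behavior described for the Service Class Loader.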

Native Library Usage

When deploying applications using native libraries you should place the Java libraries (JAR files) loading the native libraries under the GSRoot/lib/platform/ext folder. This will load the native libraries once into the common class loader.

Permanent Generation Space

For applications that use a relatively large number of third-party libraries (a Processing Unit with many JARs), the default permanent generation space size may not be adequate. In such a case, you should increase the permanent generation space size. Here are suggested values:

-XX:PermSize=512m -XX:MaxPermSize=512m
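Before changing these values it can help to see how much class-metadata memory a JVM is actually using. The sketch below uses the standard java.lang.management API; the ClassMetadataPools class is hypothetical. It matches pool names containing "Perm Gen" and also "Metaspace", since Java 8 and later replaced the permanent generation with Metaspace.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: prints usage of the memory pools that hold class metadata. */
final class ClassMetadataPools {

    /** Pool names vary by JVM version and collector, e.g. "PS Perm Gen" or "Metaspace". */
    static boolean holdsClassMetadata(String poolName) {
        return poolName.contains("Perm Gen") || poolName.contains("Metaspace");
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null || !holdsClassMetadata(pool.getName())) {
                continue;
            }
            // getMax() returns -1 when no maximum is defined for the pool.
            System.out.printf("%s: used=%d KB, max=%d%n",
                    pool.getName(), usage.getUsed() / 1024, usage.getMax());
        }
    }
}
```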

Manifest Based Classpath

You may add additional JARs to the Processing Unit classpath by having a manifest file located at META-INF/MANIFEST.MF and defining the property Class-Path as shown in the following example (using a simple MANIFEST.MF file):

Manifest-Version: 1.0
Class-Path: /home/user1/java/libs/user-lib.jar
 lib/platform/jdbc/hsqldb.jar
 ${MY_LIBS_DIRECTORY}/user-lib2.jar
 file:/home/user2/libs/lib.jar

[REQUIRED EMPTY NEW LINE AT EOF]

In the previous example, the Class-Path property contains four entries:

  1. /home/user1/java/libs/user-lib.jar - This entry uses an absolute path and will be resolved as such.
  2. lib/platform/jdbc/hsqldb.jar - This entry uses a relative path, so it is resolved relative to the GigaSpaces home directory.
  3. ${MY_LIBS_DIRECTORY}/user-lib2.jar - In this entry, ${MY_LIBS_DIRECTORY} is expanded to the value of the environment variable named MY_LIBS_DIRECTORY, if such a variable exists.
  4. file:/home/user2/libs/lib.jar - This entry uses URL syntax.

The pu-common Directory

The pu-common directory may contain a JAR file with a manifest file, as described above, located at META-INF/MANIFEST.MF. The classpath defined in this manifest will be shared by all Processing Units, as described in Sharing Libraries Between Multiple Processing Units.

Further Details

  1. If an entry points to a nonexistent location, it will be ignored.
  2. If an entry includes the ${SOME_ENV_VAL} placeholder and there is no environment variable named SOME_ENV_VAL, it will be ignored.
  3. Only file URLs are supported (entries using other schemes, such as http, will be ignored).
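The resolution rules for Class-Path entries can be summarized in code. The following is a sketch under stated assumptions, not GigaSpaces' parser: the ClassPathEntryResolver class is hypothetical, a single entry is assumed to be already split out of the Class-Path value, and an ignored entry is modelled as an empty Optional.

```java
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the Class-Path entry rules described above. */
final class ClassPathEntryResolver {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");
    // Two or more scheme characters, so a Windows drive letter such as C: is not mistaken for a URL.
    private static final Pattern URL_SCHEME = Pattern.compile("^([A-Za-z][A-Za-z0-9+.-]+):.*");

    /** Returns the resolved location, or empty when the entry would be ignored. */
    static Optional<Path> resolve(String entry, Path gsHome, Map<String, String> env) {
        String expanded = expand(entry, env);
        if (expanded == null) {
            return Optional.empty();                      // placeholder without a matching variable
        }
        try {
            Path path;
            Matcher url = URL_SCHEME.matcher(expanded);
            if (url.matches()) {
                if (!"file".equalsIgnoreCase(url.group(1))) {
                    return Optional.empty();              // only file URLs are supported
                }
                path = Paths.get(URI.create(expanded));
            } else {
                Path p = Paths.get(expanded);
                path = p.isAbsolute() ? p : gsHome.resolve(p); // relative to the GigaSpaces home
            }
            return Files.exists(path) ? Optional.of(path) : Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.empty();                      // malformed path or URI
        }
    }

    /** Expands ${NAME} placeholders from env; returns null if any variable is missing. */
    private static String expand(String entry, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(entry);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = env.get(m.group(1));
            if (value == null) {
                return null;
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: GS_HOME is exported in the environment; "." is used otherwise.
        Path gsHome = Paths.get(System.getenv().getOrDefault("GS_HOME", "."));
        String[] entries = {
            "/home/user1/java/libs/user-lib.jar",
            "lib/platform/jdbc/hsqldb.jar",
            "${MY_LIBS_DIRECTORY}/user-lib2.jar",
            "file:/home/user2/libs/lib.jar"
        };
        for (String entry : entries) {
            System.out.println(entry + " -> " + resolve(entry, gsHome, System.getenv()));
        }
    }
}
```

Passing the environment in as a Map, rather than calling System.getenv() inside resolve, keeps the rules testable with made-up variables.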

Further details about the manifest file can be found here.