Creating a Processing Unit

The PU is the fundamental unit of packaging and deployment in the data grid, and is essentially the main GigaSpaces service. The PU runs within a Processing Unit Container and is typically deployed onto the Service Grid, a built-in orchestration tool in which a Grid Service Manager manages a set of Grid Service Containers (GSCs) that host the deployed PUs and data grids. (This orchestration is available for Smart Cache only; for Smart DIH, we recommend our Kubernetes orchestration.) Once a PU is deployed, a PU instance is the actual runtime entity.

There are two types of Processing Unit Containers: the Integrated Processing Unit Container, which runs the PU inside your IDE for development and debugging, and the Service Grid Processing Unit Container, which runs the PU inside a Grid Service Container (GSC) on the Service Grid.

Processing Unit (PU)

The PU is a deployable, independent, scalable unit, which is the building block of the Space-Based Architecture (SBA). The PU is a combination of service beans and/or an embedded space instance. The artifacts that belong to a PU are packaged as a JAR or WAR file.

There are several types of PUs: data-only, business-logic-only, mixed PUs (which contain both data and business logic), and special-purpose PUs.

Data Only PU

This type of PU does not include any business logic, only a Space. The PU simply defines the runtime characteristics of the space, i.e. its runtime topology, the number of space replicas/partitions, etc.
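In a partitioned topology, each entry is routed to one of the partitions based on its routing field. The sketch below is a plain-Java illustration of that idea (not the actual GigaSpaces routing code; the class and method names are hypothetical):

```java
// Simplified illustration of content-based routing in a partitioned space:
// the routing field's hash code determines the owning partition.
class RoutingSketch {

    static int partitionFor(Object routingValue, int partitionCount) {
        // mask the sign bit so the result is a valid, non-negative partition index
        return (routingValue.hashCode() & Integer.MAX_VALUE) % partitionCount;
    }

    public static void main(String[] args) {
        int partitions = 4;
        // the same routing value always maps to the same partition
        System.out.println(partitionFor(1, partitions));
        System.out.println(partitionFor(1, partitions) == partitionFor(1, partitions));
    }
}
```

The key property is determinism: all entries sharing a routing value land in the same partition, which is what makes collocated processing possible.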

Business Logic Only PU

The business-logic-only PU implements your application code and does not include any data. Typically, your code interacts with a remote Space defined by another PU. By defining the PU as business logic only, you create an application server that is hosted and monitored by the Service Grid. The application can be a typical Spring application deployed to a data grid PU.

Mixed PU

This type of PU includes both business logic and a space. Typically, the business logic interacts with a local space instance (i.e. a data grid instance running within the same PU instance) to achieve the lowest possible latency and best performance.

Web PU

The data grid allows you to deploy web applications (packaged as a WAR file) onto the Service Grid. The integration is built on top of the Service Grid Processing Unit Container. The web application itself is a pure JEE-based web application. Even the most generic web application can automatically make use of the Service Grid features. The web application can easily define a Space (either embedded or remote), with or without Spring. The web container used behind the scenes is Jetty.

For more information, see the Web Application Support section in the developer guide.

Mule PU

The data grid's Mule integration allows you to run a pure Mule application (with or without data grid-specific extension points and transports) as a PU.

For more information, see the Mule ESB Integration section in the developer guide.

The PU JAR File

Much like a JEE web application or an OSGi bundle, the PU is packaged as a JAR file and follows a certain directory structure, which enables the data grid runtime environment to easily locate the deployment descriptor and load its classes and the libraries it depends on. A typical PU looks as follows:

|----META-INF
|--------spring
|------------pu.xml
|------------pu.properties
|------------sla.xml
|--------MANIFEST.MF
|----xap
|--------tutorial
|------------model
|----------------Payment.class
|----------------User.class
|----lib
|--------hibernate16.4.0.jar
|--------....
|--------commons-math.jar

The PU JAR file is composed of several key elements:

  • META-INF/spring/pu.xml (mandatory): This is the PU's deployment descriptor, which is in fact a Spring context XML configuration with a number of data grid-specific namespace bindings. These bindings include data grid-specific components (such as the space). The pu.xml file typically contains definitions of data grid components (space, event containers, remote service exporters) and user-defined beans.

  • META-INF/spring/sla.xml (not mandatory): This file contains the SLA definitions for the PU (i.e. number of instances, number of backups, and deployment requirements). Note that this file is optional, and can be replaced with an <os:sla> definition in the pu.xml file. If neither is present, the default SLA is applied. SLA definitions can also be specified at deploy time via command-line arguments.

  • META-INF/spring/pu.properties (not mandatory): Enables you to externalize properties included in the pu.xml file (e.g. database connection username and password), and also set system-level deployment properties and overrides, such as JEE related deployment properties.

  • User class files: Your processing unit's classes (here under the xap.tutorial package)

  • lib: Other JARs on which your PU depends.

  • META-INF/MANIFEST.MF (not mandatory): This file can be used to add additional JARs to the PU classpath, using the standard MANIFEST.MF Class-Path property.
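For instance, pu.properties can hold values that the pu.xml references through standard Spring property placeholders. The property names below are purely illustrative:

```properties
# META-INF/spring/pu.properties
jdbc.username=payment_app
jdbc.password=changeme
```

A bean definition in pu.xml can then refer to ${jdbc.username} and ${jdbc.password} instead of hard-coding credentials in the deployment descriptor.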

The pu.xml file

This file is a Spring framework XML configuration file. It leverages the Spring framework IoC container and extends it by using the Spring custom namespace mechanism.

The definitions in the pu.xml file are divided into two major categories:

  • GigaSpaces specific components, such as space, event containers or remote service exporters.

  • User-defined beans, which define instances of user classes to be used by the PU. For example, user defined event handlers to which the event containers delegate events as those are received.

Here is an example of a pu.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<!--
    top level element of the Spring configuration. Note the multiple namespace definition for both GigaSpaces and Spring.
-->
<beans xmlns="http://www.springframework.org/schema/beans"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:context="http://www.springframework.org/schema/context"
   xmlns:os-core="http://www.openspaces.org/schema/core"
   xmlns:os-events="http://www.openspaces.org/schema/events"
   xmlns:os-remoting="http://www.openspaces.org/schema/remoting"
   xmlns:os-sla="http://www.openspaces.org/schema/sla"
   xsi:schemaLocation="
   http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
   http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
   http://www.openspaces.org/schema/core http://www.openspaces.org/schema/16.4/core/openspaces-core.xsd
   http://www.openspaces.org/schema/events http://www.openspaces.org/schema/16.4/events/openspaces-events.xsd
   http://www.openspaces.org/schema/remoting http://www.openspaces.org/schema/16.4/remoting/openspaces-remoting.xsd
   http://www.openspaces.org/schema/sla http://www.openspaces.org/schema/16.4/sla/openspaces-sla.xsd">

    <!-- Enables to configure Spring beans through annotations   -->
    <context:annotation-config />

    <!-- Enable OpenSpaces core annotation support. -->
    <os-core:annotation-support />

    <!-- Enables using @Polling and @Notify annotations to create polling and notify containers  -->
    <os-events:annotation-support />

    <!-- Enables using @RemotingService and other remoting related annotations   -->
    <os-remoting:annotation-support />

    <!--
        A bean representing a space. Here we configure an embedded space. Note that we do not
        specify the cluster topology of the space here; it is declared by the os-sla:sla element
        of this pu.xml file.
    -->
    <os-core:embedded-space id="space" space-name="eventSpace" />

    <!-- Define the GigaSpace instance that the application will use to access the space  -->
    <os-core:giga-space id="eventSpace" space="space"/>

</beans>

For more information, see the Configuration page in the Processing Unit section of the developer guide.

Service Level Agreement (SLA)

The SLA definitions can be provided as part of the PU package or during the PU's deployment process. They define the number of PU instances that should be running, as well as deploy-time requirements such as the clustering topology for PUs that contain a space. The GSM (Grid Service Manager) reads the SLA definition and deploys the PU onto the available GSCs accordingly. A sample SLA definition is shown below:

The number of backups per partition is zero or one.

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:os-sla="http://www.openspaces.org/schema/sla"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.openspaces.org/schema/sla http://www.openspaces.org/schema/16.4/sla/openspaces-sla.xsd">

      <os-sla:sla cluster-schema="partitioned"
            number-of-instances="2" number-of-backups="1"
            max-instances-per-vm="1">
       </os-sla:sla>
</beans>

For more information, see the Service Level Agreement (SLA) section in the Administration guide.

Deployment

When deploying the PU to the GigaSpaces Service Grid, the PU JAR file is uploaded to the Grid Service Manager (GSM) and extracted to the deploy directory of the local GigaSpaces installation (located by default under $GS_HOME/deploy). Once the JAR file is extracted, the GSM processes the deployment descriptor and, based on it, provisions PU instances to the running data grid containers.

Each GSC (Grid Service Container) to which an instance was provisioned downloads the PU JAR file from the GSM, extracts it to its local working directory (located by default under $GS_HOME/work/deployed-processing-units), and starts the PU instance.

Example

Our online payment system is expected to handle a large number of concurrent users performing transactions. The system also needs to be highly available. This is where the data grid's PU comes into play. We will create a polling container that takes a payment event as input and processes it. Then, we will deploy this code as a PU onto the data grid. Payment events are written into a space, and the polling container picks up the events and processes them. We will use the pu.xml file to define the deployment, and add an SLA configuration to it to provide failover and scalability.

Polling Container

First we define a polling container that will handle the business logic upon receiving a payment event. In our example we define a polling container that will receive events when a new payment is created:

@EventDriven
@Polling
@NotifyType(write = true)
public class PaymentProcessor {

    // Define the event we are interested in
    @EventTemplate
    Payment unprocessedData() {
        Payment template = new Payment();
        template.setStatus(ETransactionStatus.NEW);
        return template;
    }

    @SpaceDataEvent
    public Payment eventListener(Payment event) {
        System.out.println("Payment received; processing .....");

        // set the status on the event and write it back into the space
        event.setStatus(ETransactionStatus.PROCESSED);
        return event;
    }
}
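The @EventTemplate method above defines which events the container receives: an entry matches the template when every non-null field of the template equals the entry's corresponding field. The following plain-Java sketch illustrates that matching rule; it is a simplified stand-in, not the real GigaSpaces matching engine, and the Payment class here is a reduced version of the tutorial's model:

```java
// Simplified illustration of the template matching used by @EventTemplate:
// a template matches an entry when every non-null template field equals
// the entry's field. Null template fields act as wildcards.
class TemplateMatchingSketch {
    enum Status { NEW, PROCESSED }

    static class Payment {
        Integer payingAccountId; // null means "match any account"
        Status status;           // null means "match any status"
        Payment(Integer payingAccountId, Status status) {
            this.payingAccountId = payingAccountId;
            this.status = status;
        }
    }

    static boolean matches(Payment template, Payment entry) {
        return (template.payingAccountId == null
                    || template.payingAccountId.equals(entry.payingAccountId))
            && (template.status == null || template.status == entry.status);
    }

    public static void main(String[] args) {
        // like unprocessedData(): only the status field is constrained
        Payment template = new Payment(null, Status.NEW);
        System.out.println(matches(template, new Payment(1, Status.NEW)));
        System.out.println(matches(template, new Payment(2, Status.PROCESSED)));
    }
}
```

Because the tutorial's template only sets status to NEW, the polling container receives every new payment regardless of its other field values.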

Creating the pu.xml

In this step, we will create the configuration file for the PU deployment.

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:context="http://www.springframework.org/schema/context"
    xmlns:os-core="http://www.openspaces.org/schema/core" xmlns:os-events="http://www.openspaces.org/schema/events"
    xmlns:os-remoting="http://www.openspaces.org/schema/remoting"
    xmlns:os-sla="http://www.openspaces.org/schema/sla"
    xsi:schemaLocation="
   http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
   http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
   http://www.openspaces.org/schema/core http://www.openspaces.org/schema/16.4/core/openspaces-core.xsd
   http://www.openspaces.org/schema/events http://www.openspaces.org/schema/16.4/events/openspaces-events.xsd
   http://www.openspaces.org/schema/remoting http://www.openspaces.org/schema/16.4/remoting/openspaces-remoting.xsd
   http://www.openspaces.org/schema/sla http://www.openspaces.org/schema/16.4/sla/openspaces-sla.xsd">

    <!-- Scan the packages for annotations -->
    <context:component-scan base-package="xap" />

    <!-- Enables to configure Spring beans through annotations -->
    <context:annotation-config />

    <!-- Enable @PostPrimary and other annotation support. -->
    <os-core:annotation-support />

    <!-- Enables using @Polling and @Notify annotations to create polling and notify containers -->
    <os-events:annotation-support />

    <!-- Enables using @RemotingService and other remoting related annotations -->
    <os-remoting:annotation-support />

    <!-- A bean representing a space (an IJSpace implementation) -->
    <os-core:embedded-space id="space" space-name="eventSpace" />

    <!-- Define the GigaSpace instance that the application will use to access the space -->
    <os-core:giga-space id="eventSpace" space="space"/>

</beans>

Deployment

Now we have all the pieces necessary to create the JAR file for the PU. After we have created the JAR file, it's time to deploy the PU onto the data grid. You can do this in three ways: by script, by Java code, or via the Web Management Console. In our example, we will use the scripts to deploy the PU.

First we start the Grid Service Agent (GSA), a process manager that spawns and manages Service Grid processes such as the GSM, the GSCs, and the Lookup Service. The GSA will create our data grid on this machine.

And now we deploy the PU onto the data grid:

$GS_HOME/bin/gs.sh deploy eventPU.jar
curl -X POST --header 'Content-Type: application/json' --header 'Accept: text/plain' -d '{ 
   "name": "eventPU", 
   "resource": "...path..to\eventPU.jar"  
 }' 'http://localhost:8090/v1/deployments'

We assume that the JAR file we created is named eventPU.jar.

If you start the Web Management Console, you will see that the deployment created a space called eventSpace, as well as a PU named after the JAR file.

Client Interface

Now it's time to create a client that creates events and writes them into the space. On the client side, we will attach a listener to the space that receives events when a payment is processed.

@EventDriven
@Polling
@NotifyType(write = true)
public class ClientListener {

    // Define the event we are interested in
    @EventTemplate
    Payment unprocessedData() {
        Payment template = new Payment();
        template.setStatus(ETransactionStatus.PROCESSED);
        return template;
    }

    @SpaceDataEvent
    public Payment eventListener(Payment event) {
        System.out.println("Processed Payment received ");

        return null;
    }
}
public void postPayment() {
    // Register the event handler on the Space
    this.registerPollingListener();

    // Create a payment
    Payment payment = new Payment();
    payment.setCreatedDate(new Date(System.currentTimeMillis()));
    payment.setPayingAccountId(new Integer(1));
    payment.setPaymentAmount(new Double(120.70));

    // write the payment into the space
    space.write(payment);
}
public void registerPollingListener() {
     Payment payment = new Payment();
     payment.setStatus(ETransactionStatus.PROCESSED);

     SimplePollingEventListenerContainer pollingEventListenerContainer = new SimplePollingContainerConfigurer(
         space).eventListenerAnnotation(new ClientListener())
         .pollingContainer();
     pollingEventListenerContainer.start();
}

When you run this code, you should see that the PU deployed onto the IMDG (In-Memory Data Grid) processes the event, changes the status of the payment to PROCESSED, and writes the event back into the space. The client then receives an event, because it has registered a listener for processed payment events.

Deploy a PU with the Web Management Console

A complete example of a PU is available on GitHub. You can download, build, and deploy this example. Here is how you deploy a PU with the Web Management Console.

(Screenshots: Deploy PU; Applications deployed; Data Grid; Statistics)

Failover and Scalability

One of the non-functional requirements for our online payment system is that it is highly available and can handle a large number of concurrent transactions. This can be accomplished in a couple of ways: we can deploy the PU with multiple concurrent threads, and/or deploy multiple PU instances on top of the grid.

Multi-Threaded PU

By default, the PU is single-threaded. With a simple annotation, you can tell the data grid how many threads the PU should run with.

@EventDriven
@Polling(concurrentConsumers = 3, maxConcurrentConsumers = 10)
@NotifyType(write = true)
public class PaymentProcessor {
}
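The effect of concurrentConsumers can be pictured with a plain-Java analogy (this sketch is not the GigaSpaces API; the class and method names are hypothetical): several threads take events from one source and process them in parallel.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Rough analogy for concurrentConsumers: a fixed pool of consumer
// threads drains a shared event queue in parallel.
class ConcurrentConsumersSketch {

    static int processAll(int eventCount, int concurrentConsumers) throws Exception {
        BlockingQueue<String> events = new LinkedBlockingQueue<>();
        for (int i = 0; i < eventCount; i++) events.add("payment-" + i);

        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(concurrentConsumers);
        for (int i = 0; i < concurrentConsumers; i++) {
            pool.submit(() -> {
                // each consumer repeatedly takes the next available event
                while (events.poll() != null) {
                    processed.incrementAndGet(); // stand-in for the listener's work
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws Exception {
        // like @Polling(concurrentConsumers = 3): three consumers share the load
        System.out.println(processAll(9, 3));
    }
}
```

Every event is still processed exactly once; the threads only divide the work, which is why raising concurrentConsumers increases throughput without duplicating events.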

Multiple PUs

Let's assume that we have two machines available for our deployment. We want to deploy four instances of our PU, two on each machine.

The deployment script for this scenario looks like this:


With a stateful PU (embedded space):
./gs.sh deploy -cluster schema=partitioned total_members=4,0 -max-instances-per-machine 2 eventPU.jar

With a stateless PU:
./gs.sh deploy -cluster total_members=4 -max-instances-per-machine 2 eventPU.jar

For more information, see the Deploy Command Line Interface topic in the Administration guide.