Deploying a Processing Unit in Kubernetes

A Processing Unit is a container that can hold any of the following:

  • Data only (a Space)
  • Function only (business logic)
  • Both data and a function

You can use the event-processing example available with the XAP and InsightEdge software packages to see how data is fed to the function and processed in Processing Units. The example creates the following modules:

  • Processor - a Processing Unit with the main task of processing unprocessed data objects. The processing of data objects is accomplished using both an event container and remoting.
  • Feeder - a Processing Unit that contains two feeders, a standard Space feeder and a JMS feeder, to feed unprocessed data objects that are in turn processed by the processor module. The standard Space feeder feeds unprocessed data objects by both directly writing them to the Space and using OpenSpaces Remoting. The JMS feeder uses the JMS API to feed unprocessed data objects using a MessageConverter, which converts JMS ObjectMessages into data objects.

As a prerequisite for running this example, you must install Maven on the machine where you unpacked the GigaSpaces software package.

To build and deploy the event-processing example in Kubernetes, the following steps are required:

  1. Build the sample Processing Units from the GigaSpaces software package.
  2. Upload the Processing Unit files for deployment.
  3. Deploy a Platform Manager (Management Pod).
  4. Deploy the Processing Units that were created when you built the example to Data Pods in Kubernetes, connecting them to the Management Pod.
  5. View the processor logs to see the data processing results.

Building the Processing Unit Example

The first step in deploying the sample Processing Units to Kubernetes is to build them from the examples directory. The example uses Maven as its build tool, and comes with a build script that runs Maven automatically.

Open a command window and navigate to the following folder in the XAP or InsightEdge package:

cd <product home>/examples/data-app/event-processing/

Type the following command (for Unix environments) to build the processor and feeder Processing Units:

./build.sh package

This build script finalizes the Processing Unit structure of both the processor and the feeder, and copies the processor JAR file to /examples/data-app/event-processing/processor/target/data-processor/lib, making /examples/data-app/event-processing/processor/target/data-processor/ a ready-to-use Processing Unit directory. The final result is two Processing Unit JAR files, one under processor/target and another under feeder/target.
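You can confirm the build output before moving on by listing the generated JAR files from the event-processing directory (the file names below match the ones referenced by the Helm commands later in this topic):

ls processor/target/data-processor.jar
ls feeder/target/data-feeder.jar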

Uploading the Processing Unit Files

In order to deploy the Processing Units on Kubernetes, a URL must be provided. You can use an existing HTTP server (for example, a local HTTP server using Helm), or you can use the GigaSpaces CLI (or REST API) to upload the Processing Unit files to the Manager Pod.

Ensure that your Kubernetes environment has access to the URL that you provide.
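For example, if you choose the HTTP server option, one minimal way to serve the built example files is Python's built-in HTTP server; this assumes Python 3 is installed on the machine and is only one of several possible servers. Serving the product home on port 8877 makes the example Processing Units reachable at URLs like the ones used in the Helm commands later in this topic:

cd <product home>
python3 -m http.server 8877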

Use one of the following options to upload the Processing Unit files for deployment.

The upload stage does not provide high availability. The Processing Unit files are uploaded only to the active Manager Pod, and are not replicated to other managers. High availability only takes effect after the Processing Unit has been deployed.

Command:

<XAP-HOME>/bin/gs.sh pu upload
<XAP-HOME>\bin\gs pu upload

Description:

Upload a Processing Unit to the service grid.

Parameters and Options:

Item       Name        Description
Parameter  file        Path to the Processing Unit file (.jar or .zip).
Option     --url-only  Return only the Processing Unit URL after uploading.

Input Example:

This example uploads a Processing Unit file named mypu.jar.

<XAP-HOME>/bin/gs.sh pu upload mypu.jar
<XAP-HOME>\bin\gs pu upload mypu.jar
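If you intend to pass the upload location directly to the Helm chart's resourceUrl value, the --url-only option described above is convenient because it prints only the URL of the uploaded file. A sketch of such a call (the option placement shown here is illustrative):

<XAP-HOME>/bin/gs.sh pu upload --url-only mypu.jar
<XAP-HOME>\bin\gs pu upload --url-only mypu.jar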

Path:

PUT /pus/resources

Description:

Upload a Processing Unit to the service grid.

Example:

curl -X PUT --header 'Content-Type: multipart/form-data' --header 'Accept: text/plain' -F 'file=@mypu.jar' 'http://localhost:8090/v2/pus/resources'

If you are serving the files from a local HTTP server, leave that command window open so the server remains available and Kubernetes can connect to it.

Deploying the GigaSpaces Components

Similar to deploying a Space cluster, it is best practice to first deploy the Management Pod (with the Platform Manager), and then deploy the Data Pods (first the processor, then the feeder).

Open a new command window and navigate to the charts directory (where you fetched the charts from our repo).

As was done for the Space demo, type the following Helm command to deploy a Management Pod called testmanager:

helm install insightedge-manager --name testmanager 

Next, type the following Helm command to deploy a Data Pod with the processor Processing Unit from the location where it was built in the examples directory:

helm install insightedge-pu --name processor --set manager.name=testmanager,resourceUrl=http://192.168.33.16:8877/examples/data-app/event-processing/processor/target/data-processor.jar

Lastly, type the following Helm command to deploy a Data Pod with the feeder Processing Unit from the same directory:

helm install insightedge-pu --name feeder --set manager.name=testmanager,resourceUrl=http://192.168.33.16:8877/examples/data-app/event-processing/feeder/target/data-feeder.jar
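At this point you can verify that all three releases were deployed. For example (the release names match the ones used above; the pod names that kubectl reports are generated and will differ in your environment):

helm list
kubectl get pods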

Monitoring the Processing Units

You can use one of the Kubernetes tools to view the logs for the processor Data Pod, where you can see that the sample data has been processed.
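For example, with kubectl you can stream the log of the processor Data Pod; substitute the actual pod name reported by kubectl get pods, because the name shown here is only a placeholder:

kubectl logs <processor pod name>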

Configuring the Container Memory Allocation

The Docker container is always allocated an absolute amount of memory. If this is undefined in the Helm chart, the container uses as much memory as is necessary to accommodate the data and processes it contains. You can limit the memory allocation for the contents of the Docker container (Data Pod, Manager Pod, processes, etc.) and the heap memory.

The on-heap memory allocation can be defined as any of the following:

  • A positive absolute value for the heap memory.
  • A negative absolute value for the heap memory, calculating the heap size as ([total allocated container resources] - [X MiB]).
  • A percentage of the Docker container.

The following Helm command allocates the amount of memory for both the Docker container and for the on-heap memory as an absolute value:

helm install insightedge --name test --set pu.resources.limits.memory=512Mi,pu.java.heap=256m

The following Helm command allocates memory for the Docker container, and sets aside a specific amount of memory (150MB in this example) for the container's own use. The rest of the memory is available for the Java heap.

helm install insightedge --name test --set pu.resources.limits.memory=512Mi,pu.java.heap=limit-150m

You can define the maximum size of the Docker container as an absolute value, and the maximum on-heap memory allocation for the Java process running inside the Docker container as a percentage. If you use this approach, make sure you leave enough memory for the Java VM's non-heap requirements.

The following Helm command sets an absolute value for the Docker container, and defines the maximum Java on-heap memory as a percentage of the container memory:

helm install insightedge --name test --set pu.resources.limits.memory=256Mi,pu.java.heap=75%

Configuring the Data Grid Using the Helm Chart

Default Helm Chart

The InsightEdge Helm chart has a list of supported values that can be configured. To view this list, use the following Helm command:

helm inspect insightedge

The values.yaml file is printed in the command window, and each configurable value has a short explanation above it. The indentation in this printout indicates the use of a '.' (dot) in the value name. For example, the high availability property for the Platform Manager is listed as follows in the file:

manager:
  ha: false

When you set this value in the command window, it looks like this: manager.ha=true
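For example, a hypothetical install of the insightedge chart with Platform Manager high availability enabled would look like this:

helm install insightedge --name test --set manager.ha=true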

Customizing a Helm Chart

You can create additional values.yaml files with customized values.

The following Helm command shows how a custom YAML file can be used to override the values in the original GigaSpaces Helm chart:

helm install insightedge -f customValues.yaml --name hello
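For illustration, a customValues.yaml that enables Platform Manager high availability and caps the Data Pod container memory, using the same value names shown elsewhere in this topic, might look like this:

manager:
  ha: true
pu:
  resources:
    limits:
      memory: 512Mi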

Overriding the Processing Unit Properties

It is recommended to define the Processing Unit properties in the pu.xml as placeholders (as described in the Processing Unit Deployment Properties topic), so you can override these properties using the Helm chart.

After defining the properties as placeholders, use the key1=value1;key2=value2 format to pass the override values to the Helm chart, either with the --set insightedge-pu.properties=<your key-value pairs> option or with a custom YAML file.
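For example, assuming the pu.xml contains placeholders named key1 and key2, the override values could be passed like this (the value is quoted so that the semicolon is not interpreted by the shell):

helm install insightedge --name test --set insightedge-pu.properties="key1=value1;key2=value2"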

Configuring the MemoryXtend Properties

The Kubernetes environment supports using MemoryXtend for off-heap RAM and MemoryXtend for Disk (SSD).

MemoryXtend for Off-Heap RAM

To configure your Kubernetes-based environment, you need to make sure that the container memory allocation is sufficient to accommodate the overall RAM requirements. Additionally, you should define the memory threshold properties as placeholders in the pu.xml file. For more information about the MemoryXtend Off-Heap RAM driver, see MemoryXtend for Off-Heap RAM.
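For example, you could raise the Data Pod container memory limit using the same values shown earlier in this topic, so that the off-heap store has room alongside the heap (the figures below are purely illustrative):

helm install insightedge --name test --set pu.resources.limits.memory=2Gi,pu.java.heap=512m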

MemoryXtend for Disk (SSD)

To configure your Kubernetes-based environment to use external storage, you need to enable persistent volume storage in both the Processing Unit pu.xml and the pu Helm chart. This is described in detail in MemoryXtend for Disk (SSD/HDD).

For information about the Kubernetes persistent volume storage model, refer to the Kubernetes documentation.