Developing Your First Application


This topic explains how to create an InsightEdge application that reads from and writes to the Data Grid. It assumes a basic knowledge of Apache Spark.

See also:

For instructions on how to install a minimum InsightEdge cluster setup and launch it, refer to Starting InsightEdge.

Project Dependencies

InsightEdge 12.2 runs on Spark 2.2.0 and Scala 2.11.8. These dependencies will be included when you depend on the InsightEdge artifacts.

InsightEdge .jars are not yet published to the Maven Central Repository. To install the Maven artifacts, run the following command from the /insightedge/tools/maven directory:

insightedge-maven

For SBT projects, include the following:

resolvers += Resolver.mavenLocal
resolvers += "Openspaces Maven Repository" at "http://maven-repository.openspaces.org"

libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-core" % "12.2.0" % "provided" exclude("javax.jms", "jms")

And if you are building with Maven:

<dependency>
    <groupId>org.gigaspaces.insightedge</groupId>
    <artifactId>insightedge-core</artifactId>
    <version>12.2.0</version>
    <scope>provided</scope>
</dependency>

Info

InsightEdge .jars are already packed in the InsightEdge distribution, and are automatically loaded with your application if you submit it with the insightedge-submit script or run it from the Web Notebook. As such, there is no need to pack them into your uber .jar. However, if you want to run Spark in local[*] mode, the dependencies should be declared with the compile scope.
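For local runs, the SBT declaration shown earlier could be adapted roughly as follows. This is only a sketch: it reuses the coordinates and version from this page and drops the "provided" qualifier, so SBT falls back to its default compile scope. Treat it as a starting point rather than a verified build file.

```scala
// build.sbt fragment (sketch) for running Spark in local[*] mode.
// Coordinates and version are copied from this page.
resolvers += Resolver.mavenLocal

// No "provided" qualifier: SBT then uses its default compile scope,
// so the InsightEdge classes are on the classpath at run time.
libraryDependencies += "org.gigaspaces.insightedge" % "insightedge-core" % "12.2.0" exclude("javax.jms", "jms")
```

With Maven, the equivalent change would be to remove the scope element (or set it to compile) in the dependency declaration.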

Developing a Spark Application

InsightEdge provides an extension to the regular Spark API.

See also:

Read the Self-Contained Applications topic in the Apache Spark documentation if you are new to Spark.

InsightEdgeConfig is the starting point in connecting Spark with the Data Grid. Create the InsightEdgeConfig and the SparkContext:

import org.apache.spark.{SparkConf, SparkContext}
import org.insightedge.spark.context.InsightEdgeConfig
import org.insightedge.spark.implicits.all._

val sparkConf = new SparkConf()
    .setAppName("sample-app")
    .setMaster("spark://127.0.0.1:7077")
    .setInsightEdgeConfig(InsightEdgeConfig("insightedge-space"))
val sc = new SparkContext(sparkConf)

Info

It is important to import org.insightedge.spark.implicits.all._ to enable the Data Grid specific API.

insightedge-space is the default Data Grid name that the demo mode starts automatically.

When you are running Spark applications from the Web Notebook, InsightEdgeConfig is created implicitly with the properties defined in the Spark interpreter.

Modeling Data Grid Objects

Create a case class Product in the file Product.scala to represent a Product entity in the Data Grid:

import org.insightedge.scala.annotation._
import scala.beans.{BeanProperty, BooleanBeanProperty}

case class Product(
    @BeanProperty @SpaceId var id: Long,
    @BeanProperty var description: String,
    @BeanProperty var quantity: Int,
    @BooleanBeanProperty var featuredProduct: Boolean
) {
    // No-argument constructor with placeholder defaults
    def this() = this(-1, null, -1, false)
}

Saving to the Data Grid

To save a Spark RDD, use the saveToGrid method.

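import scala.util.Random

// Generate 1000 sample products with random quantity and featured flag
val rdd = sc.parallelize(1 to 1000).map { i =>
    Product(i, "Description of product " + i, Random.nextInt(10), Random.nextBoolean())
}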
rdd.saveToGrid()

Loading and Analyzing Data from the Data Grid

Use the gridRdd method of the SparkContext to view Data Grid objects as Spark RDDs.

val gridRdd = sc.gridRdd[Product]()
println("total products quantity: " + gridRdd.map(_.quantity).sum())
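Because the grid-backed RDD is used like any other Spark RDD, further analysis can rely on the standard RDD operations. The sketch below assumes the Product class and the InsightEdge implicits shown earlier on this page; the object name ProductAnalysis and the method name summarize are hypothetical and exist only for illustration.

```scala
import org.apache.spark.SparkContext
import org.insightedge.spark.implicits.all._

// Hypothetical helper, not part of the InsightEdge API.
object ProductAnalysis {
  def summarize(sc: SparkContext): Unit = {
    // View the Product objects stored in the Data Grid as an RDD
    val gridRdd = sc.gridRdd[Product]()

    // Count the products flagged as featured
    val featuredCount = gridRdd.filter(_.featuredProduct).count()
    println("featured products: " + featuredCount)

    // Sum the quantity separately for featured and non-featured products
    val quantityByFlag = gridRdd
      .map(p => (p.featuredProduct, p.quantity))
      .reduceByKey(_ + _)
      .collect()
    quantityByFlag.foreach { case (featured, qty) =>
      println("featured=" + featured + " total quantity: " + qty)
    }
  }
}
```

The filter and aggregation here are ordinary Spark transformations applied to the RDD returned by gridRdd.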

Closing the Spark Context

When you are done, close the Spark context and all connections to the Data Grid with the following command:

sc.stopInsightEdgeContext()

Under the hood, this calls the regular Spark sc.stop() method, so there is no need to call it yourself.

Running your Spark Application

After you have packaged your application as a .jar, submit the Spark job with insightedge-submit (located in <XAP Home>/insightedge/bin) instead of spark-submit, as follows:

insightedge-submit --class com.insightedge.spark.example.YourMainClass --master spark://127.0.0.1:7077 path/to/jar/insightedge-examples.jar
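For reference, the snippets from this topic can be assembled into a single main class along the following lines. This is a minimal sketch, not a definitive implementation: the package and object names simply mirror the placeholder in the submit command above, and it assumes the Product class from this topic is defined in the same package. The try/finally block is an addition for illustration, so that the context is closed even if the job fails.

```scala
package com.insightedge.spark.example // placeholder package, as in the submit command

import org.apache.spark.{SparkConf, SparkContext}
import org.insightedge.spark.context.InsightEdgeConfig
import org.insightedge.spark.implicits.all._

import scala.util.Random

// Assumes Product (see "Modeling Data Grid Objects") lives in this package.
object YourMainClass {
  def main(args: Array[String]): Unit = {
    // Connect Spark to the default demo Data Grid
    val sparkConf = new SparkConf()
      .setAppName("sample-app")
      .setMaster("spark://127.0.0.1:7077")
      .setInsightEdgeConfig(InsightEdgeConfig("insightedge-space"))
    val sc = new SparkContext(sparkConf)

    try {
      // Write 1000 sample products to the Data Grid
      val rdd = sc.parallelize(1 to 1000).map { i =>
        Product(i, "Description of product " + i, Random.nextInt(10), Random.nextBoolean())
      }
      rdd.saveToGrid()

      // Read them back and aggregate
      val gridRdd = sc.gridRdd[Product]()
      println("total products quantity: " + gridRdd.map(_.quantity).sum())
    } finally {
      // Closes the Data Grid connections and stops the Spark context
      sc.stopInsightEdgeContext()
    }
  }
}
```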