Geospatial API
This section gives a brief overview of Geospatial API and describes how to use it with Spark RDD and DataFrames.
For a detailed description of the Geospatial queries refer to Geospatial Queries for more info.
Shapes
Data Grid supports next shapes, all of them are located at org.openspaces.spatial.shapes
package:
Shape | Description |
---|---|
Point | A point, denoted by X and Y coordinates. |
LineString | A finite sequence of one or more consecutive line segments. |
Circle | A circle, denoted by a point and a radius. |
Rectangle | A rectangle aligned with the axis (for non-aligned rectangles use Polygon). |
Polygon | A finite sequence of consecutive line segments which denotes a bounded area. |
To create a shape, use the ShapeFactory
class, for example:
import org.openspaces.spatial.ShapeFactory
val userLocation = ShapeFactory.point(10, 10)
Queries
Geospatial API currently supports three operations: intersect
, within
and contains
.
Intersect
returns true
when the intersection between shape1
and shape2
is not empty - some or all of shape1
overlaps some or all of shape2
.
import org.insightedge.spark.implicits.all._
import org.openspaces.spatial.ShapeFactory._
// RDD
val rdd = sc.gridSql[SpatialData]("area spatial:intersects ?", Seq(circle(point(10,10), 10)))
// DataFrames
val df = sqlContext.read.grid[SpatialData]
val data = df.filter(df("area") geoIntersects circle(point(10,10), 10))
Within
returns true
when shape1
is within (contained in) shape2
, boundaries inclusive.
import org.insightedge.spark.implicits.all._
import org.openspaces.spatial.ShapeFactory._
// RDD
val rdd = sc.gridSql[SpatialData]("location spatial:within ?", Seq(circle(point(10,10), 10)))
// DataFrames
val df = sqlContext.read.grid[SpatialData]
val data = df.filter(df("location") geoWithin circle(point(10,10), 10))
Contains
returns true
when shape1
contains shape2
, boundaries inclusive.
import org.insightedge.spark.implicits.all._
import org.openspaces.spatial.ShapeFactory._
// RDD
val rdd = sc.gridSql[SpatialData]("area spatial:contains ?", Seq(point(10,10)))
// DataFrames
val df = sqlContext.read.grid[SpatialData]
val data = df.filter(df("area") geoContains point(10,10))
Indexing
You can define Geospatial index using the @SpaceSpatialIndex
and @SpaceSpatialIndexes
annotations:
import org.insightedge.scala.annotation._
import org.openspaces.spatial.shapes.Point
import scala.beans.BeanProperty
case class GasStation(
@BeanProperty @SpaceId var id: Long,
@BeanProperty @SpaceSpatialIndex var location: Point) {
def this() = this(-1, null)
}
To read more about indexing fields in Data Grid, refer to Geospatial Index.
Zeppelin Notebook
A great place to start experimenting with the Geospatial API is the Zeppelin notebook - check out the "GigaSpaces GeoSpatial' notebook.
If you're using Windows, you'll need to manually copy the jars from INSIGHTEDGE_HOME\datagrid\lib\optional\spatial
to the Spark jars folder (INSIGHTEDGE_HOME\jars
) to use this notebook. We're working of fixing that for the next release.