Pipeline Setup Via SpaceDeck

The Data Integration (DI) layer is a vital part of the Digital Integration Hub (DIH) platform. It is responsible for a wide range of data integration tasks, such as ingesting data in batches or streaming data changes in real time from various sources and systems of record (SoR). The data then resides in the In-Memory Data Grid (IMDG), or Space, of the GigaSpaces Smart DIH platform.

Setting up a DI pipeline requires the configuration of several components in the following order:

  1. Prepare the Kubernetes cluster

  2. Create the Space (the logical in-memory cache where GigaSpaces data is stored)

  3. Create the data source

  4. Create and start the pipeline

Step 1: Prepare the Kubernetes Cluster

To install a Kubernetes cluster, follow the procedure explained in the Smart DIH Kubernetes Installation page.

Step 2: Create a Space using SpaceDeck

To create a Space, follow the procedure explained in the SpaceDeck – Spaces – Adding a Space page.

Step 3: Create a Data Source using SpaceDeck

To create a new data source, follow the procedure explained in the SpaceDeck – Data Sources page, using the following attributes:

  • Data Source Name: ORACLE (name of your choice)

  • Data Source Type: select ORACLE from the dropdown menu

  • URL: iidr://di-oracledb:11001 (iidr://<hostname of oracle agent>:<port>)

  • Username: system

  • Password: admin11
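The URL attribute follows the pattern iidr://<hostname of oracle agent>:<port>. As a minimal sketch, the following Python snippet checks a URL against that pattern before it is entered in SpaceDeck. The function name parse_iidr_url is hypothetical and is not part of SpaceDeck or any GigaSpaces API; it relies only on the Python standard library.

```python
from urllib.parse import urlsplit


def parse_iidr_url(url: str) -> tuple[str, int]:
    """Return (hostname, port) for a URL of the form iidr://<hostname>:<port>.

    Hypothetical helper for illustration; not part of any GigaSpaces API.
    """
    parts = urlsplit(url)
    if parts.scheme != "iidr":
        raise ValueError(f"expected scheme 'iidr', got {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing hostname of the Oracle agent")
    if parts.port is None:
        raise ValueError("missing port")
    return parts.hostname, parts.port


if __name__ == "__main__":
    # Example value taken from the attribute list above.
    host, port = parse_iidr_url("iidr://di-oracledb:11001")
    print(host, port)
```

A check like this catches a missing port or a wrong scheme early, before the data source is saved.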

Step 4: Create a Pipeline using SpaceDeck

  1. To create a Pipeline in SpaceDeck, follow the procedure explained in the SpaceDeck – Data Pipeline – Create New Pipeline page, using the following attributes in the Pipeline Configuration screen:

  • Pipeline name: demo1

  • Space name: demo

  • Data Source Connection: ORACLE

2. Create the Pipeline

3. Add a new table to the demo1 Pipeline

  • After clicking Select Tables, select the RETAIL_DEMO schema.

  • Select the CUSTOMERS table

4. Start the Pipeline, selecting Point in time: EARLIEST

5. Verify that all the data from ORACLE appears in the Space:

  1. Add a new row to the CUSTOMERS table (for example, via DBeaver)

  2. In the SpaceDeck Data Query screen (see SpaceDeck – Data Query), type the query "SELECT count(*) FROM CUSTOMERS" and click "Run Query".

  3. Verify that all the data from Oracle is displayed.
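Because changes are streamed from the source, a newly added row may not be visible in the Space the instant it is committed in Oracle. The sketch below shows one way to automate the check: poll a row count until it reaches the expected value or a timeout expires. Both wait_for_count and the fetch_count callable are hypothetical names introduced here for illustration. How the count is actually obtained (for example, by running the query above through a client of your choice) is left to the caller, and the demo uses a stand-in sequence rather than a real query.

```python
import time
from typing import Callable


def wait_for_count(
    fetch_count: Callable[[], int],
    expected: int,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll fetch_count() until it returns at least `expected` or time runs out.

    Hypothetical helper; fetch_count is supplied by the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_count() >= expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real query: pretends the new row shows up on the third poll.
    observed = iter([100, 100, 101])
    print(wait_for_count(lambda: next(observed), expected=101, interval_s=0.01))
```

Using a monotonic clock for the deadline keeps the timeout unaffected by system clock adjustments during the wait.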