SpaceDeck – Data Pipeline – Create New Pipeline

Data Pipelines provide a convenient, no-code way to pipe data from the System of Record to the GigaSpaces in-memory data grid.

A new data pipeline definition includes the System of Record databases, tables and fields that will provide data to the pipeline. The definition also specifies the in-memory Space that will receive the pipeline data.

Additional information includes optional validation rules and automatic conversion of specified field definitions.

 

Display the Configuration screen

From the Data Pipeline Status screen, press New + to begin the pipeline definition process.

Pipeline Configuration Screen

The Pipeline Configuration screen appears as follows:

Basic Pipeline Information

You can fill in some or all of the pipeline configuration items (shown below) from a JSON-format configuration file by clicking the Load Configuration button.
The configuration file may contain some or all of the required details. After the details are loaded from the configuration file, you can edit them before saving.

  • Pipeline Name – name assigned to the pipeline

  • Space Name – the name of the GigaSpaces Space object that will receive the pipeline data

  • Connector Type – the data connector type, for example, IIDR

  • Connector Setting:

    • Data Source Connection – the URL of the data source in the System of Record, pointing to a database such as DB2

    • CDC Kafka Topic – the name of the Kafka topic for CDC changes

    • SYNC Kafka Topic – the name of the Kafka topic for initial load changes

  • Advanced Setting:

    • Batch Write – the size of a single batch write from the data integration (DI) layer to the Space, specified as a number of commands

    • Checkpoint Interval – the interval, in milliseconds, at which the data integration layer commits to Kafka and flushes to the Space
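
As an illustration of the kind of file the Load Configuration button could read, the following Python sketch collects the items listed above into a JSON document. The key names (pipelineName, connectorSettings and so on), the example values, the DB2 connection URL and the output file name are all assumptions made for this example; they are not taken from the SpaceDeck configuration schema, so consult the product reference for the actual format.

```python
import json

# Hypothetical key names and values: the real SpaceDeck schema may differ.
pipeline_config = {
    "pipelineName": "orders-pipeline",
    "spaceName": "demo-space",
    "connectorType": "IIDR",
    "connectorSettings": {
        # Assumed example of a DB2 connection URL.
        "dataSourceConnection": "jdbc:db2://db2-host:50000/SAMPLE",
        "cdcKafkaTopic": "orders-cdc",
        "syncKafkaTopic": "orders-sync",
    },
    "advancedSettings": {
        "batchWrite": 1000,          # number of commands per batch write
        "checkpointInterval": 5000,  # milliseconds
    },
}


def to_json(config: dict) -> str:
    """Serialize the configuration as indented JSON text."""
    return json.dumps(config, indent=2)


def save_config(config: dict, path: str) -> None:
    """Write the configuration to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_json(config))


if __name__ == "__main__":
    # Hypothetical file name.
    save_config(pipeline_config, "pipeline-config.json")
```

Because a configuration file may contain only some of the details, a file of this kind could omit any of the keys and leave the remaining items to be filled in on the screen.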

Press Create Pipeline to create the new data pipeline.

Select Tables for the Pipeline

You may then press Select Tables to choose which tables to include in the pipeline. Press Add to add the selected tables to the pipeline.

Edit the Pipeline

Edit Pipeline – Parameters Tab

This tab will be supported in a future release.

 

Edit Pipeline – Fields Tab

In the Pipeline fields section of this screen, the Field (column) names initially match the names of the database table fields that are included in the data pipeline. You can edit them to provide different property names (column names) in the GigaSpaces object type (table).
Other fields in this screen will be editable in a future release.
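
Conceptually, editing a field name on this tab defines a mapping from a source column name to a property name in the Space object. The Python sketch below illustrates that idea only; it is not how SpaceDeck implements the renaming, and the function name, column names and property names are hypothetical.

```python
def rename_fields(row: dict, field_map: dict) -> dict:
    """Return a copy of row with source column names replaced by property names.

    Columns that are not listed in field_map keep their original name.
    """
    return {field_map.get(column, column): value for column, value in row.items()}


# Hypothetical mapping from database column names to Space property names.
field_map = {"CUST_ID": "customerId", "CUST_NM": "customerName"}

row = {"CUST_ID": 42, "CUST_NM": "Ada", "CITY": "Paris"}
renamed = rename_fields(row, field_map)
print(renamed)  # {'customerId': 42, 'customerName': 'Ada', 'CITY': 'Paris'}
```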

Starting the Data Pipeline

After you have added the tables, save the changes and press Start to start the pipeline.

The pipeline will show as Started in the Data Pipeline Status screen: