Smart DIH - Platform Components
Smart DIH is an operational data hub designed to address the IT challenges of supporting modern digital applications over heterogeneous, mostly legacy data architectures. It implements the Digital Integration Hub pattern: an application architecture that decouples digital applications from the systems of record and aggregates operational data into a low-latency data fabric. By creating an event-driven, high-performing, efficient, and available replica of the data from multiple systems and applications, Smart DIH allows enterprises to develop and deploy digital services in an agile manner, without disturbing core business applications. Strategic initiatives such as an integration data hub, digital innovation over legacy systems, API scaling, cloud migration, and business 360 are common use cases for Smart DIH.
Platform Components
At a high level, the GigaSpaces Smart DIH platform bridges between an enterprise's data sources and its high-end applications. It does so by streaming the data through these three stages (south to north):
- Data Integration
- Caching and Backup
- Data Services
Another aspect is the platform’s control and monitoring facilities. More information about this can be found in the Application Lifecycle page.
Data Integration
- System of Record Agent
This thin agent, which resides close to the System of Record (e.g., DB2), fetches raw data changes from the data source and sends them to the transformation stage.
- Change Data Capture (CDC) Technology for Real-time Events
CDC tools identify and capture changes made to data in the on-premise System of Record as they happen, and propagate those changes to downstream systems, enabling real-time data integration and synchronization.
- Apache Kafka is used as a message bus to stream data change events from the agent to the data hub. Kafka ensures reliable and scalable data streaming.
- Data Catalog
The DIH platform learns the source's data structures and creates a catalog where the source and data-grid metadata is kept.
- Data Transformation
A layer that extracts and transforms the incoming data into an effective data structure. Multiple online functions can be applied to the stream.
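The flow above — change events captured at the source, cataloged, and passed through a chain of online transformation functions — can be sketched as follows. This is an illustrative sketch only: the event shape, the `Catalog` class, and the transform names are assumptions for the example, not the Smart DIH wire format or API.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical shape of a CDC change event arriving from the
# System of Record agent (field names are illustrative).
@dataclass
class ChangeEvent:
    table: str
    op: str                 # "INSERT", "UPDATE", or "DELETE"
    row: dict[str, Any]

# A minimal metadata catalog: learns column names per source table,
# standing in for the source/data-grid metadata the platform keeps.
class Catalog:
    def __init__(self) -> None:
        self.tables: dict[str, set[str]] = {}

    def learn(self, event: ChangeEvent) -> None:
        self.tables.setdefault(event.table, set()).update(event.row)

# Transformation layer: a chain of online functions applied to each event.
Transform = Callable[[ChangeEvent], ChangeEvent]

def apply_transforms(event: ChangeEvent, transforms: list[Transform]) -> ChangeEvent:
    for t in transforms:
        event = t(event)
    return event

# Example transform: normalize customer names to upper case.
def upper_name(event: ChangeEvent) -> ChangeEvent:
    if "name" in event.row:
        event.row["name"] = event.row["name"].upper()
    return event

catalog = Catalog()
event = ChangeEvent(table="customers", op="INSERT", row={"id": 1, "name": "ada"})
catalog.learn(event)
transformed = apply_transforms(event, [upper_name])
```

In the real platform, events of this kind arrive over the Kafka message bus and the transformed records land in the data grid; the sketch only shows the catalog-and-transform step in isolation.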
Caching and Backup
- GigaSpaces Data Hub
An in-memory data grid (IMDG) for high-performance caching: a high-throughput, low-latency data fabric that minimizes access to slower disk-based storage. It provides fast access to frequently accessed data and enhances the system's overall responsiveness.
- Through Intelligent Data Tiering, data can extend beyond the data grid. The data grid oversees the persistence of data using a rule-based process.
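A rule-based tiering decision of the kind described above can be sketched as a set of predicates that decide whether an entry stays hot in memory or is tiered to persistent storage. The rule criteria and names here are illustrative assumptions, not the actual Smart DIH rule syntax.

```python
from typing import Any, Callable

# A tiering rule is a predicate over an entry; criteria are illustrative.
Rule = Callable[[dict[str, Any]], bool]

def tier_of(entry: dict[str, Any], hot_rules: list[Rule]) -> str:
    """Return "memory" if any hot rule matches, else "disk"."""
    return "memory" if any(rule(entry) for rule in hot_rules) else "disk"

# Example rules: keep recent orders and VIP customers in the data grid.
hot_rules: list[Rule] = [
    lambda e: e.get("type") == "order" and e.get("age_days", 999) <= 30,
    lambda e: e.get("vip", False),
]

recent_order = {"type": "order", "age_days": 3}
old_order = {"type": "order", "age_days": 400}
```

The point of the rule-based approach is that hot-data criteria are declarative and can be changed without touching application code; the grid applies them continuously as data ages.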
Data Services
- Micro-Services
Using Smart DIH, micro-services or serverless functions can be created by consuming data from the data grid. These services encapsulate specific business logic, providing a scalable and modular architecture. Refer to SpaceDeck – Services to see how micro-services can be created through SpaceDeck.
- API Gateway
API gateway integration (e.g., AWS API Gateway, Azure API Management) manages and exposes APIs for data access and integration. The API gateway can enforce security policies, rate limiting, and authentication.
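Rate limiting of the kind a gateway enforces is commonly implemented as a token bucket: each client accrues tokens at a fixed rate up to a burst capacity, and a request is admitted only if a token is available. The sketch below is a generic illustration, not the configuration model of any specific gateway product.

```python
import time

# Minimal per-client token-bucket rate limiter (illustrative sketch).
class TokenBucket:
    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill tokens for the time elapsed since the last check.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Allow bursts of 2 requests, refilling at 5 requests/second.
bucket = TokenBucket(rate=5.0, capacity=2)
results = [bucket.allow() for _ in range(3)]  # 3 back-to-back requests
```

With a capacity of 2, the third back-to-back request is rejected until tokens refill, which is exactly the burst-smoothing behavior gateways apply in front of the data services.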
Extensions
The following enhancements can be tailored to the needs of our customers:
- With the mirror customization functionality, a cloud-based data store (e.g., Amazon RDS, Azure Database for PostgreSQL) can be used to replicate data persistently. This database serves as a replica of the on-premise System of Record.
- Event-Driven Application Push
GigaSpaces data-grid technology can notify the application in real time when there is any change to the data. To that end, the customer can add a program with the desired behavior and destination using special processing units. A processing unit (PU) is the unit of packaging and deployment in the GigaSpaces data grid, and is essentially the main GigaSpaces service; it is typically deployed onto the Service Grid, and when deployed, a processing unit instance is the actual runtime entity.
- Network Data Sync
Apache Kafka can also serve as the message bus for synchronizing data change events across network boundaries, ensuring reliable and scalable data streaming.
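The event-driven application push described above follows a publish/subscribe pattern: applications register a listener, and the store pushes every data change to them as it happens. The sketch below illustrates that pattern only; the `NotifyingStore` class and its methods are assumptions for the example, not the GigaSpaces notification API.

```python
from typing import Any, Callable

# Illustrative store that pushes change notifications to subscribers,
# analogous to the real-time notifications a processing unit can deliver.
class NotifyingStore:
    def __init__(self) -> None:
        self.data: dict[str, Any] = {}
        self.listeners: list[Callable[[str, Any], None]] = []

    def subscribe(self, listener: Callable[[str, Any], None]) -> None:
        self.listeners.append(listener)

    def write(self, key: str, value: Any) -> None:
        self.data[key] = value
        for listener in self.listeners:
            listener(key, value)   # push the change in real time

store = NotifyingStore()
received: list[tuple[str, Any]] = []
store.subscribe(lambda k, v: received.append((k, v)))
store.write("order-1", {"status": "shipped"})
```

Because the store pushes changes rather than waiting to be polled, downstream applications see updates with minimal latency, which is what makes event-driven push attractive for digital services built on the hub.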
For more information about Smart DIH, refer back to the Smart DIH contents page and choose another topic.