Smart DIH

Smart DIH is an operational data hub designed to address the IT challenges of supporting modern digital applications over heterogeneous, mostly legacy data architectures. A Digital Integration Hub (DIH) is an application architecture that decouples digital applications from the systems of record and aggregates operational data into a low-latency data fabric. By creating an event-driven, high-performance, and highly available replica of the data from multiple systems and applications, Smart DIH allows enterprises to develop and deploy digital services in an agile manner, without disturbing core business applications. Common use cases for Smart DIH include strategic initiatives such as integration data hub, digital innovation over legacy systems, API scaling, cloud migration, and business 360.


The Smart DIH conceptual architecture can be illustrated as follows:


The Data Integration (DI) module enables the event-driven replication of data from source systems to Smart DIH. It handles data integration tasks such as ingesting data in batches or streaming data changes in real time from various sources and systems of record (SoR). Built-in Change Data Capture (CDC) connectors stream data from common databases through no-code data pipelines. CDC is primarily used for data that is frequently updated, such as user transactions. Beyond providing connectivity, the data pipelines are designed to support metadata change and recovery scenarios in a way that reduces or even eliminates service downtime.
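
As a conceptual illustration of event-driven replication, the following minimal sketch applies a stream of change events to an in-memory replica. It is not the Smart DIH pipeline or its API: the `ChangeEvent` format, the `Replica` class, and the `customers` table are all hypothetical names introduced here, and real CDC connectors define their own event formats.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ChangeEvent:
    """Hypothetical CDC event; not the format of any real connector."""
    op: str                               # "insert", "update", or "delete"
    table: str                            # name of the source table
    key: Any                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image (None for deletes)


class Replica:
    """In-memory stand-in for the replicated data: table -> key -> row."""

    def __init__(self) -> None:
        self.tables: Dict[str, Dict[Any, Dict[str, Any]]] = {}

    def apply(self, event: ChangeEvent) -> None:
        rows = self.tables.setdefault(event.table, {})
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError("insert/update events need a row image")
            rows[event.key] = dict(event.row)  # upsert the latest row image
        elif event.op == "delete":
            rows.pop(event.key, None)          # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {event.op}")


# Replay a short stream of changes captured from a hypothetical source table.
replica = Replica()
events = [
    ChangeEvent("insert", "customers", 1, {"id": 1, "name": "Ada"}),
    ChangeEvent("insert", "customers", 2, {"id": 2, "name": "Bob"}),
    ChangeEvent("update", "customers", 1, {"id": 1, "name": "Ada L."}),
    ChangeEvent("delete", "customers", 2),
]
for event in events:
    replica.apply(event)
```

Because inserts and updates are applied as upserts and deletes tolerate missing keys, replaying the same events after a failure leaves the replica in the same state, which is one simple way to think about recovery in such a pipeline.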


The replicated data is hosted on a highly available, distributed in-memory data grid (IMDG), making the data highly available to the consuming applications regardless of the availability of any of the source systems. An IMDG, also known as an Enterprise Data Grid (EDG), is a set of Space instances that are connected to each other to form a Space cluster. An IMDG instance used for hosting the data is called a Space. The Space is the logical cache that holds data objects in memory, and optionally in tiered storage. It hosts data from multiple SoRs, consolidated as a unified data model. A Space is an object store with rich additional functionality, yet for simplicity it can be thought of as the equivalent of a database schema. As such, it stores the data in Object Types, also known as Space Types. A Space Type is the Space equivalent of a database table. By default, a Space is deployed fully in memory. For cost considerations, enterprises may take advantage of the Space’s tiered storage configuration, which persists data to SSD while caching only the business-critical data in memory. It is important to note that the logic of caching to memory in a tiered storage configuration is consistent, as it is based on business rules.
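
The rule-based caching idea can be sketched as follows. This is a conceptual model only, not the Space's tiered storage implementation or configuration syntax: the `TieredStore` class, the `is_hot` rule, and the `status` field are hypothetical, and a plain dictionary stands in for the SSD tier.

```python
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]


class TieredStore:
    """Toy two-tier store: every record is persisted, only 'hot' ones are cached."""

    def __init__(self, is_hot: Callable[[Record], bool]) -> None:
        self.is_hot = is_hot                  # business rule selecting the memory tier
        self.memory: Dict[Any, Record] = {}   # in-memory tier
        self.disk: Dict[Any, Record] = {}     # stand-in for the SSD tier

    def write(self, key: Any, record: Record) -> None:
        self.disk[key] = dict(record)         # all data is persisted
        if self.is_hot(record):
            self.memory[key] = dict(record)   # rule matches: also cache in memory
        else:
            self.memory.pop(key, None)        # rule no longer matches: drop from cache

    def read(self, key: Any) -> Optional[Record]:
        if key in self.memory:                # served from memory when cached
            return self.memory[key]
        return self.disk.get(key)             # otherwise served from the disk tier


# Hypothetical business rule: only open orders are business-critical.
store = TieredStore(is_hot=lambda record: record["status"] == "open")
store.write(1, {"id": 1, "status": "open"})
store.write(2, {"id": 2, "status": "closed"})
```

In this sketch, what sits in memory is decided by the rule applied to each record's content, so the contents of the memory tier can be predicted from the data alone.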


Data is made available to consuming applications via data access services that expose APIs, events, or both. Smart DIH provides low-code tooling for creating and deploying such services, including tools to create an SQL-based service. SQL is also the vehicle through which external SQL tools such as BI, ETL (Extract, Transform, Load), and database applications can interact with Smart DIH, via its clientless JDBC / ODBC connector.
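
To illustrate what an SQL-based data access service amounts to, here is a minimal sketch that wraps one parameterized query in a function. It uses Python's built-in `sqlite3` module purely as a stand-in for the data store. It does not use the Smart DIH JDBC / ODBC connector or its low-code tooling, and the `orders` table and the function names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def open_stand_in_store() -> sqlite3.Connection:
    """Create an in-memory SQLite database standing in for the replicated data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, 10, 40.0), (3, 11, 15.5)],
    )
    return conn


def orders_for_customer(conn: sqlite3.Connection, customer_id: int) -> List[Tuple[int, float]]:
    """Hypothetical data access service: one parameterized SQL query behind a function."""
    cursor = conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return cursor.fetchall()


conn = open_stand_in_store()
result = orders_for_customer(conn, 10)
```

The consuming application calls the service function and never queries the source systems directly. The query is parameterized rather than built by string concatenation, which is good practice for any SQL-based service.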

Regardless of the method chosen to access the data, serving all of it from Smart DIH simplifies the service development process. In practice, data of different domains may be replicated to the same Space, or multiple Spaces may be used, each as a replica of a different data domain. The latter configuration is mostly used when the goal is to keep data ownership and control at the line-of-business (LoB) level (e.g. in accordance with Kappa architecture and the Data Mesh principles), even when centralizing the data for common usage.


Smart DIH may be deployed on-prem or in the cloud. Hybrid on-prem / cloud deployment topologies are common when at least some of the source systems run on-premises. As a cloud-native and cloud-agnostic technology, Smart DIH can be deployed in any combination of on-prem, cloud, and multi-cloud distribution, while its WAN Gateway module keeps its instances fully synchronized.


For more information about Smart DIH, visit our Architectural Overview pages.

For installation instructions, refer to our Smart DIH Kubernetes Installation guide.