Hazelcast

Hazelcast is an in-memory data grid used for clustering and is a highly scalable data distribution solution. Hazelcast helps in real-time low latency conditions and with distributed memory for data and computations. The distributed in-memory storage feature of Hazelcast is helpful in accelerating transactions, and Hazelcast also helps to improve the performance of internode communication using messaging.

Applications are deployed across several servers in a cluster and communicate with a single relational database in the application architecture, as in the following figure:

Application Architecture Nodes in a cluster

The repeated or multiple database hits can be avoided by making use of the cache that is embedded in each node. If the data is already in the cache, the request to the database is skipped and the data from the cache is used. If the data isn't in the cache, a request to the database is initiated, and the fetched data is used.

The following figure shows how for increased application performance, data that does not change frequently is saved in the cache of each node:

With this approach, there can be a cache consistency issue whenever data is updated by an application that is deployed on app server1. The data in cache1 and database are consistent, but cache2, cache3, and cache4 do not have the updated data.

The inconsistency problem can be solved by using an in-memory data grid. A data grid is a collection of servers that operate together in a distributed environment to manage data and related tasks. An in-memory data grid (IMDG) is the data that is entirely stored in RAM using key-value pairs.

Most enterprises that previously avoided adopting in-memory technology because of cost of memory cards are now changing the architecture of their systems to take advantage of the low-latency transaction processing of in-memory technology. Because of the significant decrease in the price of RAM, it is now more cost-effective to load the whole operational dataset into memory, resulting in much-improved performance.

The following products are some of the more popular IMDG implementations:

Hazelcast
Infinispan
Red Hat JBoss Data Grid
Ncache

Hazelcast and in-memory computing

Hazelcast is designed for environments where the power of in-memory computing can be used to accelerate applications that require low latency, high throughput, horizontal scaling, and security. Hazelcast can run embedded in every node, but also supports a client-server topology.

Pega Platform™ low-level features such as agents, Job Scheduler, and REST service rules use Hazelcast to cache or store the data.

Hazelcast IMDG is an ideal choice for services running on a cloud platform such as Kubernetes, where service replicas can quickly scale up and down.

Pega Platform deployments with Hazelcast can be used in two modes: embedded or client server. Going forward, Client-server mode is supported for Cloud K and Client-Managed Cloud (CMC) based deployments (such as Kubernetes-based deployments). For legacy deployments, embedded deployments are still supported.

Embedded Hazelcast

You use embedded Hazelcast if your application relies on high-performance computing and running a huge amount of tasks. Every Pega node connects to all other Pega nodes and forms a cluster. This deployment option has low-latency data access. The following figure shows an application and its cache data both operating on the same node. When fresh data is written to the cache, Hazelcast distributes it to the other members of the embedded Hazelcast network. When an application attempts to read data, it first looks in the cache of the node on which it is currently running.

An embedded Hazelcast deployment still has some issues:

Cluster instability
Split-Brain

Split-Brain

Split-Brain is a state of decomposition in which a single cluster of nodes are separated into multiple clusters of nodes, each operating as if the other no longer exists. Nodes can end up in a split-brain state through a process of cluster fracturing. Hazelcast automatically detects these situations and attempts to heal the cluster. When Hazelcast is unable to automatically recover a cluster, a split-brain merge failure message is reported in the logs and the system sends a PEGA0108 critical alert. When Hazelcast can automatically recover a cluster, you might first notice that some remote operations fail. However, service is restored shortly afterward. You then need to examine both the PegaRULES and PegaCLUSTER logs to understand the cause of the issue.

Client-server mode

To overcome the issues of embedded Hazelcast, client-server mode is recommended. Hazelcast in client-server mode works as an in-memory cache management platform that can provide greater speed and scalability in large multi-node clusters. Hazelcast supports high-performance transactions, real-time streaming, and fast analytics in a single, comprehensive data access and processing layer. This client-server model is more stable with large clusters.

Client-server mode is a clustering topology that separates Pega Platform processes from cluster communication and distributed features. Client-server mode clustering technology has separate resources and uses a different JVM than Pega Platform. The client nodes are Pega Platform nodes that perform application jobs and call the Hazelcast client to facilitate communication between Pega Platform and the Hazelcast servers. The servers are stand-alone Hazelcast servers that provide base clustering capabilities, including communication between the nodes and distributed features.

This optional deployment model introduces independent scalability for both servers and clients in Pega Platform, improving stability for deployments that use a large number of nodes.

Port ranges and boot priority with Hazelcast

Pega Platform uses the Enterprise version of Hazelcast, which comes with Pega Platform out of the box.

By default, Pega nodes use port range 5701 to 5800 for Hazelcast. To use a different port range, you edit the cluster/hazelcast/ports property in the prconfig.xml file. You might need to edit the port range if multiple environments run on the same host, or if the ports are already blocked or are in use.

To ensure high availability when using Hazelcast in client-server mode, Hazelcast requires at least 3 servers to be started before Pega Platform is started. At least one Hazelcast server needs to be in operation to start Pega Platform. Pega Platforms then need to be shut down before the Hazelcast servers are shut down.

The Hazelcast Management Center

The Hazelcast Management Center is a tool for managing and monitoring Hazelcast clusters. If you are using the client-server model for Hazelcast on Pega Platform, you can use the Hazelcast Management Center to gain insight into your Hazelcast clusters. The application offers information about resource usage, topology, and a variety of distributed objects such as queues, topics, and maps.

For more information about using Hazelcast with Pega Platform, see Deploying Hazelcast Management Center and Managing clusters with Hazelcast.

Check your knowledge with the following interaction:

This Topic is available in the following Module:

Pega Platform for the enterprise v4

Get help

If you are having problems with your training, please review the Pega Academy Support FAQs.

Did you find this content helpful?

Yes

Want to help us improve this content?

Suggest an edit

Get Started with Community