Monitor Supercluster Health
You can use these two general methods for monitoring the health of your PCE Supercluster:
REST API calls to determine the Supercluster leader and a PCE member's health
The PCE web console to view the health of the entire Supercluster from the leader or for the member you are logged into
This section discusses health monitoring specifically for a PCE Supercluster. Additionally, follow the PCE health monitoring guidelines in the PCE Administration Guide.
REST API for Supercluster Health
You can monitor Supercluster health using the following REST API mechanisms.
REST API /health
Using the PCE Health API, you can get current health information about all PCEs in your Supercluster, including the leader and members.
GET [api_version]/health
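As a minimal sketch of issuing this request from a script, the following Python builds the health URL and returns the parsed JSON body. The base URL and API version in the usage note are placeholders, the helper names are hypothetical, and authentication is deliberately left out because this section does not describe it. The sketch assumes the response body is JSON and makes no assumption about its fields.

```python
import json
import urllib.request


def health_url(base_url: str, api_version: str) -> str:
    """Build the URL for GET [api_version]/health (helper name is hypothetical)."""
    return f"{base_url.rstrip('/')}/{api_version.strip('/')}/health"


def fetch_health(base_url, api_version, opener=urllib.request.urlopen, timeout=10):
    """Return the parsed JSON body of the health endpoint.

    Assumption: the endpoint returns JSON. Authentication headers are omitted
    here and must be added by the caller for a real PCE.
    """
    request = urllib.request.Request(
        health_url(base_url, api_version),
        headers={"Accept": "application/json"},
    )
    with opener(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_health("https://pce.example.com:8443/api", "v2")` would request `https://pce.example.com:8443/api/v2/health`, where both the host and the version string are illustrative values only.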
REST API to Determine Supercluster Leader
Use this Public Stable REST API request to determine whether a PCE in a Supercluster is the leader or a member.
GET [api_version]/supercluster/leader
HTTP Response Code from /supercluster/leader
Response | Meaning |
---|---|
202 | The PCE is the leader. |
404 | The PCE is a member. |
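The response codes in the table above can be mapped to a role in a few lines. This is a sketch only: the function names are hypothetical, authentication is omitted, and any code other than the two listed in the table is reported as "unknown" because this section does not define its meaning.

```python
import urllib.error
import urllib.request


def role_from_status(status_code: int) -> str:
    """Map the /supercluster/leader response code to a role, per the table above."""
    if status_code == 202:
        return "leader"
    if status_code == 404:
        return "member"
    # Assumption: other codes are not documented here, so do not guess a role.
    return "unknown"


def get_status(url, opener=urllib.request.urlopen, timeout=10) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses; the code is still useful.
        return err.code
```

A caller would combine the two, for example `role_from_status(get_status(leader_url))`, where `leader_url` is the full URL for `[api_version]/supercluster/leader` on the PCE being checked.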
REST API /node_available
After you determine the Supercluster leader, issue the following REST API request to monitor the leader's availability:
GET [api_version]/node_available
HTTP Response Code from /node_available
The Health REST API can take up to 30 seconds to reflect the actual status of the node.
Response | Meaning |
---|---|
202 | The node is healthy and is connected to the rest of the cluster. |
404 or no response | The node is unhealthy and cannot accept requests. Such a node should be removed from the load balancing pool. |
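A load balancer health check or monitoring script could apply the table above as follows. This is a minimal sketch with hypothetical function names. It treats a connection failure or timeout as "no response" and, as a conservative assumption not stated in this section, treats any code other than 202 as unavailable. Because the reported status can lag the actual node state by up to 30 seconds, a single probe result should not be read as instantaneous.

```python
import urllib.error
import urllib.request
from typing import Dict, List, Optional


def probe(url, opener=urllib.request.urlopen, timeout=5) -> Optional[int]:
    """Return the HTTP status code for url, or None if there is no response."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # for example 404
    except OSError:
        return None  # connection refused, DNS failure, timeout, and similar


def node_is_available(status_code: Optional[int]) -> bool:
    """True only for 202; 404, no response, and (by assumption) anything else are unhealthy."""
    return status_code == 202


def healthy_nodes(statuses: Dict[str, Optional[int]]) -> List[str]:
    """Return the nodes that should stay in the load balancing pool."""
    return [node for node, code in statuses.items() if node_is_available(code)]
```

Given probe results such as `{"node1": 202, "node2": 404, "node3": None}`, `healthy_nodes` keeps only `node1`; the other two would be removed from the pool.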
PCE Web Console for Supercluster Health
The Health page in the PCE web console in a Supercluster provides health information about your on-premises PCE, whether you deployed an SNC, 2x2, 4x2, or Supercluster.
General PCE Health: Shows general health information for each PCE in your Supercluster, such as health status, node status and uptime, and system health information for each node (CPU usage, memory, and disk usage). When you have deployed a PCE Supercluster, the Health page lists all PCEs in the Supercluster with individual health information for each PCE.
Supercluster Leader Health: Displays the health status of the leader PCE in the Supercluster. From the leader, you can view the health of each PCE in the Supercluster.
Supercluster Member Health: Shows health information about the member you are logged into, including a timer that indicates the amount of time since Illumination data was synced across the Supercluster. The Health page shows the database replication lag for each PCE relative to all other PCEs in the Supercluster, indicating how long it took for data to be replicated from one PCE to another.
The PCE Health page indicates the current state of database replication across the Supercluster and how recently each member PCE's Illumination data has been synced with the leader.
Supercluster Replication (Lag): Indicates how long it took for one PCE to receive replicated data from another PCE in the Supercluster. For example, suppose a user creates a new IP list on the leader and saves it. If the change takes 4 seconds to replicate to Member1, the Health page on Member1 shows a replication lag of 4 seconds behind the leader. The PCE web console shows replication lag for each PCE in the Supercluster.
Supercluster Illumination Sync (Members only): Shows the time elapsed since a member PCE last replicated its Illumination traffic data to the Supercluster leader. This information appears only for members, which periodically send traffic data to the leader. Syncing this data provides a full picture of Illumination traffic for your entire Supercluster. You can initiate a sync of Illumination data on demand by clicking the link in the lower right of the Illumination map.
Supercluster PCE Health Icon
When the PCE Health button has a badge with a number, one or more of the PCEs in your Supercluster have a health status that is not “Normal.” The badge color indicates the type of warning.
For example, a yellow warning badge with the number 1 indicates that one of the PCEs in the Supercluster has a health warning status.
When the badge is red and shows the number 1, one of the Supercluster PCEs has failed or is down.
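The badge behavior described above can be modeled in a short sketch. It uses the Normal, Warning, and Critical statuses and their colors from the Individual PCE Health Status table. Two points are assumptions rather than documented behavior: that the badge number equals the count of PCEs that are not Normal, and that red takes precedence when both Warning and Critical PCEs are present. The function name is hypothetical.

```python
from typing import Iterable, Optional, Tuple


def health_badge(statuses: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (count, color) for the badge, or None when every PCE is Normal.

    Assumptions: the count is the number of non-Normal PCEs, and a Critical
    PCE makes the badge red even if other PCEs are only in Warning.
    """
    non_normal = [s.lower() for s in statuses if s.lower() != "normal"]
    if not non_normal:
        return None  # no badge is shown
    color = "red" if "critical" in non_normal else "yellow"
    return (len(non_normal), color)
```

Under these assumptions, a Supercluster with one Warning PCE yields `(1, "yellow")`, and one with a single failed PCE yields `(1, "red")`, matching the two examples above.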
Supercluster Web Console Health Page
The Supercluster Health page on the leader displays a high-level view of each PCE's health. You can click a PCE to view individual health information. The information on this page is refreshed every 60 seconds.
Individual PCE Health Status
The following table lists the possible health statuses for a PCE: Normal, Warning, or Critical.
Status | Color | Definition |
---|---|---|
Normal (healthy) | Green | A PCE is considered to be in a normal state when: |
Warning | Yellow | A PCE is considered to be in a warning state when: |
Critical | Red | A PCE is considered to be in a critical state when one or more required services are missing. In this scenario, it might not be possible to authenticate to the PCE or get a REST API response depending on which services are missing from the PCE. |
PCE Health on Workload Details
When your workloads have been paired with a Supercluster leader or member, you can view PCE health on the Summary tab of the Workload details page. This page includes the PCE section, which lists the hostname and health of the PCE that this workload is paired with.
PCE Health on Illumination Command Panel
When you select a workload in the Illumination map in a Supercluster, the command panel that displays workload details includes the health of the PCE that the workload is paired with. For example, you can see the health status of the PCE the workload is paired with in the PCE Health field.
Command to Show All Supercluster Members
On any core node or the data0 node in a cluster, run the following command to display the leader and all member PCEs of the Supercluster.
sudo -u ilo-pce illumio-pce-ctl supercluster-members
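If you want to capture this output from a monitoring script, a thin wrapper around the command is enough. The sketch below only runs the command shown above and returns its raw text. It does not parse the output, because the output format is not described in this section, and the function names are hypothetical.

```python
import subprocess
from typing import List


def supercluster_members_command() -> List[str]:
    """Return the argument list for the command shown above."""
    return ["sudo", "-u", "ilo-pce", "illumio-pce-ctl", "supercluster-members"]


def list_supercluster_members(runner=subprocess.run) -> str:
    """Run the command on a core or data0 node and return its standard output as text."""
    result = runner(
        supercluster_members_command(),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

The `runner` parameter exists only so the function can be exercised without a PCE installed; in normal use it is left at its default.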