REST APIs for 24.2.20 and 24.2.10

PCE Health Reference

This topic covers properties, parameters, and examples for PCE health.

Check PCE Health

Curl Command

curl -i -X GET https://pce.my-company.com:8443/api/v2/health -H 'Accept: application/json' -u $KEY:'TOKEN'
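The same request can be prepared from a script. The following is a minimal sketch using only the Python standard library. It assumes the PCE accepts HTTP Basic authentication with the API key and token, as the curl -u option implies. The helper name build_health_request and the placeholder credentials are illustrative and not part of the PCE API.

```python
import base64
import urllib.request


def build_health_request(base_url: str, api_key: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the PCE health endpoint."""
    # curl's -u option sends HTTP Basic credentials; reproduce that header here.
    credentials = f"{api_key}:{token}".encode("utf-8")
    auth_header = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/health",
        method="GET",
        headers={"Accept": "application/json", "Authorization": auth_header},
    )


# Placeholder host and credentials; substitute your own values.
request = build_health_request("https://pce.my-company.com:8443", "api_key_id", "TOKEN")
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Building the request separately from sending it keeps the credentials handling easy to inspect before any network call is made.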
Properties

Property

Description

Type

status

Current health status of the PCE. Possible values:

  • normal: When a PCE's health is in a normal state, it means:

    • All required services are running.

    • All nodes are running.

    • CPU usage of all nodes is less than 95%.

    • Memory usage of all nodes is less than 95%.

    • Disk usage of all nodes is less than 95%.

    • Database replication lag is less than or equal to 30 seconds.

  • warning: When PCE health is in a warning state, it means:

    • One or more nodes are unreachable.

    • One or more optional services are missing, or one or more required services have been degraded.

    • The CPU usage of any node is greater than or equal to 95%.

    • Memory usage of any node is greater than or equal to 95%.

    • Disk usage of any node is greater than or equal to 95%.

    • Database replication lag is greater than 30 seconds.

  • critical: A PCE is considered to be in a critical state when one or more required services are missing.

    If a PCE enters a critical state, it might not be possible to authenticate to the PCE or get an API response depending on which services are missing from the PCE.

String
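The status rules listed above can be read as a simple decision procedure. The sketch below encodes only the thresholds stated in this topic (95% for CPU, memory, and disk; 30 seconds for replication lag). The HealthInputs class and classify_status function are hypothetical names used for illustration, and the PCE's own evaluation may take other factors into account.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HealthInputs:
    """Hypothetical summary of the conditions described for the status property."""
    required_services_missing: bool = False
    required_services_degraded: bool = False
    optional_services_missing: bool = False
    nodes_unreachable: bool = False
    cpu_percents: List[float] = field(default_factory=list)
    memory_percents: List[float] = field(default_factory=list)
    disk_percents: List[float] = field(default_factory=list)
    replication_lag_seconds: float = 0.0


def classify_status(inputs: HealthInputs, usage_limit: float = 95, lag_limit: float = 30) -> str:
    """Return 'critical', 'warning', or 'normal' following the documented rules."""
    # Critical: one or more required services are missing.
    if inputs.required_services_missing:
        return "critical"
    usage = inputs.cpu_percents + inputs.memory_percents + inputs.disk_percents
    warning = (
        inputs.nodes_unreachable
        or inputs.optional_services_missing
        or inputs.required_services_degraded
        or any(percent >= usage_limit for percent in usage)
        or inputs.replication_lag_seconds > lag_limit
    )
    return "warning" if warning else "normal"
```

For example, classify_status(HealthInputs(cpu_percents=[95])) returns "warning", while a replication lag of exactly 30 seconds still yields "normal".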

type

The type of PCE:

  • standalone: Indicates that this PCE is an on-premises 2x2 or 4x2 PCE cluster.

    Or one of the following types:

  • leader: Indicates that this PCE is the leader of a Supercluster.

  • member: Indicates that this PCE is a member of a Supercluster.

String

fqdn

The fully qualified domain name (FQDN) of the PCE.

String

available_seconds

The length of time, in seconds, that this PCE has been available.

Number

notifications

Health warnings related to the PCE. Each notification contains the following properties:

  • status: Severity status of this notification. Possible values include: normal, warning, or critical.

  • token: Description of the notification.

  • message: Notification message.

listen_only_mode_enabled_at

Indicates when listen-only mode was enabled for this PCE.

For information about enabling or disabling listen-only mode for a PCE, see the PCE Administration Guide.

String

nodes

The nodes that comprise your PCE cluster.

For each node of your PCE, this API call returns the following properties:

  • hostname: The node hostname.

  • ip_address: The node IP address.

  • runlevel: (Number) The current runlevel of the PCE software on the node.

    For more information about runlevels and their usage, see the PCE Administration Guide.

  • uptime_seconds: Seconds since this node was last restarted.

  • cpu: Percentage of the node CPU being used.

    Includes the following two sub-properties:

    • status: Either normal, warning, or critical.

    • percent: (Number) Percentage of the node CPU being used.

  • disk: Percentage of the node's disk that is being used.

    Includes the following two sub-properties:

    • status: Either normal, warning, or critical.

    • percent: (Number) Percentage of the node disk being used.

  • memory: Percentage of the node's memory that is being used.

    Includes the following two sub-properties:

    • status: Either normal, warning, or critical.

    • percent: (Number) Percentage of the node memory being used.

  • services: The status of all PCE services running on the node.

    Possible status for PCE services include:

    • running: The service is fully running and operational.

    • not running: The service has stopped running.

    • partial: The service is running but in a partial state.

    • optional

    • unknown

  • generated_at: Timestamp when this information was generated.

String

network

PCE 2x2 or 4x2 Deployment

For a PCE 2x2 or 4x2 deployment, the network property provides latency information between the database primary and database replica data nodes in your PCE for policy and traffic data.

This property also indicates which data node in your PCE is the primary database and which is the database replica.

This type of database replication is called intracluster in the REST API.

Sub-properties include:

replication: The category of properties that provide database replication latency information for a PCE cluster. (For a PCE Supercluster, this information is provided for each PCE in the Supercluster.)

  • type: Type of replication. intracluster for a PCE 2x2 or 4x2 deployment.

  • details: Includes the following properties:

    • database_name: Either agent for policy data or traffic for traffic data.

    • primary_fqdn: The FQDN of the primary database node.

    • replica_fqdn: The FQDN of the replica database node.

  • value: The amount of replication lag between the primary and replica databases for both policy and traffic data.

    • status: Either normal, warning, or critical.

    • lag_seconds: The amount of lag measured in seconds between the primary and replica databases for both policy and traffic data.

Supercluster Deployment

If you have deployed a PCE Supercluster, the PCE health call also returns information about the database replication between the PCE you are currently logged into and all other PCEs in the Supercluster.

In a Supercluster deployment, the security policy provisioned on the leader is replicated to all other PCEs in the Supercluster. Additionally, all PCEs in the Supercluster (leader and members) replicate copies of each workload's context, such as IP addresses, to all other PCEs in the Supercluster.

This other type of database replication for a Supercluster is called intercluster in the REST API, and information is provided for all PCEs in the Supercluster.

Properties include:

replication: The category of properties that provide database replication latency information for a PCE cluster:

  • type: Type of replication. intercluster for a PCE Supercluster deployment.

  • details: Includes the following properties:

    • fqdn: The FQDN of the primary database of the other PCEs listed in this section.

  • value: The amount of replication lag between the PCE you are logged into and one of the other PCEs in the Supercluster.

    • status: Either normal, warning, or critical.

    • lag_seconds: The amount of lag measured in seconds between the PCE you are logged into and the other PCE listed in this section.

Array
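As an illustration of how the replication sub-properties fit together, the sketch below scans a parsed health object for replication entries whose lag_seconds exceeds a limit. The function name lagging_replications and the sample data are assumptions made for this example. The 30-second default mirrors the warning threshold described for the status property.

```python
def lagging_replications(health: dict, max_lag_seconds: float = 30) -> list:
    """Return (type, details, lag_seconds) for replication entries over the limit."""
    results = []
    for entry in health.get("network", {}).get("replication", []):
        lag = entry.get("value", {}).get("lag_seconds")
        # Skip entries that carry no lag figure rather than guessing a value.
        if lag is not None and lag > max_lag_seconds:
            results.append((entry.get("type"), entry.get("details", {}), lag))
    return results


# Illustrative data only; not taken from a real PCE.
sample = {
    "network": {
        "replication": [
            {
                "type": "intracluster",
                "details": {"database_name": "agent"},
                "value": {"status": "normal", "lag_seconds": 0},
            },
            {
                "type": "intracluster",
                "details": {"database_name": "traffic"},
                "value": {"status": "warning", "lag_seconds": 45},
            },
        ]
    }
}

slow = lagging_replications(sample)
```

The same function works for intercluster entries in a Supercluster, because it reads only the type, details, and value properties that both deployment types share.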

generated_at

The timestamp of when the information was generated.

String

PCE Health Response

Example response returned from the PCE Health API.

[
    {
        "status": "normal",
        "type": "standalone",
        "fqdn": "pce.mycompany.com",
        "available_seconds": 84133,
        "notifications": [],
        "listen_only_mode_enabled_at": null,
        "nodes": [
            {
                "hostname": "pce_core1.mycompany.com",
                "ip_address": "192.0.1.0",
                "type": "core",
                "runlevel": 5,
                "uptime_seconds": 2051301,
                "cpu": {
                    "status": "normal",
                    "percent": 7
                },
                "disk": [
                    {
                        "location": "disk",
                        "value": {
                            "status": "normal",
                            "percent": 17
                        }
                    }
                ],
                "memory": {
                    "status": "warning",
                    "percent": 85
                },
                "services": {
                    "status": "normal",
                    "services": {
                        "running": [
                            "agent_background_worker_service",
                            "agent_service",
                            "agent_traffic_service",
                            "auditable_events_service",
                            "collector_service",
                            "ev_service",
                            "executor_service",
                            "fluentd_source_service",
                            "login_service",
                            "memcached",
                            "node_monitor",
                            "search_index_service",
                            "server_load_balancer",
                            "service_discovery_server",
                            "traffic_worker_service",
                            "web_server"
                        ]
                    }
                },
                "generated_at": "2020-03-03T19:38:52+00:00"
            }
        ],
        "network": {
            "replication": [
                {
                    "type": "intracluster",
                    "details": {
                        "database_name": "agent",
                        "primary_fqdn": "bkhorram-qa-6node-v0-pce-1-dbase0"
                    },
                    "value": {
                        "status": "normal",
                        "lag_seconds": 0
                    }
                },
                {
                    "type": "intracluster",
                    "details": {
                        "database_name": "traffic",
                        "primary_fqdn": "bkhorram-qa-6node-v0-pce-1-dbase0"
                    },
                    "value": {
                        "status": "normal",
                        "lag_seconds": 0
                    }
                }
            ]
        },
        "generated_at": "2020-03-03T19:38:52+00:00"
    }
]
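A response like the one above can be checked per node once it is parsed. The following sketch lists every node resource whose status is not normal. It assumes the response is a list of PCE objects and that disk may be returned as a list of per-location entries, as in the example response. The function names and the trimmed sample are illustrative.

```python
def node_metrics(node: dict):
    """Yield (label, metric) pairs for the cpu, memory, and disk of one node."""
    for name in ("cpu", "memory"):
        if name in node:
            yield name, node[name]
    disk = node.get("disk", [])
    if isinstance(disk, dict):
        # Tolerate a single status/percent object as well as a list.
        yield "disk", disk
    else:
        for entry in disk:
            yield f"disk:{entry.get('location')}", entry.get("value", {})


def non_normal_resources(health_response: list) -> list:
    """Return (hostname, resource, status, percent) for each non-normal resource."""
    findings = []
    for pce in health_response:
        for node in pce.get("nodes", []):
            for label, metric in node_metrics(node):
                if metric.get("status") != "normal":
                    findings.append(
                        (node.get("hostname"), label, metric.get("status"), metric.get("percent"))
                    )
    return findings


# Trimmed copy of the example response, keeping only the fields used here.
sample_response = [
    {
        "status": "normal",
        "nodes": [
            {
                "hostname": "pce_core1.mycompany.com",
                "cpu": {"status": "normal", "percent": 7},
                "disk": [{"location": "disk", "value": {"status": "normal", "percent": 17}}],
                "memory": {"status": "warning", "percent": 85},
            }
        ],
    }
]

findings = non_normal_resources(sample_response)
```

With the trimmed sample, the only finding is the memory entry for pce_core1.mycompany.com, which reports a warning status at 85 percent.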