PCE Supercluster Deployment Planning
This section describes the requirements you must meet before deploying a PCE Supercluster.
Plan Supercluster FQDNs Carefully
Plan the fully qualified domain names (FQDNs) you want to use for your Supercluster PCEs, and define these names exactly how you want them before you deploy the Supercluster. Changing FQDNs after deploying a Supercluster is possible but time-consuming. The PCE FQDNs are set in the `pce_fqdn` parameter in `runtime_env.yml`.
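As an illustration, the per-PCE FQDN setting might look like this in `runtime_env.yml`; the FQDN value is a hypothetical example:

```yaml
# runtime_env.yml on each node of this PCE
# (illumio-eu.bigco.com is a hypothetical example FQDN)
pce_fqdn: illumio-eu.bigco.com
```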
For example, you might include identifying strings in the FQDNs that indicate the geographic location of each member of the Supercluster:

- illumio-eu.bigco.com: `eu` in the hostname indicates Europe.
- illumio.na.bigco.com: `na` as a separate domain indicates North America.
You can also configure a global FQDN for the Supercluster. The global FQDN is used by the VENs rather than individual PCE FQDNs. The global Supercluster FQDN is set in the `supercluster_fqdn` parameter in `runtime_env.yml`.
When set, the PCE provides this FQDN instead of its own FQDN to VENs during pairing. This parameter must be set on all nodes in each PCE of the Supercluster. When you configure this option, each PCE server certificate must include the global FQDN in the SAN field. For example:
illumio-supercluster.bigco.com
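A minimal sketch of this setting, using the hypothetical name above; it goes in `runtime_env.yml` on every node of every PCE in the Supercluster:

```yaml
# Set on all nodes of all PCEs in the Supercluster.
# Each PCE's server certificate must include this name in its SAN field.
supercluster_fqdn: illumio-supercluster.bigco.com
```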
Number of Supercluster PCEs
A PCE Supercluster consists of a minimum of two and a maximum of eight PCEs. One PCE is always the Supercluster leader; the others are Supercluster members.
Capacity Planning for Supercluster PCEs
Recommended CPU, Memory, and Storage
Maximum Flow Capacity
Storage Device Layout
Runtime Parameters for Two-Storage-Device Configuration
In the two-storage-device configuration, to accommodate growth in the traffic data store, set the following parameters in `runtime_env.yml`:
Note
When you are deploying the two-storage-device configuration, you must set these parameters.
Under `traffic_datastore`:
- `data_dir`: path_to_second_disk (the path to the second storage device)
- `max_disk_usage_gb`: Set this parameter according to the table below.
- `partition_fraction`: Set this parameter according to the table below.
- `time_bucket_type`: Set this parameter according to the table below.
The recommended values for these parameters, based on PCE node cluster type (2x2 or 4x2) and the estimated number of workloads (VENs), are as follows:

Setting | 2x2, 2,500 VENs | 2x2, 10,000 VENs | 4x2, 25,000 VENs | Note |
---|---|---|---|---|
max_disk_usage_gb | 100 GB | 400 GB | 400 GB | This size reflects only part of the required total size, as detailed in "PCE Capacity Planning" in the PCE Installation and Upgrade Guide. |
partition_fraction | 0.5 | 0.5 | 0.5 | |
time_bucket_type | Day | Day | Day | |
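Putting the parameters together, a `runtime_env.yml` fragment for a 2x2 cluster with up to 10,000 VENs might look like the following sketch; the mount path is a hypothetical example:

```yaml
traffic_datastore:
  data_dir: /var/lib/illumio-pce-flow  # hypothetical mount point of the second disk
  max_disk_usage_gb: 400               # 2x2 cluster, 10,000 VENs
  partition_fraction: 0.5
  time_bucket_type: Day
```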
Network Traffic Between PCEs
PCEs in the Supercluster communicate via the following ports. Any network firewalls between the PCEs must be configured to allow this traffic.
Ports | Sources | Destinations |
---|---|---|
TCP 8443 (the default) or the management port configured for the PCE web console and REST API in runtime_env.yml. This port must be the same on all PCEs in the Supercluster. | Core nodes of leader PCE | PCE FQDN of all member PCEs |
TCP 5432 | All nodes of all PCEs | IP addresses of all other PCE data nodes |
TCP 5532 | Core nodes of leader PCE | IP addresses of all other PCE data nodes |
TCP 8302 | All nodes of all PCEs | PCE FQDN of all other PCEs and IP address of all nodes of all other PCEs |
UDP 8302 | All nodes of all PCEs | IP address of all nodes of all other PCEs |
TCP 8300 | All nodes of all PCEs | IP address of all nodes of all other PCEs |
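Before joining PCEs, you can sanity-check the firewall openings between clusters. The following is a minimal sketch, not an official Illumio tool; the TCP port list is taken from the table above, and only TCP reachability is probed (UDP 8302 cannot be verified with a simple connect):

```python
import socket

# TCP ports that must be reachable between Supercluster PCEs (see table above).
SUPERCLUSTER_TCP_PORTS = [8443, 5432, 5532, 8302, 8300]

def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_peer(host: str) -> dict:
    """Probe all required TCP ports on a peer PCE node; returns {port: reachable}."""
    return {port: check_tcp_port(host, port) for port in SUPERCLUSTER_TCP_PORTS}
```

Run `check_peer()` against every node of every other PCE; any `False` entry points to a firewall rule that still needs to be opened. Note that 8443 is only the default management port; substitute your configured port if it differs.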
Load Balancers
As with a single PCE, all PCEs in the Supercluster must be front-ended with a load balancer (DNS or L4) to distribute requests across the PCEs' core nodes.
GSLB or a manual DNS update can be used to fail over VENs to a different PCE. See GSLB Requirements and High Availability and Disaster Recovery.
Traffic Load Balancer Configuration
When you use L4 load balancers in front of the PCEs, the load balancers should already be configured to forward inbound connections on TCP 8443 (the default) or the management port configured for the PCE web console and REST API in `runtime_env.yml`, and on TCP 8444, to an available, healthy core node.
In a Supercluster, the L4 load balancer must also be configured to forward additional inbound TCP 8302 connections originating from the other PCEs to an available, healthy core node.
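As one concrete, hypothetical way to meet these requirements, an HAProxy L4 configuration in front of a PCE's core nodes might look like the following sketch; the node addresses are placeholders, and 8443 assumes the default management port:

```
# haproxy.cfg fragment (illustrative only)
defaults
    mode tcp
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend pce_mgmt
    bind *:8443          # or your configured management port
    default_backend core_nodes_8443

frontend pce_ven
    bind *:8444
    default_backend core_nodes_8444

frontend pce_cluster
    bind *:8302          # Supercluster PCE-to-PCE traffic
    default_backend core_nodes_8302

backend core_nodes_8443
    server core0 10.0.0.10:8443 check
    server core1 10.0.0.11:8443 check

backend core_nodes_8444
    server core0 10.0.0.10:8444 check
    server core1 10.0.0.11:8444 check

backend core_nodes_8302
    server core0 10.0.0.10:8302 check
    server core1 10.0.0.11:8302 check
```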
GSLB Requirements
Workloads can be paired to a specific PCE, or you can optionally use a GSLB to route workloads to the required PCE in your Supercluster.
When you are using a GSLB to route workloads, consider the following general guidelines.
For normal operations:
When all PCEs are available, workloads should be routed to the nearest PCE based on proximity and geolocation.
GSLB persistence (also known as “stickiness”) must be enabled so workloads are always routed to the same PCE that they are paired with (non-failure case). Balancing workloads across multiple PCEs is not supported.
For failover:
Recommended: A dedicated failover PCE joined to the Supercluster that has no other VENs.
Failover to any other PCE in the Supercluster. In this case, take care to prevent overloading the PCE beyond its rated capacity and to avoid cascading failures. One strategy is to configure a “buddy” PCE for each PCE that the GSLB uses for failover.
Workload failover time depends on the DNS time-to-live (TTL) configured in the GSLB.
Illumio strongly recommends that you do not automate workload failover using GSLB and instead initiate it manually.
Configure SAML IdP for User Login
After installation, you can configure the PCE to rely on an external, third-party SAML identity provider (IdP). See "Single Sign-On Configuration" in the PCE Administration Guide, which provides setup instructions for a wide variety of IdPs.
For the PCE Supercluster, you configure the details in the leader PCE web console exactly as you do for a standalone PCE, with one exception: you are presented with an intermediate page that lists all the PCEs in the Supercluster, including the leader and all members. Follow the same process detailed in the PCE Administration Guide to configure all the Supercluster PCEs, both leader and members.
Certificate Requirements
PCE-to-PCE communication occurs over TLS v1.2. The root CA certificate that signed each PCE's certificate must be in the root CA bundle on all other PCEs in the Supercluster.
Object Limits and Supercluster
The PCE enforces certain soft and hard limits to restrict the total number of system objects you can create. These limits are based on tested performance and capacity limits of the PCE. Most PCE object limits apply to the entire Supercluster. The limits are enforced by the leader when objects are created.
The object limit for the number of VENs per PCE (active_agents_per_pce) is not cluster-wide and applies to each individual PCE. When the VENs-per-PCE limit is reached, no more VENs can be paired to that PCE. This limit is also enforced when VENs are moved from one PCE to another via the REST API.
An exception is made when the system itself fails VENs over from one PCE to another PCE in the cluster. VENs that fail over do not count toward the limit, allowing you to temporarily exceed the VENs-per-PCE limit during an extended outage of a PCE in the Supercluster.
Changes to the object limit for the number of VENs per PCE (active_agents_per_pce) made on the Supercluster leader are propagated to the members within 30 minutes.
For more information about object limits and how to view your current object limit usage, see the PCE Administration Guide and the command `illumio-pce-ctl obj-limits list`.
RBAC Permissions: Leader or Member
In general, when you are using the Illumio PCE web console or the Illumio REST API, the types of operations you can perform depend on your PCE role-based access control (RBAC) permissions and whether you have logged into the leader or a member, as shown in the table below.
User Role | Operations | Leader | Members |
---|---|---|---|
Any Role | View objects | Yes | Yes |
Global Administrator & User Manager (Organization Owner) | Add and delete users; add, modify, delete, and provision system objects and rulesets (includes creating a pairing script). | Yes | No |
Global Administrator | Add, modify, delete, and provision system objects and rulesets (includes creating a pairing script) | Yes | No |
Global read-only | View all objects | Yes | Yes |
Global Policy Object Provisioner | Provision system objects | Yes | No |
Ruleset Manager | Create, update, and delete rulesets within defined scopes. | Yes | No |
Ruleset Provisioner | Provision rulesets within defined scopes. | Yes | No |
Process, File Limits, and Kernel Parameters
Even if you are running systemd, the file and kernel limits must be set as outlined in our build document for init.d systems. Servers with systemd also need the configuration changes outlined for init.d systems because some of the supercluster command-line tool commands are hard-coded to reference init.d security limits. Therefore, set the file and process limits in both configuration files. Refer to our build documentation for the required settings.
For reference, see "Requirements for PCE Installation" in "PCE Installation and Upgrade".
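For illustration only, such limits are typically set in a drop-in file under /etc/security/limits.d/; the account name and values below are placeholders, so use the exact values from the Illumio build documentation:

```
# /etc/security/limits.d/99-illumio-pce.conf (placeholder account name and values)
ilo-pce  soft  nofile  65535
ilo-pce  hard  nofile  65535
ilo-pce  soft  nproc   65535
ilo-pce  hard  nproc   65535
```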
Configure PCE Internal Syslog on Leader
You can configure the PCE's internal syslog service in the PCE web console on the Supercluster leader, for both the leader and the member PCEs. The internal syslog cannot be configured on a member PCE.
Note
When a standalone PCE is installed, a local destination for the PCE internal syslog is created to record events. When the PCE is joined as a member of the Supercluster, this local destination is removed.
After joining a member, log in to the Supercluster leader and configure the internal syslog for each member individually.
If you need to preserve events that occurred before a PCE was joined as a member, back up the PCE before you join it to the Supercluster.
See PCE Installation and Upgrade Guide for information about the PCE internal syslog.