Upgrade Supercluster
This topic describes installing a newer software version on PCEs in a Supercluster.
Important
The supercluster-quiesce and set-runlevel commands can be run on any node.
Before Upgrading
Before you upgrade the Supercluster, perform these steps:
Back up the PCE.
Before the upgrade, back up the leader and all member databases and each PCE's runtime_env.yml file.
Ensure all PCEs are in a healthy state.
Before upgrading, make sure all PCEs in the entire Supercluster are in a healthy state.
In the PCE web console, check the PCE Health page to ensure the PCE health status is Normal.
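As a small illustration of the runtime_env.yml part of the backup, the sketch below copies a file into a backup directory with a timestamp suffix. The function name backup_file, the backup directory, and the path in the usage comment are hypothetical and are not part of the PCE tooling; substitute the actual location of runtime_env.yml on your nodes. It does not back up the databases.

```shell
# backup_file SRC DEST_DIR
# Copy SRC into DEST_DIR as <basename>.<timestamp> and print the new path.
# Hypothetical helper for illustration; not an Illumio command.
backup_file() {
  src=$1
  dest_dir=$2
  # Refuse anything that is not an existing regular file.
  [ -f "$src" ] || { echo "not a regular file: $src" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1
  stamp=$(date +%Y%m%dT%H%M%S)
  dest="$dest_dir/$(basename "$src").$stamp"
  # -p preserves mode and timestamps of the original file.
  cp -p "$src" "$dest" && echo "$dest"
}

# Usage (both paths are placeholders):
#   backup_file /path/to/runtime_env.yml /var/tmp/pce-upgrade-backup
```

Run the helper on every node, since each node has its own runtime_env.yml file.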
Types of Supercluster Upgrade
You can choose to perform a simple upgrade or a rolling upgrade.
Supercluster simple upgrade: The Supercluster simple upgrade procedure requires you to set all the PCEs in the Supercluster to runlevel 1 for the duration of the upgrade. During a simple upgrade, the Supercluster is not fully operational. See Supercluster Simple Upgrade.
Supercluster rolling upgrade: A rolling upgrade keeps the Supercluster operational while individual PCEs are upgraded one at a time. See Supercluster Rolling Upgrade.
Note
Supercluster rolling upgrades are supported only for hotfix or maintenance releases. The major and minor release numbers in the installed and upgrade versions must match. For example, you can do a rolling upgrade from 21.2.0 to 21.2.1.
Supercluster Simple Upgrade
A Supercluster simple upgrade follows these general steps:
On all PCEs, quiesce the data replication.
Synchronize data.
Upgrade the software on all nodes of all PCEs.
Migrate the database on all PCEs.
Bring all PCEs back to runlevel 5.
Steps for Upgrade
Quiesce data replication.
On any node in the PCE cluster, bring all PCEs to runlevel 2:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 2
In the PCE clusters, repeat step (a) for all leaders and all members.
The cluster status should be RUNNING.
On any node in all PCE clusters, verify that the set-runlevel command finished and the cluster status is RUNNING:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
Do not proceed to the next step until the set-runlevel command finishes.
Quiesce database replication.
On the active DB node, run the following command.
sudo -u ilo-pce illumio-pce-db-management supercluster-quiesce timeout_in_seconds
For example, with a 600-second timeout:
sudo -u ilo-pce illumio-pce-db-management supercluster-quiesce 600
This command waits for data replication to finish, which can take some time. To set a time limit, use timeout_in_seconds (default: 600). If the command doesn't complete within this time, it stops, and you must run the command again.
Expected output when database replication is successfully quiesced:
Replication is complete.
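Because a quiesce that times out has to be run again, one way to script the rerun is a small retry wrapper. The sketch below assumes that supercluster-quiesce exits with a non-zero status when it does not complete within the timeout, which this topic does not state; the retry function and the attempt count are illustrative only.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Run COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Hypothetical helper for illustration; not an Illumio command.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Stop as soon as the command reports success.
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts failed: $*" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (the quiesce command line is the one given in this topic):
#   retry 3 sudo -u ilo-pce illumio-pce-db-management supercluster-quiesce 600
```

Even when the wrapper returns success, confirm that the output reports that replication is complete before moving on.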
Synchronize the data.
Ensure replication data is properly synchronized between all PCEs in the Supercluster. The procedure depends on whether the Supercluster PCEs being upgraded are running a release older than 23.5.30.
If upgrading from Supercluster PCEs running a release older than 23.5.30:
Contact your Illumio representative or Illumio Support to obtain a supercluster_management.rb script applicable to your current (pre-upgrade) PCE version.
Copy the provided script to the PCE scripts directory (typically /opt/illumio-pce/illumio/scripts).
Run the following command to ensure primary keys in replication tables match across all PCEs:
/opt/illumio-pce/illumio/scripts/supercluster_management.rb supercluster-replication-check --detailed-data-check --max-id-consistency-check
If this check fails, see Fixing Inconsistency Errors.
If upgrading from Supercluster PCEs running release 23.5.30 or newer:
Run the following command to ensure primary keys in replication tables match across all PCEs.
illumio-pce-ctl supercluster-replication-check --detailed-data-check --max-id-consistency-check
If this check fails, see Fixing Inconsistency Errors.
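The choice between the two paths above comes down to a version comparison against 23.5.30. A minimal sketch of that comparison follows, assuming versions are written as three numeric parts (major.minor.patch); the function name version_lt is hypothetical, and how you obtain the installed PCE version is left to you.

```shell
# version_lt A B: succeed when version A is strictly older than version B.
# Assumes both versions have exactly three numeric parts, such as 23.5.30.
# The body runs in a subshell so the IFS and positional-parameter changes
# do not leak into the caller.
version_lt() (
  set -f            # no filename expansion while splitting
  IFS=.
  set -- $1 $2      # split both versions on dots: $1..$3 and $4..$6
  [ "$#" -eq 6 ] || exit 2
  if [ "$1" -ne "$4" ]; then
    [ "$1" -lt "$4" ]
  elif [ "$2" -ne "$5" ]; then
    [ "$2" -lt "$5" ]
  else
    [ "$3" -lt "$6" ]
  fi
)

# Usage ("$installed" is a placeholder for the pre-upgrade PCE version):
#   if version_lt "$installed" 23.5.30; then
#     echo "use the supercluster_management.rb script"
#   else
#     echo "use illumio-pce-ctl supercluster-replication-check"
#   fi
```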
Upgrade the software.
Because this is a simple upgrade, you upgrade the software on all nodes of all PCEs in parallel.
On any node, stop the PCE cluster:
sudo -u ilo-pce illumio-pce-ctl cluster-stop
The packages to install depend on the type of PCE node:
Core nodes: Two packages, the PCE RPM and UI RPM.
Data nodes: One package, the PCE RPM.
On each core node in the cluster, log in as root and install the PCE RPM and UI RPM. Be sure to specify both of the RPM file names on the command line:
rpm -Uvh illumio_pce_rpm illumio_ui_rpm
For illumio_pce_rpm and illumio_ui_rpm, substitute the paths and filenames of the two RPM files you downloaded from the Illumio Support portal.
On each data node in the cluster, log in as root and install the PCE RPM:
rpm -Uvh <illumio_pce_rpm>
For illumio_pce_rpm, substitute the path and filename of the software you downloaded from the Illumio Support portal.
On all nodes, start each cluster at runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
Update the runtime environment file (runtime_env.yml).
See "What's New in This Release" to determine whether any changes to runtime_env.yml are required to upgrade. If changes are required:
On all nodes in the cluster, update the runtime_env.yml file.
On all nodes in the cluster, check the validity of the runtime_env.yml file:
sudo -u ilo-pce illumio-pce-ctl check-env
If any issues are reported by this command, correct them before moving on to the next step.
Migrate the PCE database.
On any node of every upgraded PCE, run the following command:
sudo -u ilo-pce illumio-pce-db-management migrate --upgrade-type simple
If you encounter a "max id inconsistency" error at this point, see Fixing Inconsistency Errors.
The migration might take some time to complete. Check the progress with the following command:
sudo -u ilo-pce illumio-pce-db-management supercluster-upgrade-status
On any node in the first PCE cluster, bring all PCEs to runlevel 2. Repeat this step on all the other PCEs.
sudo -u ilo-pce illumio-pce-ctl set-runlevel 2
Repeat the previous step for all leader and member PCE clusters. Verify that all PCEs in the Supercluster are at runlevel 2.
Wait until agent_slony_service and login_slony_service are up and running. These service names appear in bright blue or may have a pound character (#) appended, depending on which color option was chosen when starting the PCE, --color or --no-color. Do not restart the PCE. This step could take some time, depending on how recently you upgraded the PCE software. Run the following command to monitor the progress:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
Issue the command again, when needed, until the services are ready.
Bring PCEs back to operational status.
On any node for each PCE, set the runlevel to 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
Setting the runlevel can take time to complete.
On any node in all PCE clusters, verify that the set-runlevel command finished and the cluster status is RUNNING:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
Note
Due to the time it takes to replicate new database tables across all the PCEs, the upgrade might take longer than usual. The delay occurs when you bring the PCE to runlevel 2 or 5 from runlevel 1 after upgrading the software. The wait time depends on the number of new tables that are part of the upgrade. The wait might be up to 20 minutes.
Verify that you can log in to the PCE web console on each PCE in the Supercluster.
The upgrade is complete.
Fixing Inconsistency Errors
Failing to follow the recommended steps prior to upgrading a PCE Supercluster to version 23.5.x and later may lead to the PCE displaying errors like the ones shown below when running the migrate command:
sudo -u ilo-pce illumio-pce-db-management migrate --upgrade-type simple
Checking max id consistency...
------------------------------------------------------------
Fetching tables from ...
Fetching tables from ...
max_id mismatch found in PCE pair with region_id [1, 7]
Replication max_id is NOT in a consistent state.
max id check found inconsistency across PCEs. The migration cannot continue. Please contact Illumio support.
Inconsistencies encountered while running migrate or the supercluster-replication-check before attempting to migrate can be corrected by using the illumio-pce-ctl command or the supercluster_management.rb script, depending on the version the PCE is being upgraded from.
To correct inconsistencies on PCE versions older than 23.5.30
Note the name of each table that was reported as mismatched, and correct the mismatch by running the same supercluster_management.rb script from the Synchronize Data step for each mismatched table:
/opt/illumio-pce/illumio/scripts/supercluster_management.rb supercluster-replication-sync --table-name <table_name>
To correct inconsistencies on PCE versions 23.5.30 or newer
Note the name of each table that was reported as mismatched, and correct the mismatch by running the supercluster-replication-sync command for each mismatched table:
illumio-pce-ctl supercluster-replication-sync --table-name <table_name>
Supercluster Rolling Upgrade
In a rolling upgrade, the PCEs are upgraded one by one. The PCE that is being upgraded is at runlevel 1, while all the other PCEs are fully operational (runlevel 5).
Note
Supercluster rolling upgrades are supported only for hotfix or maintenance releases. The major and minor release numbers in the installed and upgrade versions must match. For example, you can do a rolling upgrade from 21.2.0 to 21.2.1.
A Supercluster rolling upgrade follows these general steps:
Upgrade the software on all nodes of the leader PCE.
Migrate the database on the leader PCE.
Bring the leader PCE back to runlevel 5.
Repeat these steps for each member PCE.
Steps for Upgrade
Upgrade the software on the leader PCE.
On any node of the leader PCE, stop the PCE cluster:
sudo -u ilo-pce illumio-pce-ctl cluster-stop
The packages to install depend on the type of PCE node:
Core nodes: Two packages, the PCE RPM and UI RPM.
Data nodes: One package, the PCE RPM.
On each core node in the cluster, log in as root and install the PCE RPM and UI RPM. Be sure to specify both of the RPM file names on the command line:
rpm -Uvh illumio_pce_rpm illumio_ui_rpm
For illumio_pce_rpm and illumio_ui_rpm, substitute the paths and filenames of the two RPM files you downloaded from the Illumio Support portal.
On each data node in the cluster, log in as root and install the PCE RPM:
rpm -Uvh <illumio_pce_rpm>
For illumio_pce_rpm, substitute the path and filename of the software you downloaded from the Illumio Support portal.
On all nodes, start the cluster at runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
Update the runtime environment file (runtime_env.yml).
See "What's New in This Release" to determine whether any changes to runtime_env.yml are required to upgrade. If changes are required:
On all nodes in the cluster, update the runtime_env.yml file.
On all nodes in the cluster, check the validity of the runtime_env.yml file:
sudo -u ilo-pce illumio-pce-ctl check-env
If any issues are reported by this command, correct them before moving on to the next step.
Migrate the PCE database on the leader PCE.
On any node of the leader PCE, run the following command:
sudo -u ilo-pce illumio-pce-db-management migrate --upgrade-type rolling
The migration might take some time to complete. Check the progress with the following command:
sudo -u ilo-pce illumio-pce-db-management supercluster-upgrade-status
Bring the leader PCE back to operational status.
On any node of the leader PCE, set the runlevel to 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
Setting the runlevel can take time to complete.
On any node of the leader PCE, verify that the set-runlevel command finished and the cluster status is RUNNING:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
Upgrade the software on a member PCE.
On any node of the member PCE, stop the PCE cluster:
sudo -u ilo-pce illumio-pce-ctl cluster-stop
On all nodes of the member PCE, install the new version of the PCE. For information, see the PCE Installation and Upgrade Guide.
On any node, start the cluster at runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
Migrate the PCE database on the member PCE.
On any node of the member PCE, run the following command:
sudo -u ilo-pce illumio-pce-db-management migrate
The migration might take some time to complete. Check the progress with the following command:
sudo -u ilo-pce illumio-pce-db-management supercluster-upgrade-status
Bring the member PCE back to operational status.
On any node of the member PCE, set the runlevel to 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
Setting the runlevel can take time to complete.
On any node of the member PCE, verify that the set-runlevel command finished and the cluster status is RUNNING:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
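Repeat the three member PCE steps (upgrade the software, migrate the database, and bring the PCE back to operational status) for each additional member PCE.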
Verify that you can log in to the PCE web console on each PCE in the Supercluster.
The upgrade is complete.
During Supercluster Rolling Upgrade
During a rolling upgrade, if you log in to one of the PCEs, you will see a banner that states the Supercluster is in the process of a rolling upgrade.
The PCE Health page on the Leader displays the upgrade status for each PCE. The Upgrade Status column shows Pending if the PCE is in the process of being upgraded, and it shows Complete when the upgrade is complete. When the upgrade is finished, the Upgrade Status column no longer appears.
Rolling Upgrade Paths
For rolling upgrades, the major and minor versions must be the same, and only the patch number can be different (21.2.1: major 21, minor 2, patch 1).
This means that only HotFix (HF) or Maintenance Releases (MR) are qualified for Rolling Upgrades. Also, the HF or MR should NOT contain a migration script that changes the replication tables.
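The version rule above can be sketched as a quick pre-check. It assumes three-part numeric versions, and the function name same_major_minor is hypothetical. Passing this check is necessary but not sufficient: a release can still be excluded because it changes the replication tables, so the upgrade path tables remain the reference.

```shell
# same_major_minor A B: succeed when versions A and B share the same major
# and minor numbers, so that only the patch number can differ.
# Assumes both versions have exactly three numeric parts, such as 21.2.1.
# The body runs in a subshell so the IFS change stays local.
same_major_minor() (
  set -f            # no filename expansion while splitting
  IFS=.
  set -- $1 $2      # $1..$3 = parts of A, $4..$6 = parts of B
  [ "$#" -eq 6 ] || exit 2
  [ "$1" -eq "$4" ] && [ "$2" -eq "$5" ]
)

# Usage:
#   if same_major_minor 21.2.0 21.2.1; then
#     echo "version numbers permit a rolling upgrade; check the path tables"
#   fi
```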
The tables below indicate whether Rolling Upgrade paths are allowed.
Upgrade Path | Allowed (Yes/No) | Notes |
---|---|---|
21.2.0 → 21.2.1 | Yes | |
21.2.1 → 21.2.2 | No | |
21.2.2 → 21.2.3 | Yes | |
21.2.3 → 21.2.4 | Yes | |
21.2.4 → 21.2.7 | Yes | |
21.5.2 | n/a | SaaS only release |
21.5.3 → 21.5.10 | Yes | |
21.5.10 → 21.5.12 | Yes | |
21.5.12 → 21.5.20 | Yes | |
21.5.20 → 21.5.21 | Yes | |
21.5.21 → 21.5.30 | Yes | |
21.5.30 → 21.5.31 | Yes | |
21.5.31 → 21.5.32 | Yes | |
21.5.32 → 21.5.33 | Yes | |
22.2.1 → 22.2.10 | Yes | |
22.2.10 → 22.2.20 | No | |
22.2.20 → 22.2.30 | Yes | |
22.2.30 → 22.2.40 | No | |
22.5.0 → 22.5.1 | Yes | |
22.5.1 → 22.5.2 | Yes | |
22.5.2 → 22.5.10 | No | PromoteCompatibilityCheckReports |
22.5.10 → 22.5.20 | No | AddLogFlowToRules |
22.5.20 → 22.5.21 | Yes | |
22.5.21 → 22.5.22 | Yes | |
22.5.22 → 22.5.23 | Yes | |
22.5.23 → 22.5.30 | Yes | |
22.5.30 → 22.5.31 | Yes | |
22.5.31 → 22.5.32 | Yes | |
23.5.10 → 23.5.20 | No | Postgres upgrade |
23.5.22 → 23.5.31 | No | Postgres upgrade |
Note
This list will be updated for newly released versions. Currently, there are no Rolling Upgrades for versions 24.x.
Upgrade Path | Allowed (Yes/No) | Notes |
---|---|---|
21.2.0 → 21.2.2 | No | |
21.2.0 → 21.2.3 | No | Due to 21.2.2 |
21.2.0 → 21.2.4 | No | Due to 21.2.2 |
21.2.1 → 21.2.3 | No | Due to 21.2.2 |
21.2.1 → 21.2.4 | No | Due to 21.2.2 |
21.2.2 → 21.2.4 | Yes | |
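The second table reads as follows: a jump across several releases is not allowed when it passes over a single-step path that is itself not allowed (here 21.2.1 → 21.2.2). The sketch below encodes that reading; it is an interpretation of the "Due to 21.2.2" notes, not a documented rule, and the function names are hypothetical. Versions are assumed to have three numeric parts.

```shell
# version_le A B: succeed when version A is older than or equal to version B.
# The body runs in a subshell so the IFS change stays local.
version_le() (
  set -f
  IFS=.
  set -- $1 $2      # $1..$3 = parts of A, $4..$6 = parts of B
  [ "$#" -eq 6 ] || exit 2
  if [ "$1" -ne "$4" ]; then
    [ "$1" -lt "$4" ]
  elif [ "$2" -ne "$5" ]; then
    [ "$2" -lt "$5" ]
  else
    [ "$3" -le "$6" ]
  fi
)

# crosses_blocked_hop FROM TO BLOCKED_FROM BLOCKED_TO
# Succeed when upgrading FROM -> TO passes over the disallowed single step
# BLOCKED_FROM -> BLOCKED_TO, that is, when FROM <= BLOCKED_FROM and
# BLOCKED_TO <= TO.
crosses_blocked_hop() {
  version_le "$1" "$3" && version_le "$4" "$2"
}

# Usage, with the disallowed step 21.2.1 -> 21.2.2 from the first table:
#   if crosses_blocked_hop 21.2.0 21.2.3 21.2.1 21.2.2; then
#     echo "rolling upgrade not allowed"
#   fi
```

With the disallowed step 21.2.1 → 21.2.2, this check agrees with every row of the second table, including the allowed 21.2.2 → 21.2.4 path.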
Supercluster Listen Only Mode
The PCE Listen Only mode allows you to stop the PCE from sending policy changes to your VENs. Listen Only mode is typically enabled in these situations:
During PCE maintenance windows, and when starting the PCE back up
After restoring the PCE from a backup
During maintenance windows for other parts of your network environment
In Listen Only mode, VENs still report updated workload information to the PCE, but the PCE does not modify the firewall rules on any workloads or send any updates from the PCE to the VENs. Also, the PCE does not mark workloads as Offline, and does not remove them from policy when Listen Only mode is enabled.
When this mode is enabled, you can still write policy, pair new workloads, provision policy changes, and assign or change workload Labels, but changes will not be sent to the VENs until you disable Listen Only mode. You can disable Listen Only mode when you are ready to resume normal policy operations.
Enable PCE Listen Only Mode
On all nodes in the PCE cluster, stop the PCE software:
sudo -u ilo-pce illumio-pce-ctl stop
On all nodes in the PCE cluster, set the node to runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
On any data node, enable Listen Only mode:
sudo -u ilo-pce illumio-pce-ctl listen-only-mode enable
Set the PCE runlevel to 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
Disable PCE Listen Only Mode
Note
The command to disable PCE Listen Only mode can be executed at either runlevel 1 or 5.
Important
Disabling PCE Listen Only mode is not required for rolling upgrades, but it is still required for a simple upgrade.
On all nodes in the PCE cluster, stop the PCE software:
sudo -u ilo-pce illumio-pce-ctl stop
On all nodes in the PCE cluster, set the node to runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
On any data node, disable Listen Only mode:
sudo -u ilo-pce illumio-pce-ctl listen-only-mode disable
Set the PCE runlevel to 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5