Illumio Core 23.5 Administration Guide

Database Migration, Failover, and Restore

This section describes how to perform database management tasks.

Migrate PCE Databases

These steps explain how to migrate the PCE database from a previous version to the current one. You must run the migrate command at runlevel 1 in the following cases:

  • After you have upgraded to a newer version of the PCE software.

  • After restoring a backup file from a previous version of the PCE software.

  • After you have completed a new PCE build and installation and initialized the database by using the illumio-pce-db-management setup command.

To migrate the PCE database:

  1. On any node, migrate the PCE database:

    $ sudo -u ilo-pce illumio-pce-db-management migrate
  2. On the primary database, set the cluster to runlevel 5:

    $ sudo -u ilo-pce illumio-pce-ctl set-runlevel 5 

    Setting runlevel might take some time to complete.

  3. Check the progress to see when the status is RUNNING:

    $ sudo -u ilo-pce illumio-pce-ctl cluster-status -w
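
The three steps above can be chained in a small wrapper script. The following is a minimal sketch, not an Illumio-provided tool. It assumes that you run it on the primary database node with the PCE already at runlevel 1. The run helper, the migrate_and_start function, and the DRY_RUN variable are hypothetical names introduced for illustration; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: migrate the PCE database, then return the cluster to runlevel 5.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute.
set -eu

# run: print the command in dry-run mode, otherwise execute it as ilo-pce.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sudo -u ilo-pce $*"
    else
        sudo -u ilo-pce "$@"
    fi
}

migrate_and_start() {
    # Step 1: migrate the database (the PCE must be at runlevel 1).
    run illumio-pce-db-management migrate
    # Step 2: set the cluster to runlevel 5; this might take some time.
    run illumio-pce-ctl set-runlevel 5
    # Step 3: watch the cluster status until it reports RUNNING.
    run illumio-pce-ctl cluster-status -w
}

migrate_and_start
```

Because of set -e, a failed migration stops the script before the runlevel is changed.
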
Manage Automatic Database Failover

When the primary database experiences a failure event lasting more than 2 minutes, the PCE automatically fails over to the backup database. Failing over the database causes other PCE services to restart. During the database failover period, REST API requests might fail and the PCE web console might become unresponsive.

When the original primary database node comes back online and rejoins the cluster, it detects that it is no longer the primary and becomes the backup database.

Determine Which Node Is Primary

Note

When you install the PCE software, the first data node you install becomes the primary database. Upgrading the PCE does not change the primary database to another data node.

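To display which node is currently the primary database, run the following command:

$ sudo -u ilo-pce illumio-pce-db-management show-master

View Auto Failover Mode

To view the current automatic failover mode, run the following command:

$ sudo -u ilo-pce illumio-pce-db-management get-auto-failover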

Example output:

$ sudo -u ilo-pce illumio-pce-db-management get-auto-failover

Database Failover mode: 'off' 
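
If a script needs the mode as a bare value, the output line can be parsed. The following is a minimal sketch that assumes the live output always has the single-line format shown in the example above; parse_failover_mode is a hypothetical helper name, not part of the PCE tooling.

```shell
# parse_failover_mode: read get-auto-failover output on stdin and print
# only the value between the single quotes.
parse_failover_mode() {
    sed -n "s/^Database Failover mode: '\([^']*\)'.*/\1/p"
}

# Sample line copied from the example output in this section.
printf '%s\n' "Database Failover mode: 'off' " | parse_failover_mode
```

On a PCE node, the same helper could be fed from the real command: sudo -u ilo-pce illumio-pce-db-management get-auto-failover | parse_failover_mode
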
Turn Auto Failover Off or On

Automatic failover is enabled by default. To disable it, run the following command:

$ sudo -u ilo-pce illumio-pce-db-management set-auto-failover off
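
This section shows only the off form of the command. The sketch below builds the command line for either mode, on the assumption that set-auto-failover accepts on symmetrically. That assumption is not confirmed by this guide, so check the command's built-in help before relying on it. failover_command is a hypothetical helper name; it prints the command and does not run it.

```shell
# failover_command: print the set-auto-failover command for the given mode.
# Only "off" appears in this guide; "on" is an ASSUMPTION made by symmetry.
failover_command() {
    case "${1:-}" in
        on|off)
            echo "sudo -u ilo-pce illumio-pce-db-management set-auto-failover $1"
            ;;
        *)
            echo "usage: failover_command on|off" >&2
            return 2
            ;;
    esac
}

failover_command off
```

Rejecting any value other than on or off keeps a typo from reaching the PCE command line.
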
Manual Database Failover

  1. Determine which node is running as the primary database:

    $ sudo -u ilo-pce illumio-pce-db-management show-master
  2. On the primary database node, stop the PCE software:

    $ sudo -u ilo-pce illumio-pce-ctl stop

    Wait roughly two minutes for the new node to take over.

  3. On the new database node, verify that the database service is running:

    $ sudo -u ilo-pce illumio-pce-db-management show-master
  4. On the previous primary database node in the PCE cluster, restart the PCE software:

    $ sudo -u ilo-pce illumio-pce-ctl start

    After the node starts, the PCE recognizes it as the replica database node and syncs it with the primary database node.
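
The parts of this procedure that run on the original primary node (steps 2 and 4) can be sketched as two shell functions. This is an illustration under stated assumptions, not an Illumio-provided script: act, stop_old_primary, restart_old_primary, DRY_RUN, and WAIT_SECONDS are hypothetical names, and the 120-second default simply encodes "roughly two minutes". Step 3, the show-master check on the new primary node, stays manual and happens between the two functions.

```shell
#!/bin/sh
# Sketch for the node that is the primary database BEFORE the failover.
# DRY_RUN=1 (the default here) prints each action instead of performing it.
set -eu

WAIT_SECONDS="${WAIT_SECONDS:-120}"   # "roughly two minutes", per step 2

# act: print the action in dry-run mode, otherwise execute it.
act() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: stop the PCE on the current primary, then wait for the takeover.
stop_old_primary() {
    act sudo -u ilo-pce illumio-pce-ctl stop
    act sleep "$WAIT_SECONDS"
}

# Step 4: call this only after show-master on the other data node
# confirms the takeover (step 3).
restart_old_primary() {
    act sudo -u ilo-pce illumio-pce-ctl start
}

stop_old_primary
```

Keeping the restart in a separate function avoids bringing the old primary back before the takeover has been verified.
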

Restore from Data Backup

This task describes how to restore a PCE cluster from a data backup.

You can restore to a different FQDN by using the --update-fqdn option when you restore the policy database. Before you start the PCE at runlevel 1, set the pce_fqdn option in runtime_env.yml to the new PCE FQDN.
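
Before starting the PCE at runlevel 1, it can help to confirm that runtime_env.yml already carries the new FQDN. The following is a minimal sketch, assuming pce_fqdn appears as a simple unquoted key: value line. The helper names fqdn_in_runtime_env and check_fqdn are hypothetical, and pce-new.example.com is a placeholder. The demo runs against a throwaway file, so pass the path of your own runtime_env.yml when using it for real.

```shell
# fqdn_in_runtime_env FILE: print the pce_fqdn value found in FILE.
# Assumes an unquoted "pce_fqdn: value" line; a trailing comment is ignored.
fqdn_in_runtime_env() {
    sed -n 's/^[[:space:]]*pce_fqdn:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# check_fqdn FILE EXPECTED: succeed only if pce_fqdn in FILE equals EXPECTED.
check_fqdn() {
    actual="$(fqdn_in_runtime_env "$1")"
    if [ "$actual" = "$2" ]; then
        echo "ok: pce_fqdn is $actual"
    else
        echo "mismatch: pce_fqdn is '$actual', expected '$2'" >&2
        return 1
    fi
}

# Demo against a temporary file; pce-new.example.com is a placeholder FQDN.
demo_file="$(mktemp)"
printf 'pce_fqdn: pce-new.example.com\n' > "$demo_file"
check_fqdn "$demo_file" pce-new.example.com
rm -f "$demo_file"
```

This section does not show the full restore command line with --update-fqdn, so the sketch covers only the configuration check.
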

Note

Illumio recommends waiting at least 15 minutes after taking a policy database backup before you restore it. If you restore the backup sooner than that, the PCE might apply policy correctly to only some workloads.
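
The 15-minute recommendation can be checked before a restore. The following is a minimal sketch that assumes the dump file's modification time approximates when the backup was taken, which does not hold if the file was copied without preserving timestamps. backup_age_seconds and backup_old_enough are hypothetical helper names.

```shell
# backup_age_seconds FILE: seconds since FILE was last modified.
# Tries GNU stat first, then falls back to the BSD form.
backup_age_seconds() {
    now="$(date +%s)"
    mtime="$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1")"
    echo $((now - mtime))
}

# backup_old_enough FILE [MIN_SECONDS]: succeed when FILE is at least
# MIN_SECONDS old. The default of 900 seconds is 15 minutes.
backup_old_enough() {
    min="${2:-900}"
    [ "$(backup_age_seconds "$1")" -ge "$min" ]
}

# Demo: a file created just now is too recent to restore.
demo_dump="$(mktemp)"
if backup_old_enough "$demo_dump"; then
    echo "backup is at least 15 minutes old"
else
    echo "backup is newer than 15 minutes; wait before restoring"
fi
rm -f "$demo_dump"
```
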

  1. On all nodes in the PCE cluster, stop the PCE software:

    $ sudo -u ilo-pce illumio-pce-ctl stop
  2. On all nodes in the PCE cluster, start the PCE at runlevel 1:

    $ sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
  3. On any node, verify the runlevel:

    $ sudo -u ilo-pce illumio-pce-ctl cluster-status -w
  4. On the data node that is running the agent_traffic_redis_server service, restore the policy database and then migrate it. (To determine which node this is, see the "Back Up the Policy Database" topic.)

    $ sudo -u ilo-pce illumio-pce-db-management restore --file /path/to/policy_db_dump_file
    $ sudo -u ilo-pce illumio-pce-db-management migrate
  5. Copy the Illumination data file from the primary data node that is running the agent_traffic_redis_server service to the replica data node. The file is at the following path on both nodes:

    persistent_data_root/redis/redis_traffic_0_master.rdb
  6. Restore the traffic database. Run this command on the same node where you took the traffic database backup.

    $ sudo -u ilo-pce illumio-pce-db-management traffic restore --file /path/to/traffic_db_dump_file

    When prompted to bring the PCE to runlevel 5, reply “yes” if you want the PCE to automatically finish migrating the traffic database and bring the PCE to fully operational status. Reply “no” if you don’t want to migrate the traffic database.

  7. If you replied "no" in the previous step, return the PCE cluster to runlevel 5:

    $ sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
  8. On any node, verify the runlevel is 5:

    $ sudo -u ilo-pce illumio-pce-ctl cluster-status -w
  9. Take the PCE out of Listen Only mode:

    $ sudo -u ilo-pce /opt/illumio-pce/illumio-pce-ctl listen-only-mode disable
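
Steps 4 and 6 through 9 can be sketched as a dry-run wrapper. This is an illustration only. It assumes that the policy restore and the traffic restore run on the same data node, which this procedure does not guarantee, and that all nodes are already at runlevel 1. The names run, restore_on_data_node, finish_restore, DRY_RUN, POLICY_DUMP, and TRAFFIC_DUMP are hypothetical. The dump paths are the same placeholders used in the steps above.

```shell
#!/bin/sh
# Sketch: restore commands for the data node, printed rather than executed
# unless DRY_RUN=0 is set.
set -eu

POLICY_DUMP="${POLICY_DUMP:-/path/to/policy_db_dump_file}"
TRAFFIC_DUMP="${TRAFFIC_DUMP:-/path/to/traffic_db_dump_file}"

# run: print the command in dry-run mode, otherwise execute it as ilo-pce.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "sudo -u ilo-pce $*"
    else
        sudo -u ilo-pce "$@"
    fi
}

restore_on_data_node() {
    # Step 4: restore the policy database, then migrate it.
    run illumio-pce-db-management restore --file "$POLICY_DUMP"
    run illumio-pce-db-management migrate
    # Step 5 (copying redis_traffic_0_master.rdb to the replica) is manual.
    # Step 6: restore the traffic database; the command prompts about runlevel 5.
    run illumio-pce-db-management traffic restore --file "$TRAFFIC_DUMP"
}

finish_restore() {
    # Step 7: needed only if you replied "no" at the prompt in step 6.
    run illumio-pce-ctl set-runlevel 5
    # Step 8: verify that the cluster reaches runlevel 5.
    run illumio-pce-ctl cluster-status -w
    # Step 9: leave Listen Only mode (full path, as written in step 9).
    run /opt/illumio-pce/illumio-pce-ctl listen-only-mode disable
}

restore_on_data_node
finish_restore
```

Because the traffic restore command is interactive, an unattended run with DRY_RUN=0 would still stop at its prompt.
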

Note

Explorer remains in maintenance mode for some time after the restore commands complete. The PCE becomes available immediately, but the Explorer database restore continues in the background.