Illumio Administration Guide 25.4

Manage Multi-Node Traffic Database

You can scale traffic data by sharding it across multiple PCE data nodes. This can be done when first installing the PCE.

You can also expand an existing traffic database to multiple nodes and change the number of nodes as needed. Reasons for doing so include:

  • If ingestion or Explorer performs poorly with a single-node traffic database, migrating to a multi-node traffic database may resolve the problem.

  • A multi-node traffic database may be required if you need to store more data than the single-node traffic database can handle (for example, if you want to store 90 days of data).

Expand the Existing Traffic Database to Multiple Nodes

Use the following steps to reconfigure an existing PCE cluster to scale the traffic database to multiple nodes. The PCE must be taken offline for a maintenance window, the duration of which depends on the amount of data in the traffic database. For a 400 GB database, expect up to approximately 3 hours of downtime.

  1. On any data node, run the following command to back up the traffic database:

    sudo -u ilo-pce illumio-pce-db-management traffic dump --file trafficdb-backup.tar.gz
  2. On any data node, run the following command to back up the reporting database:

    sudo -u ilo-pce illumio-pce-db-management report dump --file reportdb-backup.tar.gz
  3. On all new nodes, run the following command to allow multi-node traffic, where the address is the IP address of each new node:

    illumio-pce-ctl cluster-nodes allow <address>
  4. On all nodes, stop the PCE:

    sudo -u ilo-pce illumio-pce-ctl stop
  5. Install the PCE software on the new coordinator and worker nodes, using the same version of the PCE on the existing cluster nodes. There must be exactly two (2) coordinator nodes and two (2) or more pairs of worker nodes.

  6. Update the runtime_env.yml configuration on every node (both the newly added nodes and those already in the PCE cluster) as follows:

    • Set the cluster type to 4node_dx for a 2x2 PCE or 6node_dx for a 4x2 PCE.

    • In the traffic_datastore section, set num_worker_nodes to the number of worker node pairs. For example, if the PCE cluster has 4 worker nodes, set this parameter to 2.

    • On each coordinator node, in addition to the settings already described, set node_type to citus_coordinator.

    • On each worker node, in addition to the settings already described, set node_type to citus_worker.

    • If you are using a split-datacenter deployment, set the datacenter parameter on each node to an arbitrary value that indicates what part of the datacenter the node is in.

  7. Check the runtime configuration:

    sudo -u ilo-pce illumio-pce-env check
  8. On all nodes, start the PCE at runlevel 1:

    sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
  9. When the PCE is up and running at runlevel 1, restore the reporting database backup that you made in step 2. Run this command on the node where you took the backup:

    sudo -u ilo-pce illumio-pce-db-management report restore --file reportdb-backup.tar.gz
  10. On one of the coordinator nodes, migrate the traffic database. This will create the database on the coordinator node.

    sudo -u ilo-pce illumio-pce-db-management traffic migrate
  11. On the node where you took the backup, restore the traffic database backup that you made in step 1:

    sudo -u ilo-pce illumio-pce-db-management traffic restore --file trafficdb-backup.tar.gz

    When prompted, reply Y if you want to bring the PCE up to runlevel five (5) while the database restore continues in the background. This makes all PCE features except Explorer available immediately, without waiting for the restore to complete.

    If you do not choose to go to runlevel five (5) at this time, you can do so later by running the following command on any node:

    sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
  12. On any node, check the cluster status:

    sudo -u ilo-pce illumio-pce-ctl cluster-status -w
  13. When the cluster status is UP and RUNNING, verify successful setup. Log in to the PCE web console and verify that the health of the PCE is good. Check Explorer by running a few queries.
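
The num_worker_nodes setting in step 6 counts worker node pairs, not individual worker nodes. As a minimal sketch of that arithmetic, the shell helper below derives the value from the number of worker nodes and rejects counts that cannot form pairs. The function name worker_pairs is hypothetical and is not part of the PCE tooling.

```shell
# Hypothetical helper (not part of the PCE tooling): derive the value for
# traffic_datastore.num_worker_nodes from the number of individual worker nodes.
worker_pairs() {
  local workers="$1"
  # Accept only positive whole numbers without a leading zero.
  case "$workers" in
    ''|*[!0-9]*|0*)
      echo "worker count must be a positive integer" >&2
      return 1
      ;;
  esac
  # Worker nodes function in pairs, so an odd count is invalid.
  if [ $((workers % 2)) -ne 0 ]; then
    echo "worker nodes must be deployed in pairs (got $workers)" >&2
    return 1
  fi
  # num_worker_nodes is the number of pairs.
  echo $((workers / 2))
}
```

For a cluster with 4 worker nodes, worker_pairs 4 prints 2, which matches the example given in step 6.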

Add or Remove a Worker Node

Use the following steps to add or remove a worker node in a multi-node traffic database. The PCE will have to be taken offline for a maintenance window, the duration of which depends on the amount of data in the traffic database.

Warning

Be sure that the final number of worker nodes is an even number. Worker nodes can only function in groups of two.

  1. On any data node, run the following command to back up the traffic database:

    sudo -u ilo-pce illumio-pce-db-management traffic dump --file trafficdb_backup.tar.gz
  2. On any node, set the PCE to runlevel 1:

    sudo -u ilo-pce illumio-pce-ctl set-runlevel 1
  3. When removing a node, run the following command on the node you are removing:

    sudo -u ilo-pce illumio-pce-ctl cluster-leave
  4. On all nodes, stop the PCE cluster:

    sudo -u ilo-pce illumio-pce-ctl cluster-stop
  5. On every PCE node, update the value of traffic_datastore.num_worker_nodes in runtime_env.yml. The value must always be half the number of individual worker nodes, because the worker nodes are configured in pairs. For example, if the cluster has 4 worker nodes, set this parameter to 2.

  6. On all nodes, start the PCE at runlevel 1:

    sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
  7. On the data node where you took the backup, restore the traffic database backup that you made in step 1:

    sudo -u ilo-pce illumio-pce-db-management traffic restore --file trafficdb_backup.tar.gz
  8. On any node, set the PCE to runlevel 5:

    sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
  9. Verify the successful setup. Log in to the PCE web console and verify that the PCE's health is good. Check Explorer by running a few queries.
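
Because the traffic database is restored from the step 1 backup after the cluster is reconfigured, it may be worth confirming that the backup file exists before taking the PCE offline. The following is a minimal sketch of such a check. The function name backup_looks_usable is hypothetical and is not part of the PCE tooling, and the check covers only the presence and size of the file, not the integrity of its contents.

```shell
# Hypothetical pre-flight check (not part of the PCE tooling): confirm that a
# backup file exists and is not empty before taking the PCE offline.
backup_looks_usable() {
  local file="$1"
  # The path must point to a regular file.
  if [ ! -f "$file" ]; then
    echo "backup not found: $file" >&2
    return 1
  fi
  # A zero-byte file suggests that the dump did not complete.
  if [ ! -s "$file" ]; then
    echo "backup is empty: $file" >&2
    return 1
  fi
  echo "backup present: $file"
}
```

One possible use is to run backup_looks_usable trafficdb_backup.tar.gz on the data node where the backup was taken, and to continue with step 2 only if it succeeds.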

Back Up and Restore Multi-Node Traffic Database

When your PCE cluster includes a multi-node traffic database, the data size increases, and the standard PCE backup and restore commands consume too much time and resources. To back up and restore multi-node traffic data, use pgbackrest instead.

Database Management Commands for Multi-Node Traffic Database

The following are some useful commands for getting information about a cluster in which the traffic database is distributed to multiple nodes.

To show the worker node configuration:

sudo -u ilo-pce illumio-pce-db-management traffic citus-worker-metadata

To show the worker primary nodes:

sudo -u ilo-pce illumio-pce-db-management traffic show-citus-worker-primaries

To show worker replication information:

sudo -u ilo-pce illumio-pce-db-management traffic show-citus-worker-replication-info
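
When checking a cluster after a change, it can be convenient to run the three informational commands above in one pass. The following is a minimal sketch. The function name show_traffic_cluster_info and the DRY_RUN variable are hypothetical and are not part of the PCE tooling; only the three subcommands come from this guide.

```shell
# Hypothetical wrapper (not part of the PCE tooling): run the three
# informational subcommands listed above in one pass.
# Set DRY_RUN=1 to print each command instead of executing it.
show_traffic_cluster_info() {
  local sub
  for sub in citus-worker-metadata \
             show-citus-worker-primaries \
             show-citus-worker-replication-info; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command that would be run.
      echo "sudo -u ilo-pce illumio-pce-db-management traffic $sub"
    else
      # Label each section, then stop at the first command that fails.
      echo "== $sub =="
      sudo -u ilo-pce illumio-pce-db-management traffic "$sub" || return 1
    fi
  done
}
```

Running DRY_RUN=1 show_traffic_cluster_info first shows the commands without executing them, which is a simple way to review what will run on the node.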