Manage Multi-Node Traffic Database
You can scale traffic data by sharding it across multiple PCE data nodes. This can be done when first installing the PCE. You can also expand an existing traffic database to multiple nodes and change the number of nodes as needed. Reasons for doing so include:
If ingestion or Explorer performs poorly with a single-node traffic database, migrating to a multi-node traffic database can resolve the problem.
If you need to store more data than the single-node traffic database can handle (for example, if you want to store 90 days of data), a multi-node traffic database may be required.
Expand Existing Traffic Database to Multiple Nodes
To reconfigure an existing PCE cluster to scale the traffic database to multiple nodes, use the following steps. The PCE must be taken offline for a maintenance window. The duration of this maintenance window depends on the amount of data in the traffic database. For a 400 GB database, the downtime can be up to approximately 3 hours.
On any data node, run the following command to back up the traffic database:
sudo -u ilo-pce illumio-pce-db-management traffic dump --file trafficdb-backup.tar.gz
On any data node, run the following command to back up the reporting database:
sudo -u ilo-pce illumio-pce-db-management report dump --file reportdb-backup.tar.gz
On all new nodes, run the following command to allow multi-node traffic, where the address is the IP address of each new node:
sudo -u ilo-pce illumio-pce-ctl cluster-nodes allow <address>
On all nodes, stop the PCE:
sudo -u ilo-pce illumio-pce-ctl stop
Install the PCE software on the new coordinator and worker nodes, using the same version of the PCE that is present on the existing nodes in the cluster. There must be exactly two (2) coordinator nodes. There must be two (2) or more pairs of worker nodes.
Update the runtime_env.yml configuration on every node (the new nodes you just added as well as the nodes that were already in the PCE cluster) as follows:
Set the cluster type to 4node_dx for a 2x2 PCE or 6node_dx for a 4x2 PCE.
In the traffic_datastore section, set num_worker_nodes to the number of worker node pairs. For example, if the PCE cluster has 4 worker nodes, set this parameter to 2.
On each coordinator node, in addition to the settings already described, set node_type to citus_coordinator.
On each worker node, in addition to the settings already described, set node_type to citus_worker.
If you are using a split-datacenter deployment, set the datacenter parameter on each node to an arbitrary value that indicates what part of the datacenter the node is in.
Check the runtime configuration:
sudo -u ilo-pce illumio-pce-env check
On all nodes, start the PCE at runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
When the PCE is up and running at level 1, restore the reporting database backup. Run this command on the node where you took the backup.
sudo -u ilo-pce illumio-pce-db-management report restore --file reportdb-backup.tar.gz
On one of the coordinator nodes, migrate the traffic database. This will create the database on the coordinator node.
sudo -u ilo-pce illumio-pce-db-management traffic migrate
On the node where you took the backup, restore the traffic database backup that you made in step 1:
sudo -u ilo-pce illumio-pce-db-management traffic restore --file trafficdb-backup.tar.gz
When prompted, reply Y if you want to bring the PCE up to runlevel 5 while the database restore continues in the background. This makes all PCE features except Explorer available immediately, without having to wait for the restore to complete.
If you do not choose to go to runlevel 5 at this time, you can do so later by running the following command on any node:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
On any node, check the cluster status:
sudo -u ilo-pce illumio-pce-ctl cluster-status -w
When the cluster status is UP and RUNNING, verify successful setup. Log in to the PCE web console and verify that the health of the PCE is good. Check Explorer by running a few queries.
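The runtime_env.yml settings from the configuration step above can be sketched as a small shell helper that prints the fragment for one node. This is an illustration only, not an Illumio tool: the function name print_runtime_env_fragment is hypothetical, and the key name cluster_type and the top-level placement of node_type and datacenter are assumptions. The procedure itself names only the values and the traffic_datastore.num_worker_nodes parameter, so compare the output with your existing runtime_env.yml before using it.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime_env.yml settings described in the
# procedure for a single node. Key names other than
# traffic_datastore.num_worker_nodes are assumptions.
# Usage: print_runtime_env_fragment <cluster_type> <worker_pairs> [node_type] [datacenter]
print_runtime_env_fragment() {
    cluster_type=$1
    worker_pairs=$2
    node_type=${3:-}
    datacenter=${4:-}

    # The procedure allows only these two cluster types.
    case "$cluster_type" in
        4node_dx|6node_dx) ;;
        *) echo "cluster type must be 4node_dx or 6node_dx" >&2; return 1 ;;
    esac

    printf 'cluster_type: %s\n' "$cluster_type"
    # node_type is set to citus_coordinator or citus_worker on those nodes only.
    if [ -n "$node_type" ]; then
        printf 'node_type: %s\n' "$node_type"
    fi
    # datacenter is needed only for split-datacenter deployments.
    if [ -n "$datacenter" ]; then
        printf 'datacenter: %s\n' "$datacenter"
    fi
    printf 'traffic_datastore:\n  num_worker_nodes: %s\n' "$worker_pairs"
}

# Example: a worker node in a 4x2 PCE with 4 worker nodes (2 pairs).
print_runtime_env_fragment 6node_dx 2 citus_worker dc1
```

The helper only prints text; it does not modify runtime_env.yml. After editing the real file, the illumio-pce-env check command from the procedure remains the way to validate the configuration.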
Add or Remove a Worker Node
To add or remove a worker node in a multi-node traffic database, use the following steps. The PCE will have to be taken offline for a maintenance window. The duration of this maintenance window depends on the amount of data in the traffic database.
Warning
Be sure that the final number of worker nodes is an even number. Worker nodes can only function in groups of two.
On any data node, run the following command to back up the traffic database:
sudo -u ilo-pce illumio-pce-db-management traffic dump --file trafficdb_backup.tar.gz
On any node, set the PCE to runlevel 1:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 1
When removing a node, run the following command on the node you are removing:
sudo -u ilo-pce illumio-pce-ctl cluster-leave
On all nodes, stop the PCE cluster:
sudo -u ilo-pce illumio-pce-ctl cluster-stop
On every PCE node, update the value of traffic_datastore.num_worker_nodes in runtime_env.yml. The value should always be half the number of individual worker nodes, because the worker nodes are configured in pairs. For example, if the PCE cluster has 6 worker nodes, set this parameter to 3.
On all nodes, start the PCE at runlevel 1:
sudo -u ilo-pce illumio-pce-ctl start --runlevel 1
On the data node where you took the backup, restore the traffic database backup that you made in step 1:
sudo -u ilo-pce illumio-pce-db-management traffic restore --file trafficdb_backup.tar.gz
On any node, set the PCE to runlevel 5:
sudo -u ilo-pce illumio-pce-ctl set-runlevel 5
Verify successful setup. Log in to the PCE web console and verify that the health of the PCE is good. Check Explorer by running a few queries.
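The arithmetic behind num_worker_nodes can be sketched as a shell function. Because worker nodes operate in pairs, the parameter is the number of pairs, consistent with the earlier example in which 4 worker nodes give a value of 2. The function name worker_pairs is hypothetical, and the check covers only the even-number rule from the warning above, not any minimum cluster size.

```shell
#!/bin/sh
# Hypothetical helper: derive num_worker_nodes (the number of worker node
# pairs) from the total number of individual worker nodes.
worker_pairs() {
    total=$1
    # Reject empty or non-numeric input.
    case "$total" in
        ''|*[!0-9]*) echo "worker node count must be a positive integer" >&2; return 1 ;;
    esac
    # Worker nodes only function in groups of two, so the count must be even.
    if [ "$total" -lt 2 ] || [ $((total % 2)) -ne 0 ]; then
        echo "worker node count must be an even number of 2 or more" >&2
        return 1
    fi
    echo $((total / 2))
}

worker_pairs 4   # prints 2
```

Running the check before editing runtime_env.yml catches an odd worker count before the cluster is stopped.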
Back Up and Restore Multi-Node Traffic Database
When your PCE cluster includes a multi-node traffic database, the data size increases, and the standard PCE backup and restore commands consume too much time and resources. To back up and restore multi-node traffic data, use pgbackrest instead. For more information, see Using pgbackrest for Traffic Data Backups.
Database Management Commands for Multi-Node Traffic Database
The following commands show information about a cluster where the traffic database is distributed across multiple nodes.
To show the worker node configuration:
sudo -u ilo-pce illumio-pce-db-management traffic citus-worker-metadata
To show worker primary nodes:
sudo -u ilo-pce illumio-pce-db-management traffic show-citus-worker-primaries
To show worker replication information:
sudo -u ilo-pce illumio-pce-db-management traffic show-citus-worker-replication-info
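To capture all three views in one pass, for example when recording cluster state before a maintenance window, the commands above can be wrapped in a short shell function. This is a sketch under stated assumptions: the function name show_traffic_db_info and the PCE_DB_MGMT override variable are hypothetical and not part of the PCE. Only the three subcommands themselves come from this section.

```shell
#!/bin/sh
# Hypothetical wrapper: run the three informational commands listed above
# in sequence and label each section of the output.
# PCE_DB_MGMT is an assumed override variable (not part of the PCE) that
# replaces the command prefix, for example when testing off-cluster.
show_traffic_db_info() {
    for sub in citus-worker-metadata \
               show-citus-worker-primaries \
               show-citus-worker-replication-info; do
        echo "== traffic $sub =="
        # Left unquoted on purpose so the default expands into separate words.
        ${PCE_DB_MGMT:-sudo -u ilo-pce illumio-pce-db-management} traffic "$sub" || return 1
    done
}

# Example (on a PCE node): show_traffic_db_info > traffic-db-info.txt
```

The function stops at the first failing command, so a partial report indicates which query did not complete.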