VEN-to-PCE Communication
This topic discusses how the VEN communicates with the PCE for both Illumio Core Cloud customers and Illumio Core On-Premises customers.
Details about VEN-to-PCE Communication
On Prem
By default, the VEN communicates with a PCE installed in the customer's data center (On-Premises) over the following ports:
Port 8443 - HTTPS requests
Port 8444 - long-lived TLS-over-TCP connection
SaaS
The VEN communicates with the Illumio Core Cloud PCE over Port 443 for both HTTPS requests and the long-lived TLS-over-TCP connection.
The VEN uses Transport Layer Security (TLS) to connect to the PCE. The VEN must trust the PCE certificate before communication can occur.
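As a rough illustration of the port layout described above, the following Python sketch maps a deployment type to its ports and attempts a TLS handshake on each one. The function names, the deployment labels, and the example hostname are hypothetical. This is not an Illumio tool; it only checks that a TLS session can be established using the system trust store.

```python
import socket
import ssl

# Ports taken from the text above; the dictionary keys are illustrative labels.
PCE_PORTS = {
    "on-premises": {"https": 8443, "long_lived": 8444},
    "saas": {"https": 443, "long_lived": 443},
}


def ports_for(deployment: str) -> list[int]:
    """Return the distinct PCE ports a VEN needs to reach, in ascending order."""
    return sorted(set(PCE_PORTS[deployment].values()))


def can_handshake(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with a trusted certificate succeeds."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:  # includes ssl.SSLError, refused connections, and timeouts
        return False
```

For example, `[can_handshake("pce.example.com", p) for p in ports_for("on-premises")]` would probe ports 8443 and 8444 on a placeholder hostname. A `False` result can mean either that the port is unreachable or that the certificate is not trusted by the host running the check.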
The VEN sends the following details to the PCE:
Regular heartbeat with the latest hostname and other properties of the workload
Traffic log
Network interfaces
Processes
Open ports
Interactive users (Windows only)
Container workload information (C-VEN only)
The VEN receives the following details from the PCE:
Firewall policy
Lightning bolts/heartbeat responses with action to perform, such as sending a support report
Configurable Time for Heartbeat Warning
You can change how long the VEN can go without a heartbeat before it enters the Warning state. To change the 15-minute threshold in the PCE interface:
Go to Settings > Offline Timers.
Click Edit.
In the section Disconnect and Quarantine Warning, select Custom Timeout.
Specify a wait time.
Click Save.
VEN Connectivity
Online: The workload is connected to the network and can communicate with the PCE.
Offline: The workload is not connected to the network and cannot communicate with the PCE.
Suspended: The VEN is in the suspended state, and any rules programmed into the workload's iptables (including custom iptables rules) or Windows Filtering Platform firewalls are removed completely. No Illumio-related processes are running on the workload.
VEN Support for IPv6 Traffic
You can configure how VENs support IPv6 traffic. Go to Settings > Security and click the General tab:
For VEN releases 20.2.0 and later, choose one of these options:
Allow IPv6 traffic according to your policy
Block IPv6 traffic only when in Full Enforcement. (Traffic will always be allowed on AIX and Solaris workstations.)
For VEN releases pre-20.2.0, choose one of these options:
Allow all IPv6 traffic
Block IPv6 traffic only when in Full Enforcement. (Traffic will always be allowed on AIX and Solaris workstations.)
Communication Frequency
The following table shows the frequency of communications to the PCE for common VEN operations. See PCE Administration Guide for more details about these intervals and their effects.
Function | Frequency | Notes |
---|---|---|
Firewall policy updates | Real-time if lightning bolts are enabled. | If lightning bolts are disabled or the channel is not functional, policy updates are communicated to the VEN by a heartbeat action. |
Active service reporting | See note. | |
Interface reports and changes | Event driven. | Only if there are changes to the interfaces; otherwise, no data are sent. |
Traffic flow log | Every 10 minutes. | |
Heartbeat | Every 5 minutes. | If the PCE does not receive three consecutive heartbeats, an event is written to the PCE's event log. See also VEN Heartbeats and Lost Agents. |
Dead-peer interval | Configurable | Default is 60 minutes (or 12 heartbeats). See also VEN Offline Timers and Isolation. |
VEN tampering detection | Within a few seconds on Windows and Linux. | For more information, see Host Firewall Tampering Protection. |
VEN Heartbeats and Lost Agents
The VEN sends a heartbeat message to the PCE every five minutes to inform the PCE that it is up and running. If the VEN fails to send a heartbeat, check the workload where the VEN is installed and investigate any connectivity issues. If the VEN continues to fail to send a heartbeat, it is eventually marked Offline, which means it can no longer communicate with the PCE or other managed workloads.
PCE down or network issue and the VEN degraded state
If the VEN cannot connect to the PCE either because the PCE is down or because of a network issue, the VEN continues to enforce the last-known-good policy while it tries to reconnect with the PCE.
After missing three heartbeats, the VEN enters the degraded state. In the degraded state, the VEN ignores all the asynchronous commands received as lightning bolts from the PCE, except the commands for software upgrades and support reports.
After connectivity to the PCE is restored, the VEN comes out of the degraded state after three successful heartbeats.
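The entry and exit rules for the degraded state can be sketched as a small counter-based state tracker. This is only an illustrative model, not the VEN's implementation. The class and constant names are hypothetical, and the sketch assumes that both the three missed heartbeats and the three successful heartbeats must be consecutive.

```python
from dataclasses import dataclass

DEGRADED_THRESHOLD = 3  # missed heartbeats in a row before entering degraded
RECOVERY_THRESHOLD = 3  # successful heartbeats in a row before leaving degraded (assumed consecutive)


@dataclass
class HeartbeatTracker:
    degraded: bool = False
    missed: int = 0
    succeeded: int = 0

    def record(self, success: bool) -> bool:
        """Record one heartbeat outcome and return True while degraded."""
        if success:
            self.succeeded += 1
            self.missed = 0
            if self.degraded and self.succeeded >= RECOVERY_THRESHOLD:
                self.degraded = False
        else:
            self.missed += 1
            self.succeeded = 0
            if not self.degraded and self.missed >= DEGRADED_THRESHOLD:
                self.degraded = True
        return self.degraded
```

Feeding the tracker three failures followed by three successes moves it into the degraded state on the third failure and back out on the third success.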
Failed authentication and the VEN minimal state
If the VEN enters the degraded state because of failed authentications, the VEN enters a state called minimal. In the minimal state, the VEN only attempts to connect with the PCE every four hours through a heartbeat.
If the authentication failure was temporary, the VEN exits the minimal state after its first successful connection to the PCE. Whenever the VEN enters the minimal state, it stops the VTAP service. VTAP is then restarted when the VEN exits the minimal state.
If Kerberos authentication is used, the VEN attempts to refresh the agent token with a new Kerberos ticket before sending a heartbeat. If the authentication error is not recovered after four hours, the VEN sends a lost-agent message to the PCE which then logs a message in the Organization Events. The message informs the user that the VEN needs to be uninstalled or reinstalled manually on this workload.
VEN Offline Timers and Isolation
When the VEN on a workload is stopped, the VEN makes a "best effort" REST API goodbye call to the PCE. After a delay specified by the "workload goodbye timer" (a default of 15 minutes), the PCE marks the workload offline and removes it from the policy.
If the REST API call (goodbye) fails, or if the workload goes offline abruptly (for example, due to a power outage), the PCE stops receiving heartbeats from the workload. After the period of time configured in the PCE web console Settings > Offline Timers elapses, the PCE marks the workload offline and recomputes policies for the peer workloads to isolate the offline workload. If no time period has been configured, the default is 60 minutes, or 12 heartbeats.
The system_task.agent_missed_heartbeats_check alert is sent when 25% of the time configured in the offline timer has elapsed without a heartbeat. For example, if the offline timer is set to 1 hour, the alert is sent after the VEN has not sent a heartbeat for 15 minutes; if the offline timer is set to 4 hours, the alert is sent after the VEN has not sent a heartbeat for 1 hour. The same 25% rule applies when a user has customized the timer.
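The 25% rule is simple arithmetic, sketched here in Python for checking a custom timer value. The function and constant names are hypothetical and do not correspond to any PCE API.

```python
from datetime import timedelta

WARNING_FRACTION = 0.25  # share of the offline timer after which the alert is sent


def missed_heartbeat_alert_delay(offline_timer: timedelta) -> timedelta:
    """Return how long a VEN can be silent before the missed-heartbeat alert."""
    return offline_timer * WARNING_FRACTION
```

With the default 60-minute offline timer, `missed_heartbeat_alert_delay(timedelta(hours=1))` yields 15 minutes, which matches the example in the text.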
Sampling Mode for VENs
If the VEN receives a sustained high rate of traffic from many individual connections, the VEN enters Sampling Mode to reduce the load. Sampling Mode is a protection mechanism that keeps the VEN from contributing to excessive CPU consumption. In Sampling Mode, not every flow is reported. Instead, flows are periodically sampled and logged.
After CPU usage on the VEN decreases, Sampling Mode is disabled and each connection is again reported to the PCE. The VEN enters and exits Sampling Mode automatically, depending on its load.
Details about entering and exiting Sampling Mode are captured in /opt/illumio_ven_data/log/vtap.log. Look for Entering and Exiting throttle state.
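A minimal sketch for pulling those entries out of the log follows. It assumes the two marker strings appear verbatim in the relevant log lines; the exact log line format is not documented here, so treat the matching as a starting point. The function name is hypothetical.

```python
from pathlib import Path

VTAP_LOG = Path("/opt/illumio_ven_data/log/vtap.log")  # path from the text above
MARKERS = ("Entering", "Exiting throttle state")  # strings named in the text


def sampling_mode_events(lines):
    """Return the lines that contain either marker, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in MARKERS)]


# Example use on a workload (reading the log may require elevated privileges):
#     with VTAP_LOG.open(errors="replace") as fh:
#         for event in sampling_mode_events(fh):
#             print(event)
```

Because "Entering" on its own is a fairly generic word, the result may include unrelated lines and is best reviewed by eye.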
Linux TCP Timeout Variable
For VENs installed on Linux workloads, the VEN relies on conntrack to manage the nf_conntrack_tcp_timeout_established variable.
By default, as soon as the VEN is installed, it sets nf_conntrack_tcp_timeout_established to eight hours (28,800 seconds). This timeout manages workload memory by removing unused connections from the conntrack table, thereby improving performance.
If you change the value via sysctl, it is reverted the next time the workload is rebooted or the next time the VEN's configuration file is read.
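To check the value currently in effect, you can read the kernel setting directly. The sketch below assumes the standard procfs location for this sysctl, which exists only when the nf_conntrack module is loaded. The function names are hypothetical.

```python
from pathlib import Path

# Standard procfs path for net.netfilter.nf_conntrack_tcp_timeout_established.
CONNTRACK_TIMEOUT = Path("/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established")
VEN_DEFAULT_SECONDS = 28_800  # eight hours, per the text above


def read_timeout(path: Path = CONNTRACK_TIMEOUT) -> int:
    """Return the configured timeout in seconds."""
    return int(path.read_text().strip())


def matches_ven_default(path: Path = CONNTRACK_TIMEOUT) -> bool:
    """Return True if the timeout equals the value the VEN sets by default."""
    return read_timeout(path) == VEN_DEFAULT_SECONDS
```

The `path` parameter exists so that the check can be pointed at a different file when testing away from a Linux workload.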
Wireless Connections and VPNs
The Illumio Core VEN supports wireless connections for VENs installed on endpoints in the Illumio Core.
For more information about installing the VEN on an endpoint, and supporting a wireless network connection, see the Endpoint Installation and Usage Guide.
Note
Wireless network support is available only for endpoints in Illumio Core. It is not available for other supported server types, such as bare-metal servers, virtual machines (VMs), or container hosts.
Show Amount of Data Transfer
The 'show amount of data transfer' capability on the PCE is a preview feature available with the 20.2.0 release. The PCE reports the amount of data transferred into and out of workloads and applications in a datacenter. The number of bytes sent by the provider of an application and the number received by it are reported separately. These values can be seen in traffic flow summaries streamed out of the PCE. This capability can be enabled on a per-workload basis in the Workload page. It can also be enabled in the pairing profile so that workloads are directly paired into this mode.
After the feature is enabled, the VEN starts reporting the number of bytes transferred over the connections. The PCE collects this data, adds relevant information such as labels, and sends the traffic flow summaries out of the PCE.
The direction reported in the flow summary is from the viewpoint of the provider of the flow.
Destination Total Bytes Out (dst_tbo): Number of bytes transferred out of the provider (Connection Responder)
Destination Total Bytes In (dst_tbi): Number of bytes transferred into the provider (Connection Responder)
The number of bytes includes:
L3 and L4 header sizes of each packet (IP Header and TCP Header)
Sizes of multiple headers that may be included in communication (when SecureConnect is enabled)
Retransmitted packets.
All bytes transferred in the packets of a connection are included in the measurement. This is similar to how networking products such as firewalls, span-port measurement tools, and other network traffic measurement tools measure traffic.
Term | Description |
---|---|
dst_tbi | Destination Total Bytes In: Total bytes received so far by the destination over the flows included in this flow-summary in the latest sampled interval. This is the same as bytes sent by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder. |
dst_tbo | Destination Total Bytes Out: Total bytes sent so far by the destination over the flows included in this flow-summary in the latest sampled interval. This is the same as bytes received by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder. |
dst_dbi | Destination Delta Bytes In: Number of bytes received by the destination in the latest sampled interval, over the flows included in this flow-summary. This is the same as bytes sent by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder. |
dst_dbo | Destination Delta Bytes Out: Number of bytes sent by the destination in the latest sampled interval, over the flows included in this flow-summary. This is the same as bytes received by the source. Present in 'A', 'C', and 'T' flow-summaries. source = client = connection initiator, destination = server = connection responder. |
interval_sec T | Time Interval in Seconds: Duration of the latest sampled interval over which the above metrics are valid. |
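As one possible use of these fields, the delta counters and the interval length can be combined into an average transfer rate for the provider. The sketch below assumes a flow summary has already been parsed into a Python dictionary keyed by `dst_dbi`, `dst_dbo`, and `interval_sec`. That record shape and the function name are assumptions for illustration, not the PCE's documented export format.

```python
def provider_throughput(summary: dict) -> dict:
    """Return average bytes per second into and out of the provider for one interval."""
    interval = summary["interval_sec"]
    if interval <= 0:
        raise ValueError("interval_sec must be positive")
    return {
        "in_bytes_per_sec": summary["dst_dbi"] / interval,
        "out_bytes_per_sec": summary["dst_dbo"] / interval,
    }
```

The delta fields are used rather than the totals because they cover only the latest sampled interval, which is the period that `interval_sec` describes.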
Connection State | Description |
---|---|
A | Active: The connection was still active at the time the record was posted. Typically observed with long-lived flows on both the source and destination sides of communication. |
T | Timed Out: The flow no longer exists. It has timed out. Typically observed on the destination side of communication. |
C | Closed: The flow no longer exists. It has been closed. Typically observed on the source side of communication. |
S | Snapshot: The connection was active at the time the VEN sampled the flow. Typically observed when the VEN is in the Idle state. |