Overview of the network split approach¶
A quick overview of what and how ocp-network-split does to block network traffic between cluster zones.
Network split firewall script¶
Traffic from zone a
to zone b
is blocked by inserting DROP
rules
for each machine of zone b
into INPUT
and OUTPUT
chains of default
iptables
table on all machines of zone a
via iptables
tool.
This is implemented via network-split.sh
script, which consumes zone
configuration via ZONE_A
, ZONE_B
and ZONE_C
env variables, detects
zone it is running within and applies firewall changes based on the split
configuration which it received from the command line.
Split configuration specifies list of zone tuples, and the network split is made for traffic between each zone tuple. For example:
ab
means that traffic between zonea
andb
will be dropped in both directions (via changes in firewall configuration of zonea
)ab-bc
means that communication in both directions is blocked between zonea
and zoneb
, and also between zoneb
and zonec
One can see what changes will be made via -d
option:
$ export ZONE_A="198.51.100.27"
$ export ZONE_B="198.51.100.175 198.51.100.180 198.51.100.188 198.51.100.198"
$ export ZONE_C="198.51.100.115 198.51.100.192 198.51.100.174 198.51.100.208"
$ ./network-split.sh -d setup ab-ac
ZONE_A="198.51.100.27"
ZONE_B="198.51.100.175 198.51.100.180 198.51.100.188 198.51.100.198"
ZONE_C="198.51.100.115 198.51.100.192 198.51.100.174 198.51.100.208"
current zone: ZONE_A
ab: ZONE_B will be blocked from ZONE_A
iptables -A INPUT -s 198.51.100.175 -j DROP -v
iptables -A OUTPUT -d 198.51.100.175 -j DROP -v
iptables -A INPUT -s 198.51.100.180 -j DROP -v
iptables -A OUTPUT -d 198.51.100.180 -j DROP -v
iptables -A INPUT -s 198.51.100.188 -j DROP -v
iptables -A OUTPUT -d 198.51.100.188 -j DROP -v
iptables -A INPUT -s 198.51.100.198 -j DROP -v
iptables -A OUTPUT -d 198.51.100.198 -j DROP -v
ac: ZONE_C will be blocked from ZONE_A
iptables -A INPUT -s 198.51.100.115 -j DROP -v
iptables -A OUTPUT -d 198.51.100.115 -j DROP -v
iptables -A INPUT -s 198.51.100.192 -j DROP -v
iptables -A OUTPUT -d 198.51.100.192 -j DROP -v
iptables -A INPUT -s 198.51.100.174 -j DROP -v
iptables -A OUTPUT -d 198.51.100.174 -j DROP -v
iptables -A INPUT -s 198.51.100.208 -j DROP -v
iptables -A OUTPUT -d 198.51.100.208 -j DROP -v
Systemd Units¶
The firewall script is not used directly, but through stoppable oneshot
service template network-split@.service
. To use it, we need to chose
particular network split configuration, eg. ab-bc
, and then form so
called “instantiated” service name network-split@ab-ac.service
.
When such “instantiated” service is started, firewall changes to achieve
selected network split are applied and since then systemd is tracking this
service as started. Stopping the service reverts the firewall changes back,
removing the network split. The logs from the firewall script available via
journald as expected.
Example of starting network split for ab-bc
and checking it’s status:
# systemctl start network-split@ab-bc
# systemctl status network-split@ab-bc
● network-split@ab-bc.service - Firewall configuration for a network split
Loaded: loaded (/etc/systemd/system/network-split@.service; disabled; vendor preset: disabled)
Active: active (exited) since Sat 2021-03-06 00:23:18 UTC; 4min 49s ago
Process: 16380 ExecStart=/usr/bin/bash -c /etc/network-split.sh setup ab-bc (code=exited, status=0/SUCCESS)
Main PID: 16380 (code=exited, status=0/SUCCESS)
CPU: 8ms
Mar 06 00:23:18 compute-5 systemd[1]: Starting Firewall configuration for a network split...
Mar 06 00:23:18 compute-5 bash[16380]: ZONE_A="198.51.100.27"
Mar 06 00:23:18 compute-5 bash[16380]: ZONE_B="198.51.100.175 198.51.100.180 198.51.100.188 198.51.100.198"
Mar 06 00:23:18 compute-5 bash[16380]: ZONE_C="198.51.100.115 198.51.100.192 198.51.100.174 198.51.100.208"
Mar 06 00:23:18 compute-5 bash[16380]: current zone: ZONE_C
Mar 06 00:23:18 compute-5 bash[16380]: ab: ZONE_B will be blocked from ZONE_A
Mar 06 00:23:18 compute-5 bash[16380]: bc: ZONE_C will be blocked from ZONE_B
Mar 06 00:23:18 compute-5 systemd[1]: Started Firewall configuration for a network split.
This would work well on a single node, but in our case we need to apply this on multiple machines at the same time. Moreover we also need to make sure that the service is stopped after some time, reverting the network split issue. For this reason, we don’t start the network split service directly, but via systemd timers, which allows us to schedule start and stop of the network split service in advance at the same time on all nodes of the cluster.
For each network split configuration we have in stretch cluster test plan, there is one setup timer template which starts the service at given time:
network-split-ab-ac-setup@.timer
network-split-ab-setup@.timer
network-split-ab-bc-setup@.timer
network-split-bc-setup@.timer
And then single teardown timer template network-split-teardown@.timer
,
which is used to schedule stop of any of the network split services to revert
the firewall changes back into original state.
Parameter of these timer templates is a unix epoch timestamp of the time when
we intend to start or stop the network split, eg.
network-split-teardown@1614990498.timer
.
This is how a network split configuration is applied during test setup, and restored during test teardown.
References:
systemd.service(5) (for details about service templates or example of stoppable oneshot service)
MachineConfig¶
For the approach explained above to work, we need to deploy firewall script,
file with ZONE_{A,B,C}
environment variables and systemd service and timer
units. We achieve this via MachineConfig, which allows us to deploy files in
/etc
directory and system units on all nodes of both master
and
worker
MachineConfigPools.
Using openshift interface has an advantage of better visibility of such changes, which can be easily inspected via machine config operator (MCO) API. Downside of this approach is that MCO is going to drain and reboot every node one by one, which increases time necessary to deploy the configuration.
For this reason, we use MachineConfig only to deploy the script and unit files, while scheduling of the timers to setup and teardown a network split is done via direct connection (using ssh or oc debug) to each node.
References:
Ansible Playbook¶
In multi cluster mode, ansible playbook multisetup-netsplit.yml
is used
to deploy the scripts and systemd unit files mentioned above
to RHEL machines which are part of a zone but outside of any OpenShift cluster.
If multi cluster zones contain both OpenShift nodes and classic RHEL machines outside of any OpenShift cluster, one needs to use both MachineConfig and ansible playbook setup so that the network split scripts are deployed on all nodes of all zones.