Single Cluster Usage¶
Assumptions about cluster zones¶
A k8s zone is a set of cluster nodes that share the same value of the k8s label key topology.kubernetes.io/zone. For example, zone data-1 looks like this:
$ oc get nodes -l topology.kubernetes.io/zone=data-1
NAME              STATUS   ROLES    AGE     VERSION
compute-0         Ready    worker   7d14h   v1.20.0+bafe72f
compute-1         Ready    worker   7d14h   v1.20.0+bafe72f
compute-2         Ready    worker   7d14h   v1.20.0+bafe72f
control-plane-0   Ready    master   7d14h   v1.20.0+bafe72f
We assume that there are 3 zones in the cluster, and that every node belongs to some zone, e.g.:
$ oc get nodes -L topology.kubernetes.io/zone
NAME              STATUS   ROLES    AGE   VERSION           ZONE
compute-0         Ready    worker   8d    v1.20.0+bafe72f   data-1
compute-1         Ready    worker   8d    v1.20.0+bafe72f   data-1
compute-2         Ready    worker   8d    v1.20.0+bafe72f   data-1
compute-3         Ready    worker   8d    v1.20.0+bafe72f   data-2
compute-4         Ready    worker   8d    v1.20.0+bafe72f   data-2
compute-5         Ready    worker   8d    v1.20.0+bafe72f   data-2
control-plane-0   Ready    master   8d    v1.20.0+bafe72f   data-1
control-plane-1   Ready    master   8d    v1.20.0+bafe72f   data-2
control-plane-2   Ready    master   8d    v1.20.0+bafe72f   arbiter
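The assumption that every node belongs to some zone can be checked programmatically. The following is a minimal sketch, not part of ocp-network-split: it assumes you have the parsed JSON of `oc get nodes -o json` at hand, and the helper names `group_nodes_by_zone` and `unzoned_nodes` are hypothetical.

```python
from collections import defaultdict

ZONE_LABEL = "topology.kubernetes.io/zone"


def group_nodes_by_zone(node_list):
    """Map each zone label value to the sorted names of its nodes.

    `node_list` is the parsed JSON of `oc get nodes -o json`.
    Nodes without the zone label are collected under the key None.
    """
    zones = defaultdict(list)
    for item in node_list.get("items", []):
        metadata = item.get("metadata", {})
        zone = metadata.get("labels", {}).get(ZONE_LABEL)
        zones[zone].append(metadata.get("name"))
    return {zone: sorted(names) for zone, names in zones.items()}


def unzoned_nodes(node_list):
    """Return names of nodes that have no zone label at all."""
    return group_nodes_by_zone(node_list).get(None, [])
```

An empty result from `unzoned_nodes` means that every node carries the zone label, and the keys of `group_nodes_by_zone` show how many zones the cluster has.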
There is no limitation on the design of cluster zones or their names (values of the topology.kubernetes.io/zone label key). The ocp-network-split tool refers to zones by single-letter names (such as a, b, …; see ocpnetsplit.zone.ZONES), so you only need to create a mapping between the ocp-network-split names and the actual zone names, as shown in the following sections.
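One way to keep such a mapping in a script is a plain dictionary that is turned into the -a, -b, -c options used by the setup tool later in this document. This is only a sketch under assumptions: the helper `setup_args` is hypothetical, it does not consult ocpnetsplit.zone.ZONES, and it does not cover the external zone x.

```python
def setup_args(zone_map, output=None):
    """Build an argument list for ocp-network-split-setup.

    `zone_map` maps single-letter ocp-network-split zone names to the
    actual zone names, e.g. {"a": "arbiter", "b": "data-1", "c": "data-2"}.
    `output` is an optional file name passed via the -o option.
    """
    args = ["ocp-network-split-setup"]
    for letter in sorted(zone_map):
        if len(letter) != 1 or not letter.islower():
            raise ValueError(f"not a single-letter zone name: {letter!r}")
        args += [f"-{letter}", zone_map[letter]]
    if output is not None:
        args += ["-o", output]
    return args
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting issues with unusual zone names.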
External zone¶
Besides normal cluster zones, there is a special zone x which represents external services running outside of the cluster. Specifying a list of IP addresses for the x zone allows you to block traffic to these IP addresses in both directions later.
Command line tools¶
There are 2 command line tools:
ocp-network-split-setup: based on the given zone name assignment, it fetches the IP addresses of all nodes in every zone (to create an env file with the zone configuration), and creates a MachineConfig yaml file to deploy the zone configuration, along with the firewall script and systemd unit files, to every node of the cluster. This is done only once.
ocp-network-split-sched: schedules the given network split configuration, which starts at the given time and stops after the given number of minutes.
Setting up network split¶
Let’s look at the zone configuration generated by the setup script (the example also shows how to define the zone name mapping):
$ ocp-network-split-setup -a arbiter -b data-1 -c data-2 --print-env-only
ZONE_A="198.51.100.36"
ZONE_B="198.51.100.127 198.51.100.158 198.51.100.160 198.51.100.163"
ZONE_C="198.51.100.103 198.51.100.162 198.51.100.65 198.51.100.98"
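To review this output in a script rather than by eye, the env lines can be parsed into a dictionary. The sketch below assumes the output has exactly the `ZONE_<LETTER>="ip ip ..."` shape shown above; the function name `parse_zone_env` is hypothetical and not part of ocp-network-split.

```python
import ipaddress


def parse_zone_env(text):
    """Parse ZONE_X="ip ip ..." lines into {zone letter: [ip, ...]}.

    Lines that do not start with ZONE_ are ignored. Every address is
    validated with the ipaddress module, so a malformed entry raises
    ValueError instead of passing silently.
    """
    zones = {}
    for line in text.splitlines():
        key, sep, raw = line.strip().partition("=")
        if not sep or not key.startswith("ZONE_"):
            continue
        # drop the surrounding quotes, then split on whitespace
        addresses = raw.strip().strip("\"'").split()
        zones[key[len("ZONE_"):].lower()] = [
            str(ipaddress.ip_address(addr)) for addr in addresses
        ]
    return zones
```

With the output above, the zone a entry would hold a single address (the arbiter node), while b and c would hold four addresses each.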
If this looks good, we can go on and create the MachineConfig yaml file, which you can inspect as well:
$ ocp-network-split-setup -a arbiter -b data-1 -c data-2 -o network-split.yaml
$ head network-split.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-network-split
spec:
  config:
    ignition:
      version: 3.1.0
Then you can use oc create to deploy the configuration:
$ oc create -f network-split.yaml
machineconfig.machineconfiguration.openshift.io/95-master-network-zone-config created
machineconfig.machineconfiguration.openshift.io/99-master-network-split created
machineconfig.machineconfiguration.openshift.io/95-worker-network-zone-config created
machineconfig.machineconfiguration.openshift.io/99-worker-network-split created
Note that there are 2 MachineConfig resources for each node type: network-zone-config provides the zone configuration and can be shared with the latency machine config (see below), while network-split provides the firewall split scripts.
Introducing additional network latency¶
If you need to configure additional artificial network latency between nodes from different cluster zones, you can specify the desired one-way latency in milliseconds via the --latency option.
In the following example, we use an extremely large value of 106 ms for demonstration purposes, which gives a full round-trip latency of 212 ms:
$ ocp-network-split-setup -a arbiter -b data-1 -c data-2 --latency 106 -o split-latency.yaml
$ oc create -f split-latency.yaml
machineconfig.machineconfiguration.openshift.io/95-master-network-zone-config created
machineconfig.machineconfiguration.openshift.io/99-master-network-latency created
machineconfig.machineconfiguration.openshift.io/99-master-network-split created
machineconfig.machineconfiguration.openshift.io/95-worker-network-zone-config created
machineconfig.machineconfiguration.openshift.io/99-worker-network-latency created
machineconfig.machineconfiguration.openshift.io/99-worker-network-split created
The additional latency is configured via a systemd service which is enabled to start during boot, so the latency takes effect almost immediately and remains applied even after a node reboot.
The only way to remove it is to delete its machineconfig resources.
Scheduling network split¶
When the machine config is applied (check oc get mcp to see whether both pools are updated), we can schedule a 5-minute network split of a particular configuration ab (cutting the connection between zones a and b) at a given time:
$ ocp-network-split-sched ab -t 2021-04-09T16:30 --split-len 5
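When splits are scheduled from a test script, the command line can be assembled from a datetime value. The following is a minimal sketch that only mirrors the options shown above (-t and --split-len); the helper name `sched_args` is hypothetical, and the source does not state in which time zone the -t value is interpreted, so the sketch formats the datetime as given without converting it.

```python
from datetime import datetime


def sched_args(split, start, split_len):
    """Build an argument list for ocp-network-split-sched.

    `split` is the split configuration (e.g. "ab"), `start` is a
    datetime formatted to minute precision as in the example above,
    and `split_len` is the split duration in minutes.
    """
    if split_len <= 0:
        raise ValueError("split length must be a positive number of minutes")
    return [
        "ocp-network-split-sched",
        split,
        "-t",
        start.strftime("%Y-%m-%dT%H:%M"),
        "--split-len",
        str(split_len),
    ]
```

For example, `sched_args("ab", datetime(2021, 4, 9, 16, 30), 5)` reproduces the arguments of the command above.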
When the time details are omitted, the sched script just lists the network split timers for the given split configuration on all nodes. In the following example, we can see that one split was scheduled 26 minutes ago, while another is going to happen in about 4 minutes:
$ ocp-network-split-sched ab
node/compute-0
NEXT                         LEFT           LAST                         PASSED     UNIT                                     ACTIVATES
Fri 2021-04-09 14:30:00 UTC  3min 50s left  n/a                          n/a        network-split-ab-setup@1617978600.timer  network-split@ab.service
n/a                          n/a            Fri 2021-04-09 14:00:00 UTC  26min ago  network-split-ab-setup@1617976800.timer  network-split@ab.service
node/compute-1
NEXT                         LEFT           LAST                         PASSED     UNIT                                     ACTIVATES
Fri 2021-04-09 14:30:00 UTC  3min 48s left  n/a                          n/a        network-split-ab-setup@1617978600.timer  network-split@ab.service
n/a                          n/a            Fri 2021-04-09 14:00:00 UTC  26min ago  network-split-ab-setup@1617976800.timer  network-split@ab.service
... rest of the output is omitted ...
You can schedule multiple splits in advance, or wait for one network split to end before going on with another one.
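In the timer listing above, the number after @ in each timer unit name matches the Unix timestamp of the NEXT or LAST column (1617978600 corresponds to 2021-04-09 14:30:00 UTC). Assuming this naming pattern holds in general, which the source does not state explicitly, the start time of a split can be recovered from the unit name alone. The helper `parse_timer_unit` below is a hypothetical sketch, not part of ocp-network-split.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the unit names in the listing above (an assumption).
TIMER_RE = re.compile(
    r"^network-split-(?P<split>[a-z]+)-setup@(?P<epoch>\d+)\.timer$"
)


def parse_timer_unit(unit):
    """Return (split configuration, start time in UTC) for a timer unit name."""
    match = TIMER_RE.match(unit)
    if match is None:
        raise ValueError(f"unexpected timer unit name: {unit!r}")
    start = datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc)
    return match["split"], start
```

This can help when planning several splits in advance, for instance to check that two scheduled splits of the same configuration do not overlap.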