[warning]
Status: Deprecated
This article is deprecated and no longer maintained.
Reason
The k8s-staticroute-operator has been deprecated. A new DOKS integrated routing agent has been released and should be used instead.
See Instead
Details on installing the new DOKS integrated routing agent can be found in How to Use the Routing Agent in Kubernetes Clusters.
Introduction
The main purpose of the Static Routes Operator is to offer greater flexibility and control over network traffic within your Kubernetes environment. It enables you to tailor the routing configuration to meet your application requirements and optimize network performance. It is deployed as a DaemonSet, so it runs on each node of your cloud provider's Managed Kubernetes cluster.
In this tutorial, you will learn to manage the routing table of each worker node based on the CRD spec and to set up a failover gateway.
Prerequisites
- A working Managed Kubernetes cluster from your cloud provider that you have access to.
- The kubectl CLI installed on your local machine and configured to point to your Managed Kubernetes cluster.
- Two or more NAT gateway Droplets configured and running, as detailed in the tutorial How to Configure a Droplet as a Gateway.
- A system for detecting gateway Droplet failures that fits your needs and detects failures clearly and accurately with minimal false alarms. You can use monitoring services such as Prometheus or Nagios, health check endpoints on the Droplet, or alerting tools such as Alertmanager for notifications. For this purpose, you can use a monitoring stack from our marketplace.
Note: Ensure your NAT Gateway Droplet is created in the same VPC as your Kubernetes cluster.
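As a rough illustration of failure detection with few false alarms, the sketch below reports a gateway as failed only after several consecutive probes fail. Everything in it is an assumption made for illustration: the function name gateway_has_failed, the GATEWAY_IP, PROBE_CMD, FAILURE_THRESHOLD, and PROBE_INTERVAL variables, and the use of ping as the default probe. A real setup would more likely rely on one of the monitoring tools listed above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: treat a gateway as failed only after several
# consecutive failed probes, to reduce false alarms.

# All variables below are illustrative assumptions, not operator settings.
GATEWAY_IP="${GATEWAY_IP:-10.116.0.4}"                 # private IP of the gateway Droplet
PROBE_CMD="${PROBE_CMD:-ping -c 1 -W 2 $GATEWAY_IP}"   # exits 0 when the gateway answers
FAILURE_THRESHOLD="${FAILURE_THRESHOLD:-3}"            # consecutive failures required
PROBE_INTERVAL="${PROBE_INTERVAL:-5}"                  # seconds between probes

# Returns 0 when FAILURE_THRESHOLD probes fail in a row (gateway considered
# failed). Returns 1 as soon as any probe succeeds (gateway considered healthy).
gateway_has_failed() {
  local failures=0
  while [ "$failures" -lt "$FAILURE_THRESHOLD" ]; do
    if $PROBE_CMD >/dev/null 2>&1; then
      return 1
    fi
    failures=$((failures + 1))
    sleep "$PROBE_INTERVAL"
  done
  return 0
}
```

Calling gateway_has_failed from a periodic job and alerting when it returns 0 is one possible way to use it. Requiring several failures in a row trades slower detection for fewer false positives.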
Below is the architecture diagram:
Deploying the Kubernetes Static Routes Operator
Deploy the latest release of the static routes operator to your Managed Kubernetes cluster using kubectl:
kubectl apply -f https://www.progressiverobot.com/
[info] Note: You can check the latest version in the releases path from the k8s-staticroute-operator GitHub repo.
Check if Operator Pods are up and running
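Verify that the operator is running. The command below lists the StaticRoute objects that the operator manages; entries appear only after StaticRoute manifests have been applied, such as those in the Testing the Setup section.
kubectl get staticroutes -o wide -n static-routes
The output looks similar to the following: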
[secondary_label Output]
NAME AGE DESTINATIONS GATEWAY
static-route-ifconfig.me 119s ["XX.XX.XX.XX"] XX.XX.XX.XX
static-route-ipinfo.io 111s ["XX.XX.XX.XX"] XX.XX.XX.XX
Now check the operator logs. No exceptions should be reported:
kubectl logs -f ds/k8s-staticroute-operator -n static-routes
You should observe the following output:
[secondary_label Output]
Found 2 pods, using pod/k8s-staticroute-operator-498vv
[2023-05-15 14:12:32,282] kopf._core.reactor.r [DEBUG ] Starting Kopf 1.35.6.
[2023-05-15 14:12:32,282] kopf._core.engines.a [INFO ] Initial authentication has been initiated.
[2023-05-15 14:12:32,283] kopf.activities.auth [DEBUG ] Activity 'login_via_pykube' is invoked.
[2023-05-15 14:12:32,285] kopf.activities.auth [DEBUG ] Pykube is configured in cluster with service account.
[2023-05-15 14:12:32,286] kopf.activities.auth [INFO ] Activity 'login_via_pykube' succeeded.
[2023-05-15 14:12:32,286] kopf.activities.auth [DEBUG ] Activity 'login_via_client' is invoked.
[2023-05-15 14:12:32,287] kopf.activities.auth [DEBUG ] Client is configured in cluster with service account.
[2023-05-15 14:12:32,288] kopf.activities.auth [INFO ] Activity 'login_via_client' succeeded.
[2023-05-15 14:12:32,288] kopf._core.engines.a [INFO ] Initial authentication has finished.
[2023-05-15 14:12:32,328] kopf._cogs.clients.w [DEBUG ] Starting the watch-stream for customresourcedefinitions.v1.apiextensions.k8s.io cluster-wide.
[2023-05-15 14:12:32,330] kopf._cogs.clients.w [DEBUG ] Starting the watch-stream for www.progressiverobot.com cluster-wide.
To mitigate the impact of gateway failures, it is advisable to have a standby gateway Droplet prepared for failover when required. Although true high availability (HA) is not supported by the operator at the moment, performing failover helps minimize the duration of service disruption.
Note: The following procedure assumes that all operator instances are up and running correctly at the time of the failover.
Suppose traffic to a designated destination IP address, 34.160.111.145, is routed through the active (primary) gateway, which has the IP address 10.116.0.4. This StaticRoute is stored in the primary.yaml file.
[label ./primary.yaml]
apiVersion: www.progressiverobot.com/v1
kind: StaticRoute
metadata:
  name: primary
spec:
  destinations:
    - "34.160.111.145"
  gateway: "10.116.0.4"
Additionally, you will have a standby (secondary) gateway with the IP address 10.116.0.12, ready to handle traffic for the same destination IP address. Its StaticRoute definition, stored in the secondary.yaml file, is identical to the primary one except for the gateway IP address and the object name.
[label ./secondary.yaml]
apiVersion: www.progressiverobot.com/v1
kind: StaticRoute
metadata:
  name: secondary
spec:
  destinations:
    - "34.160.111.145"
  gateway: "10.116.0.12"
The actual failover procedure then consists of the following steps:
- Identify that the active gateway with IP address 10.116.0.4 is failing.
- Delete the currently active StaticRoute.
- Apply the standby StaticRoute.
Delete the Active StaticRoute
Now let's delete the currently active StaticRoute.
kubectl delete -f primary.yaml
Wait 30 to 60 seconds to give each operator instance enough time to process the object deletion; that is, respond by removing the route from all nodes.
Apply the Standby StaticRoute
Let's make the secondary StaticRoute active.
kubectl apply -f secondary.yaml
The operator should pick up the new standby StaticRoute and add the corresponding routing table entries. The failover is then complete.
[warning] Note: Please avoid modifying an existing StaticRoute by directly updating the gateway IP address using commands like kubectl edit staticroute primary to modify only the spec.gateway field. This operation is currently unsupported and may result in failures.
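The manual failover steps above could be wrapped in a small shell function. The sketch below is only an illustration under stated assumptions: the function name failover and the KUBECTL and WAIT_SECONDS variables are hypothetical, and the 60-second default is taken from the upper end of the wait suggested above. It runs the same kubectl delete and kubectl apply commands shown in this section and stops if either command fails.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: delete the active StaticRoute, wait for the operator
# instances to remove the route, then apply the standby StaticRoute.

# Both variables are illustrative assumptions and can be overridden.
KUBECTL="${KUBECTL:-kubectl}"
WAIT_SECONDS="${WAIT_SECONDS:-60}"

# Usage: failover <active-manifest> <standby-manifest>
failover() {
  local active_manifest="$1"
  local standby_manifest="$2"

  echo "Deleting the active StaticRoute defined in ${active_manifest}"
  $KUBECTL delete -f "$active_manifest" || return 1

  # Give each operator instance time to remove the route from its node.
  echo "Waiting ${WAIT_SECONDS}s for the route to be removed from all nodes"
  sleep "$WAIT_SECONDS"

  echo "Applying the standby StaticRoute defined in ${standby_manifest}"
  $KUBECTL apply -f "$standby_manifest" || return 1
}
```

With the files from this section, the call would be failover primary.yaml secondary.yaml. Because the function deletes and re-creates the object rather than editing spec.gateway in place, it stays within the supported workflow described in the warning above.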
Testing the Setup
The sample manifests create static routes to two websites that report your public IP: ifconfig.me/ip and ipinfo.io/ip. A typical static route definition looks like this:
apiVersion: www.progressiverobot.com/v1
kind: StaticRoute
metadata:
  name: static-route-ifconfig.me
spec:
  destinations:
    - "34.160.111.145"
  gateway: "10.116.0.5"
To test the setup, download the sample manifests for ifconfig.me and ipinfo.io from the examples location:
curl -O https://www.progressiverobot.com/
curl -O https://www.progressiverobot.com/
After downloading the manifests, replace the <> placeholders in each manifest file. Then, apply each manifest using kubectl:
kubectl apply -f static-route-ifconfig.me.yaml
kubectl apply -f static-route-ipinfo.io.yaml
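Finally, check that requests from the curl-test pod report your NAT gateway's public IP for each route. These commands assume that a pod named curl-test with curl installed is already running in your cluster: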
kubectl exec -it curl-test -- curl ifconfig.me/ip
kubectl exec -it curl-test -- curl ipinfo.io/ip
Use the same test during failover testing. While the primary gateway Droplet is active, the result should be the NAT gateway public IP of the primary Droplet. After failing over to the secondary gateway Droplet, the result should be the NAT gateway public IP of the secondary Droplet.
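To make this comparison less error-prone, the check could be scripted. The following is a minimal sketch, not part of the operator: the function names egress_ip and verify_egress_ip and the KUBECTL variable are hypothetical, and the expected public IP is something you supply yourself. It reuses the kubectl exec and curl commands from this section, with curl's -s flag added to suppress progress output.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare the public IP reported from inside the
# curl-test pod with the NAT gateway public IP you expect.

KUBECTL="${KUBECTL:-kubectl}"   # illustrative override point

# Prints the public IP that the given endpoint sees for the curl-test pod.
egress_ip() {
  $KUBECTL exec curl-test -- curl -s "$1"
}

# Usage: verify_egress_ip <expected-public-ip> [endpoint]
# Returns 0 when the reported IP matches the expected one, 1 otherwise.
verify_egress_ip() {
  local expected="$1"
  local endpoint="${2:-ifconfig.me/ip}"
  local actual
  # Strip trailing newlines or other whitespace from the response.
  actual="$(egress_ip "$endpoint" | tr -d '[:space:]')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: ${endpoint} reports ${actual}"
  else
    echo "MISMATCH: ${endpoint} reports '${actual}', expected '${expected}'"
    return 1
  fi
}
```

Before failover you would pass the primary Droplet's public IP, and after failover the secondary Droplet's public IP, optionally with ipinfo.io/ip as the second argument.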
Troubleshooting
- Check the StaticRoute object: If an error occurs, first look for errors in the static route events for each node where the rule is applied.
kubectl get StaticRoute <static-route-name> -o yaml
- Check logs: To dig deeper, you can check for errors in the static route operator logs.
kubectl logs -f ds/k8s-staticroute-operator -n static-routes
Clean up
To remove the operator and associated resources, please run the following kubectl command (make sure you're using the same release version as in the install step):
kubectl delete -f https://www.progressiverobot.com/
Note: The above command also deletes the associated namespace (static-routes). Back up your StaticRoute custom resources first if you will need them later.
The output looks similar to:
customresourcedefinition.apiextensions.k8s.io "www.progressiverobot.com" deleted
serviceaccount "k8s-staticroute-operator" deleted
clusterrole.rbac.authorization.k8s.io "k8s-staticroute-operator" deleted
clusterrolebinding.rbac.authorization.k8s.io "k8s-staticroute-operator" deleted
daemonset.apps "k8s-staticroute-operator" deleted
If you now run the same curl commands, the output is the worker node's public IP:
kubectl exec -it curl-test -- curl ifconfig.me/ip
kubectl exec -it curl-test -- curl ipinfo.io/ip
To confirm, compare that output with the worker nodes' public IPs:
kubectl get nodes -o wide
Conclusion
Implementing failover capabilities, even if true high availability (HA) is not fully supported, is a recommended approach to minimize the impact of gateway failures.
Organizations can significantly reduce the duration of service disruptions by having a standby gateway ready for failover when needed.
It is important to prepare a standby gateway Droplet and ensure a smooth transition when failing over. While the implementation may vary depending on specific requirements, prioritizing failover readiness can contribute to maintaining reliable and uninterrupted service delivery.
Next Steps
You can refer to our documentation to configure a Droplet as a gateway.
Our official Managed Kubernetes product documentation provides more information on Managed Kubernetes and its features.
You can contact our sales team to migrate to the cloud provider or talk to our Solution Engineers.