Guides
- 1: Auto Healing
- 2: Auto Scaling
- 3: Cert-manager and Cloudflare demo
- 4: Change PV StorageClass
- 5: Ingress and cert-manager
- 6: Install and upgrade cert-manager
- 7: Install and upgrade ingress-nginx
- 8: Load balancers
- 9: Persistent volumes
- 10: Kubernetes API whitelist
1 - Auto Healing
In our Kubernetes Services, we have implemented a robust auto-healing mechanism to ensure the high availability and reliability of our infrastructure. This system is designed to automatically manage and replace unhealthy nodes, thereby minimizing downtime and maintaining the stability of our services.
Auto-Healing Mechanism
Triggers
-
Unready Node Detection:
- The auto-healing process is triggered when a node remains in a “not ready” or “unknown” state for 15 minutes.
- This delay allows for transient issues to resolve themselves without unnecessary node replacements.
-
Node Creation Failure:
- To ensure new nodes are given adequate time to initialize and join the cluster, we have configured startup timers:
- Control Plane Nodes:
- A new control plane node has a maximum startup time of 30 minutes. This extended period accounts for the critical nature and complexity of control plane operations.
- Worker Nodes:
- A new worker node has a maximum startup time of 10 minutes, reflecting the relatively simpler setup process compared to control plane nodes.
Actions
- Unresponsive Node:
- Once a node is identified as unready for the specified duration, the auto-healing system deletes the old node.
- Simultaneously, it initiates the creation of a new node to take its place, ensuring the cluster remains properly sized and functional.
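If you want to see what the auto-healer sees, you can inspect the Ready condition of your nodes yourself. A minimal sketch, assuming kubectl access to the cluster:

```shell
# List each node together with its Ready condition status:
# "True" means ready, "False" or "Unknown" are candidates for auto-healing.
kubectl get nodes -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status'
```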
Built-in Failsafe
To prevent cascading failures and to handle scenarios where multiple nodes become unresponsive, we have a built-in failsafe mechanism:
- Threshold for Unresponsive Nodes:
- If more than 35% of the nodes in the cluster become unresponsive simultaneously, the failsafe activates.
- This failsafe blocks any further changes, as such a widespread issue likely indicates a broader underlying problem, such as network or platform-related issues, rather than isolated node failures.
By integrating these features, our Kubernetes Services can automatically handle node failures and maintain high availability, while also providing safeguards against systemic issues. This auto-healing capability ensures that our infrastructure remains resilient, responsive, and capable of supporting continuous service delivery.
2 - Auto Scaling
We now offer autoscaling of nodes.
What is a nodegroup?
To simplify node management we now have nodegroups.
A nodegroup is a set of nodes. Nodegroups span all three of our availability zones, and all nodes in a nodegroup use the same flavor. This means that if you want to mix flavors in your cluster there will be at least one nodegroup per flavor. We can also create custom nodegroups upon request, meaning you can have two nodegroups with the same flavor.
By default clusters are created with one nodegroup called “worker”.
When listing nodes by running kubectl get nodes you can see the nodegroup by looking at the node name. All node names begin with <clustername>-<nodegroup>.
In the example below we have the cluster hux-lab1 and can see the default workers are located in the nodegroup worker and additionally, the added nodegroup nodegroup2 with a few extra nodes.
❯ kubectl get nodes
NAME STATUS ROLES AGE VERSION
hux-lab1-control-plane-c9bmm Ready control-plane 2d18h v1.27.3
hux-lab1-control-plane-j5p42 Ready control-plane 2d18h v1.27.3
hux-lab1-control-plane-wlwr8 Ready control-plane 2d18h v1.27.3
hux-lab1-worker-447sn Ready <none> 2d18h v1.27.3
hux-lab1-worker-9ltbp Ready <none> 2d18h v1.27.3
hux-lab1-worker-htfbp Ready <none> 15h v1.27.3
hux-lab1-worker-k56hn Ready <none> 16h v1.27.3
hux-lab1-nodegroup2-33hbp Ready <none> 15h v1.27.3
hux-lab1-nodegroup2-54j5k Ready <none> 16h v1.27.3
How to activate autoscaling?
Autoscaling currently needs to be configured by Elastx support.
In order to activate auto scaling we need to know the cluster name and nodegroup, along with two values for the minimum and maximum number of desired nodes. Currently the minimum is set to 3 nodes, however this is subject to change in the future.
Nodes are split into availability zones meaning if you want 3 nodes you get one in each availability zone.
Another example is to have a minimum of 3 nodes and a maximum of 7. This would translate to a minimum of one node per availability zone, and a maximum of 3 nodes in STO1 and 2 nodes in STO2 and STO3 respectively. To keep it simple we recommend using increments of 3.
If you are unsure, contact our support and we will help you get the configuration you wish for.
How does autoscaling know when to add additional nodes?
Nodes are added when they are needed. There are two scenarios:
- You have a pod that fails to be scheduled on existing nodes
- Scheduled pods request more than 100% of any resource. This method is smart: it senses the amount of resources per node and can therefore add more than one node at a time if required.
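As an illustration of the first scenario, a deployment whose per-pod requests exceed the free capacity of the existing nodes will leave pods Pending, which triggers a scale-up. All names and numbers below are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busy-app           # hypothetical workload
spec:
  replicas: 6
  selector:
    matchLabels:
      app: busy-app
  template:
    metadata:
      labels:
        app: busy-app
    spec:
      containers:
      - name: busy-app
        image: nginx
        resources:
          requests:
            cpu: "2"       # if no node has 2 free cores, pods stay Pending
            memory: 1Gi    # and the autoscaler adds nodes to the nodegroup
```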
When does the autoscaler scale down nodes?
The autoscaler removes nodes when it senses there are enough free resources to accommodate all current workloads (based on requests) on fewer nodes. To avoid all nodes having 100% resource requests (and thereby usage), there is also a built-in mechanism to ensure there is always at least 50% of a node's resources available to accept additional requests.
This means that if you have a nodegroup with 3 nodes that each have 4 CPU cores, you need a total of 2 CPU cores that are not requested by any workload.
To refrain from triggering the auto-scaling feature excessively, there is a built-in delay of 10 minutes before scale-down actions occur. Scale-up events are triggered immediately.
Can I disable auto scaling after activating it?
Yes, just contact Elastx support and we will help you with this.
When disabling auto scaling the node count will be locked. Contact support if the number of nodes you wish to keep deviates from the current amount of nodes, and we will scale it for you.
3 - Cert-manager and Cloudflare demo
In this guide we will use a Cloudflare-managed domain and our own cert-manager to provide Let's Encrypt certificates for a test deployment.
The guide is suitable if you have a domain connected to a single cluster and would like to issue/manage certificates from within Kubernetes. The setup below is cluster-wide, meaning it can deploy certificates to any namespace specified.
Prerequisites
- DNS managed on Cloudflare
- Cloudflare API token
- Installed cert-manager. See our guide here.
- Installed IngressController. See our guide here.
Setup ClusterIssuer
Create a file to hold the secret of your api token for your Cloudflare DNS. Then create the ClusterIssuer configuration file adapted for Cloudflare.
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token
  namespace: cert-manager
type: Opaque
stringData:
  api-token: "<your api token>"
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: cloudflare-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <your email>
    privateKeySecretRef:
      name: cloudflare-issuer-key
    solvers:
    - dns01:
        cloudflare:
          email: <your email>
          apiTokenSecretRef:
            name: cloudflare-api-token
            key: api-token
kubectl apply -f cloudflare-issuer.yml
The ClusterIssuer should become ready shortly. Example output:
kubectl get clusterissuers.cert-manager.io
NAME READY AGE
cloudflare-issuer True 6d18h
Expose a workload and secure it with a Let's Encrypt certificate
In this section we will set up a deployment with its accompanying service and ingress object. The ingress object will request a certificate for test2.domain.ltd and, once fully up and running, should provide https://test2.domain.ltd with a valid Let's Encrypt certificate.
We'll use the created ClusterIssuer and let cert-manager request new certificates for any added ingress object. This setup requires the “*” record to be set up in the DNS provider.
This is how the DNS is setup in this particular example:
An A record (“domain.ltd”) points to the load balancer IP of the cluster.
A CNAME record (“*”) points to the A record above.
This example also specifies the namespace “echo2”.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo2-dep
  namespace: echo2
spec:
  selector:
    matchLabels:
      app: echo2
  replicas: 1
  template:
    metadata:
      labels:
        app: echo2
    spec:
      securityContext:
        runAsUser: 1001
        fsGroup: 1001
      containers:
      - name: echo2
        image: hashicorp/http-echo
        args:
        - "-text=echo2"
        ports:
        - containerPort: 5678
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: echo2
  name: echo2-service
  namespace: echo2
spec:
  ports:
  - protocol: TCP
    port: 5678
    targetPort: 5678
  selector:
    app: echo2
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo2-ingress
  namespace: echo2
  annotations:
    cert-manager.io/cluster-issuer: cloudflare-issuer
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - test2.domain.ltd
    secretName: test2-domain-tls
  rules:
  - host: test2.domain.ltd
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: echo2-service
            port:
              number: 5678
The DNS challenge and certificate issue process takes a couple of minutes. You can follow the progress by watching:
kubectl events -n cert-manager
Once completed, it should all be accessible at https://test2.domain.ltd
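You can also inspect the Certificate resource directly (the resource name comes from the secretName in the ingress above):

```shell
# List certificates in the namespace; READY turns True once issued
kubectl get certificate -n echo2
# Detailed status, including DNS-01 challenge progress
kubectl describe certificate test2-domain-tls -n echo2
```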
4 - Change PV StorageClass
This guide details all steps to change the storage class of a volume. The instruction can be used to migrate from one storage class to another while retaining data, for example from 8k to v2-4k.
Prerequisites
- Access to the Kubernetes cluster
- Access to the OpenStack project of the Kubernetes cluster
Preparation steps
-
Populate variables
Complete with relevant names for your setup. Then copy/paste them into the terminal to set them as environment variables that will be used throughout the guide. PVC is the name of the PersistentVolumeClaim to migrate, NAMESPACE its namespace, and NEWSTORAGECLASS the target storage class.
PVC=test1
NAMESPACE=default
NEWSTORAGECLASS=v2-1k
Fetch and populate the PV name by running:
PV=$(kubectl get pvc -n $NAMESPACE $PVC -o go-template='{{.spec.volumeName}}')
Create backup of PVC and PV configurations
Fetch the PVC and PV configurations and store in /tmp/ for later use:
kubectl get pvc -n $NAMESPACE $PVC -o yaml | tee /tmp/pvc.yaml
kubectl get pv $PV -o yaml | tee /tmp/pv.yaml
Change VolumeReclaimPolicy
To avoid deletion of the PV when deleting the PVC, the volume needs to have VolumeReclaimPolicy set to Retain.
Patch:
kubectl patch pv $PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
Stop pods from accessing the mounted volume (i.e. kill pods, scale down the StatefulSet, etc.).
-
Delete the PVC.
kubectl delete pvc -n "$NAMESPACE" "$PVC"
Log in to OpenStack
-
Navigate to: Volumes -> Volumes
-
Make a backup of the volume. From the drop-down to the right, select Backup. The backup is good practice; it is not used in the following steps.
-
Change the storage type to the desired type. The volume should now, or shortly, have the status Available. From the drop-down to the right, select Edit volume -> Change volume type:
- Select your desired storage type
- Select Migration policy=Ondemand
The window will close, and the volume will be updated and migrated (to the v2 storage platform) if necessary, by the backend. The status becomes “Volume retyping”. Wait until completed.
We have a complementary guide here.
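If you prefer the CLI over the web UI, the retype can also be done with the OpenStack client. A sketch, assuming the Cinder volume is named after the PV (the default for CSI-provisioned volumes) and the target type is v2-4k:

```shell
# Retype the volume to the new type; on-demand policy allows backend migration
openstack volume set --type v2-4k --retype-policy on-demand "$PV"
# Check progress; wait until the status is "available" again
openstack volume show "$PV" -c status
```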
Back to kubernetes
-
Release the tie between PVC and PV
The PV still references its old PVC via the claimRef, found under spec.claimRef.uid. This UID needs to be nullified to release the PV, allowing it to be adopted by a PVC with the correct storageClass. Patch the claimRef to null:
kubectl patch pv "$PV" -p '{"spec":{"claimRef":{"namespace":"'$NAMESPACE'","name":"'$PVC'","uid":null}}}'
The PV StorageClass in Kubernetes does not match its counterpart in OpenStack.
We need to patch the storageClassName reference in the PV:
kubectl patch pv "$PV" -p '{"spec":{"storageClassName":"'$NEWSTORAGECLASS'"}}'
Prepare a new PVC with the updated storageClass
We need to modify the saved /tmp/pvc.yaml.
-
Remove “last-applied-configuration”:
sed -i '/kubectl.kubernetes.io\/last-applied-configuration: |/ { N; d; }' /tmp/pvc.yaml
Update existing storageClassName to the new one:
sed -i 's/storageClassName: .*/storageClassName: '$NEWSTORAGECLASS'/g' /tmp/pvc.yaml
Apply the updated /tmp/pvc.yaml
kubectl apply -f /tmp/pvc.yaml
Update the PV to bind with the new PVC
We must allow the new PVC to bind correctly to the old PV. We need to first fetch the new PVC UID, then patch the PV with the PVC UID so kubernetes understands what PVC the PV belongs to.
-
Retrieve the new PVC UID:
PVCUID=$(kubectl get -n "$NAMESPACE" pvc "$PVC" -o custom-columns=UID:.metadata.uid --no-headers)
Patch the PV with the new UID of the PVC:
kubectl patch pv "$PV" -p '{"spec":{"claimRef":{"uid":"'$PVCUID'"}}}'
Reset the Reclaim Policy of the volume to Delete:
kubectl patch pv $PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
Completed.
- Verify that the volume is working as expected.
- Update your manifests to reflect the new storageClassName.
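To double-check the end result, the PV and PVC should both be Bound and show the new storage class:

```shell
# The PV should be Bound to the PVC and show the new storage class
kubectl get pv "$PV"
# The PVC should be Bound and usable by pods again
kubectl get pvc -n "$NAMESPACE" "$PVC"
```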
5 - Ingress and cert-manager
Follow along demo
In this piece, we show all steps to expose a web service using an Ingress resource. Additionally, we demonstrate how to enable TLS, by using cert-manager to request a Let’s Encrypt certificate.
Prerequisites
- A DNS record pointing at the public IP address of your worker nodes. In the examples all references to the domain example.ltd must be replaced by the domain you wish to issue certificates for. Configuring DNS is out of scope for this documentation.
- For clusters created on or after Kubernetes 1.26 you need to ensure there is an Ingress controller and cert-manager installed.
Create resources
Create a file called ingress.yaml with the following content:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-web-service
  name: my-web-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-web-service
  template:
    metadata:
      labels:
        app: my-web-service
    spec:
      securityContext:
        runAsUser: 1001
        fsGroup: 1001
      containers:
      - image: k8s.gcr.io/serve_hostname
        name: servehostname
        ports:
        - containerPort: 9376
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: my-web-service
  name: my-web-service
spec:
  ports:
  - port: 9376
    protocol: TCP
    targetPort: 9376
  selector:
    app: my-web-service
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-web-service-ingress
  annotations:
    cert-manager.io/issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - example.tld
    secretName: example-tld
  rules:
  - host: example.tld
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-web-service
            port:
              number: 9376
Then create the resources in the cluster by running:
kubectl apply -f ingress.yaml
Run kubectl get ingress and you should see output similar to this:
NAME CLASS HOSTS ADDRESS PORTS AGE
my-web-service-ingress nginx example.tld 91.197.41.241 80, 443 39s
If not, wait a while and try again. Once you see output similar to the above you should be able to reach your service at http://example.tld.
Exposing TCP services
If you wish to expose TCP services, note that the tcp-services ConfigMap is located in the default namespace in our clusters.
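As a sketch of what that looks like: the tcp-services ConfigMap maps an external port to a namespace/service:port target. The service and ports below are hypothetical:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: default   # in our clusters the ConfigMap lives in default
data:
  # <external-port>: "<namespace>/<service>:<service-port>"
  "5432": "my-namespace/my-postgres:5432"
```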
Enabling TLS
A simple way to enable TLS for your service is by requesting a certificate using the Let’s Encrypt CA. This only requires a few simple steps.
Begin by creating a file called issuer.yaml with the following content:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt ACME server for production certificates
    server: https://acme-v02.api.letsencrypt.org/directory
    # This email address will get notifications if failure to renew certificates happens
    email: valid-email@example.tld
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
Replace the email address with your own. Then create the Issuer in the cluster
by running:
kubectl apply -f issuer.yaml
Next edit the file called ingress.yaml from the previous example and make sure
the Ingress resource matches the example below:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-web-service-ingress
  annotations:
    cert-manager.io/issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - example.tld
    secretName: example-tld
  rules:
  - host: example.tld
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-web-service
            port:
              number: 9376
Make sure to replace all references to example.tld by your own domain. Then
update the resources by running:
kubectl apply -f ingress.yaml
Wait a couple of minutes and your service should be reachable at https://example.tld with a valid certificate.
Network policies
If you are using network policies you will need to add a NetworkPolicy that allows traffic from the ingress controller to the temporary pod that performs the HTTP challenge. With the default NGINX Ingress Controller provided by us, this policy should do the trick:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: letsencrypt-http-challenge
spec:
  policyTypes:
  - Ingress
  podSelector:
    matchLabels:
      acme.cert-manager.io/http01-solver: "true"
  ingress:
  - ports:
    - port: http
    from:
    - namespaceSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx
Advanced usage
For more advanced use cases please refer to the documentation provided by each project or contact our support.
6 - Install and upgrade cert-manager
Starting at Kubernetes version v1.26, our default configured clusters are delivered without cert-manager.
This guide will assist you in getting a working, up-to-date cert-manager and provide instructions for how to upgrade and delete it. Running your own is useful if you want to have full control.
The guide is based on the cert-manager Helm chart, found here. We take advantage of the option to install CRDs with kubectl, as recommended for a production setup.
Prerequisites
Helm needs to be provided with the correct repository:
-
Setup helm repo
helm repo add jetstack https://charts.jetstack.io --force-update
Verify you do not have a namespace named elx-cert-manager. If you do, you first need to remove some resources:
kubectl -n elx-cert-manager delete svc cert-manager cert-manager-webhook
kubectl -n elx-cert-manager delete deployments.apps cert-manager cert-manager-cainjector cert-manager-webhook
kubectl delete namespace elx-cert-manager
Install
-
Prepare and install the CRDs by running:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.4/cert-manager.crds.yaml
Run Helm install:
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.4
A full list of available Helm values is on cert-manager's ArtifactHub page.
-
Verify the installation. This is done with cmctl (the cert-manager CLI, see https://cert-manager.io/docs/reference/cmctl/#installation):
cmctl check api
If everything is working you should get this message:
The cert-manager API is ready.
Upgrade
The setup used above is referenced in the topic “CRDs managed separately”.
In these examples <version> is “v1.14.4”.
-
Update the CRDs:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/<version>/cert-manager.crds.yaml
Update the Helm chart:
helm upgrade cert-manager jetstack/cert-manager --namespace cert-manager --version v1.14.4
Uninstall
To uninstall, use the guide here.
7 - Install and upgrade ingress-nginx
This guide will assist you in getting a working, up-to-date ingress controller and provide instructions for how to upgrade and delete it. Running your own is useful if you want to have full control.
The guide is based on the ingress-nginx Helm chart, found here.
Prerequisites
Helm needs to be provided with the correct repository:
-
Setup helm repo
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
Make sure to update repo cache
helm repo update
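To see which chart versions are available before installing or upgrading, you can query the repository:

```shell
# List available ingress-nginx chart versions, newest first
helm search repo ingress-nginx/ingress-nginx --versions
```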
Generate values.yaml
We provide settings for two main scenarios of how clients connect to the cluster. The configuration file, values.yaml, must reflect the correct scenario.
-
Customer connects directly to the Ingress:
controller:
  kind: DaemonSet
  metrics:
    enabled: true
  service:
    enabled: true
    annotations:
      loadbalancer.openstack.org/proxy-protocol: "true"
  ingressClassResource:
    default: true
  publishService:
    enabled: false
  allowSnippetAnnotations: true
  config:
    use-proxy-protocol: "true"
defaultBackend:
  enabled: true
Customer connects via Proxy:
controller:
  kind: DaemonSet
  metrics:
    enabled: true
  service:
    enabled: true
    #loadBalancerSourceRanges:
    #- <Proxy(s)-CIDR>
  ingressClassResource:
    default: true
  publishService:
    enabled: false
  allowSnippetAnnotations: true
  config:
    use-forwarded-headers: "true"
defaultBackend:
  enabled: true
Other useful settings:
For a complete set of options see the upstream documentation here.
[...]
service:
  loadBalancerSourceRanges: # Whitelist source IPs.
  - 133.124.../32
  - 122.123.../24
  annotations:
    loadbalancer.openstack.org/keep-floatingip: "true" # retain floating IP in floating IP pool.
    loadbalancer.openstack.org/flavor-id: "v1-lb-2" # specify flavor.
[...]
Install ingress-nginx
Use the values.yaml generated in the previous step.
helm install ingress-nginx ingress-nginx/ingress-nginx --values values.yaml --namespace ingress-nginx --create-namespace
Example output:
NAME: ingress-nginx
LAST DEPLOYED: Tue Jul 18 11:26:17 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the Load Balancer IP to become available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w ingress-nginx-controller'
[..]
Upgrade ingress-nginx
Use the values.yaml generated in the previous step.
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx --values values.yaml --namespace ingress-nginx
Example output:
Release "ingress-nginx" has been upgraded. Happy Helming!
NAME: ingress-nginx
LAST DEPLOYED: Tue Jul 18 11:29:41 2023
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the Load Balancer IP to be available.
You can watch the status by running 'kubectl --namespace default get services -o wide -w ingress-nginx-controller'
[..]
Remove ingress-nginx
The best practice is to use the helm template method to remove the ingress. This allows for proper removal of lingering resources; afterwards, remove the namespace. Use the values.yaml generated in the previous step.
Note: Avoid running multiple ingress controllers using the same
IngressClass.
See more information here.
-
Run the delete command
helm template ingress-nginx ingress-nginx/ingress-nginx --values values.yaml --namespace ingress-nginx | kubectl delete -f -
Remove the namespace if necessary
kubectl delete namespace ingress-nginx
8 - Load balancers
Load balancers in our Elastx Kubernetes CaaS service are provided by OpenStack Octavia in collaboration with the Kubernetes Cloud Provider OpenStack. This article introduces some of the basics of how to use services of type LoadBalancer to expose services using OpenStack Octavia load balancers. For more advanced use cases you are encouraged to read the official documentation of each project or to contact our support for assistance.
A quick example
Exposing services using a service of type LoadBalancer will give you a unique public IP backed by an OpenStack Octavia load balancer. This example will take you through the steps for creating such a service.
Create the resources
Create a file called lb.yaml with the following content:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: echoserver
  name: echoserver
spec:
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: echoserver
  template:
    metadata:
      labels:
        app.kubernetes.io/name: echoserver
    spec:
      containers:
      - image: gcr.io/google-containers/echoserver:1.10
        name: echoserver
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: echoserver
  name: echoserver
  annotations:
    loadbalancer.openstack.org/x-forwarded-for: "true"
    loadbalancer.openstack.org/flavor-id: 552c16df-dcc1-473d-8683-65e37e094443
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
    name: http
  selector:
    app.kubernetes.io/name: echoserver
  type: LoadBalancer
Then create the resources in the cluster by running:
kubectl apply -f lb.yaml
You can watch the load balancer being created by running:
kubectl get svc
This should output something like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
echoserver LoadBalancer 10.233.32.83 <pending> 80:30838/TCP 6s
kubernetes ClusterIP 10.233.0.1 <none> 443/TCP 10h
The EXTERNAL-IP is shown as <pending> while the load balancer is being provisioned.
We can investigate further by running:
kubectl describe svc echoserver
Output should look something like this:
Name: echoserver
Namespace: default
Labels: app.kubernetes.io/name=echoserver
Annotations: loadbalancer.openstack.org/x-forwarded-for: true
Selector: app.kubernetes.io/name=echoserver
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.233.32.83
IPs: 10.233.32.83
Port: <unset> 80/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30838/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 115s service-controller Ensuring load balancer
Looking at the Events section near the bottom we can see that the Cloud Controller has picked up the order and is provisioning a load balancer.
Running the same command again (kubectl describe svc echoserver) after waiting
some time should produce output like:
Name: echoserver
Namespace: default
Labels: app.kubernetes.io/name=echoserver
Annotations: loadbalancer.openstack.org/x-forwarded-for: true
Selector: app.kubernetes.io/name=echoserver
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.233.32.83
IPs: 10.233.32.83
LoadBalancer Ingress: 91.197.41.223
Port: <unset> 80/TCP
TargetPort: 8080/TCP
NodePort: <unset> 30838/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 8m52s service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 6m43s service-controller Ensured load balancer
Again looking at the Events section we can tell that the Cloud Provider has provisioned the load balancer for us (the EnsuredLoadBalancer event). Furthermore we can see the public IP address associated with the service by checking the LoadBalancer Ingress.
Finally to verify that the load balancer and service are operational run:
curl http://<IP address from LoadBalancer Ingress>
Your output should look something like:
Hostname: echoserver-84655f4656-sc4k6
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=10.128.0.3
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://91.197.41.223:8080/
Request Headers:
accept=*/*
host=91.197.41.223
user-agent=curl/7.68.0
x-forwarded-for=213.179.7.4
Request Body:
-no body in request-
Things to note:
- You do not need to modify security groups when exposing services using load balancers.
- The client_address is the address of the load balancer and not the client making the request; you can find the real client address in the x-forwarded-for header.
- The x-forwarded-for header is provided by setting the annotation loadbalancer.openstack.org/x-forwarded-for: "true" on the service. Read more about available annotations in the Advanced usage section.
Advanced usage
For more advanced use cases please refer to the documentation provided by each project or contact our support.
Good to know
Load balancers are billable resources
Adding services of type LoadBalancer will create load balancers in OpenStack, which are billable resources that you will be charged for.
Loadbalancer statuses
Load balancers within OpenStack have two distinct statuses, which may cause confusion regarding their meanings:
- Provisioning Status: This status reflects the overall condition of the load balancer itself. If any issues arise with the load balancer, this status will indicate them. Should you encounter any problems with this status, please don’t hesitate to contact Elastx support for assistance.
- Operating Status: This status indicates the health of the configured backends, typically referring to the nodes within your cluster, especially when health checks are enabled (which is the default setting). It's important to note that a degraded operating status doesn't necessarily imply a problem, as it depends on your specific configuration. If a service is only exposed on a single node, for instance, this is to be expected, since load balancers by default distribute traffic across all cluster nodes.
Provisioning status codes
| Code | Description |
|---|---|
| ACTIVE | The entity was provisioned successfully |
| DELETED | The entity has been successfully deleted |
| ERROR | Provisioning failed |
| PENDING_CREATE | The entity is being created |
| PENDING_UPDATE | The entity is being updated |
| PENDING_DELETE | The entity is being deleted |
Operating status codes
| Code | Description |
|---|---|
| ONLINE | - Entity is operating normally - All pool members are healthy |
| DRAINING | The member is not accepting new connections |
| OFFLINE | Entity is administratively disabled |
| DEGRADED | One or more of the entity’s components are in ERROR |
| ERROR | - The entity has failed - The member is failing its health monitoring checks - All of the pool members are in ERROR |
| NO_MONITOR | No health monitor is configured for this entity and its status is unknown |
High availability properties
OpenStack Octavia load balancers are placed in two of our three availability zones. This is a limitation imposed by the OpenStack Octavia project.
Reconfiguring using annotations
Reconfiguring the load balancers using annotations is not as dynamic and smooth as one would hope. For now, to change the configuration of a load balancer the service needs to be deleted and a new one created.
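In practice this means deleting the service and re-creating it with the updated annotations. A sketch, assuming your manifest is in lb.yaml:

```shell
# Delete the old service (and its load balancer), then re-create it
# with the updated annotations. A new public IP will be assigned.
kubectl delete -f lb.yaml
kubectl apply -f lb.yaml
```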
Loadbalancer protocols
Load balancers have support for multiple protocols. In general we recommend avoiding http and https, simply because they do not perform as well as the other protocols.
Instead, use tcp or HAProxy's PROXY protocol and run an ingress controller that is responsible for proxying and TLS termination within the cluster.
Load Balancer Flavors
Load balancers come in multiple flavors. The biggest difference is how much traffic they can handle. If no flavor is specified, we default to v1-lb-1. However, this flavor can only push around 200 Mbit/s. For customers who need more throughput, we have a couple of flavors to choose from:
| ID | Name | Specs | Approx Traffic |
|---|---|---|---|
| 16cce6f9-9120-4199-8f0a-8a76c21a8536 | v1-lb-1 | 1G, 1 CPU | 200 Mbit/s |
| 48ba211c-20f1-4098-9216-d28f3716a305 | v1-lb-2 | 1G, 2 CPU | 400 Mbit/s |
| b4a85cd7-abe0-41aa-9928-d15b69770fd4 | v1-lb-4 | 2G, 4 CPU | 800 Mbit/s |
| 1161b39a-a947-4af4-9bda-73b341e1ef47 | v1-lb-8 | 4G, 8 CPU | 1600 Mbit/s |
To select a flavor for your Load Balancer, add the following to the Kubernetes Service .metadata.annotations:
loadbalancer.openstack.org/flavor-id: <id-of-your-flavor>
Note that this is a destructive operation when modifying an existing Service; it will remove the current Load Balancer and create a new one (with a new public IP).
Full example configuration for a basic LoadBalancer service:
apiVersion: v1
kind: Service
metadata:
  annotations:
    loadbalancer.openstack.org/flavor-id: b4a85cd7-abe0-41aa-9928-d15b69770fd4
  name: my-loadbalancer
spec:
  ports:
  - name: http-80
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app: my-application
  type: LoadBalancer
9 - Persistent volumes
Persistent volumes in our Elastx Kubernetes CaaS service are provided by OpenStack Cinder. Volumes are dynamically provisioned by Kubernetes Cloud Provider OpenStack.
Storage classes
The storage class name indicates the number of IOPS; for example, 8k refers to 8000 IOPS.
See our pricing page under the table Storage to calculate your costs.
Below is the list of storage classes provided in newly created clusters. In case you see other storageclasses in your cluster, consider these legacy and please migrate data away from them. We provide a guide to Change PV StorageClass.
$ kubectl get storageclasses
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
v2-128k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-16k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-1k (default) cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-32k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-4k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-64k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
v2-8k cinder.csi.openstack.org Delete WaitForFirstConsumer true 27d
Example of PersistentVolumeClaim
A quick example of how to create an unused 1Gi persistent volume claim named example:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: example
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: v2-16k
$ kubectl get persistentvolumeclaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
example Bound pvc-f8b1dc7f-db84-11e8-bda5-fa163e3803b4 1Gi RWO v2-16k 18s
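To actually consume the claim, reference it from a pod spec. A minimal hypothetical example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod       # hypothetical pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: /data    # where the volume appears in the container
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: example  # the PVC created above
```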
Good to know
Cross mounting of volumes between nodes
Cross mounting of volumes is not supported! That is, a volume can only be mounted by a node residing in the same availability zone as the volume. Plan accordingly to ensure high availability!
Limit of volumes and pods per node
If a higher number of volumes or pods is required, consider adding additional worker nodes.
| Kubernetes version | Max pods/node | Max volumes/node |
|---|---|---|
| v1.25 and lower | 110 | 25 |
| v1.26 and higher | 110 | 125 |
Encryption
All volumes are encrypted at rest in hardware.
Volume type hostPath
A volume of type hostPath is in reality just a local directory on the specific node, mounted into a pod. This means data is stored locally and will be unavailable if the pod is ever rescheduled on another node. This is expected during cluster upgrades or maintenance, but it may also occur for other reasons, for example if a pod crashes or a node is malfunctioning. Malfunctioning nodes are automatically healed, meaning they are automatically replaced.
You can read more about hostpath here.
If you are looking for a way to store persistent data we recommend using PVCs. PVCs can move between nodes within one datacenter, meaning any data stored will be present even if the pod or node is recreated.
Known issues
Resizing encrypted volumes
Legacy: encrypted volumes do not resize properly; please contact our support if you wish to resize such a volume.
10 - Kubernetes API whitelist
In our Kubernetes Services, we rely on OpenStack load balancers in front of the control planes to ensure traffic is sent to a functional node. Whitelisting of access to the API server is controlled in the load balancer in front of the API. Currently, managing the IP-range whitelist requires a support ticket here.
Please submit a ticket with the CIDRs/ranges for the IPs you wish to whitelist. We are happy to help you ASAP.
Note: All Elastx IP ranges are always included.
In the future, we expect to have this functionality available self-service style.