Changelog for Kubernetes 1.20

Versions

  • Kubernetes 1.20.7
  • Nginx-ingress: 0.46.0
  • Certmanager: 1.3.1

Major changes

  • The RBAC API rbac.authorization.k8s.io/v1alpha1 has been removed. Use rbac.authorization.k8s.io/v1 instead; see the example after this list.
  • We no longer support creating new clusters with Pod Security Policy enabled. We recommend OPA Gatekeeper instead; if you have any questions regarding this, contact our support and we will help you out.
  • The built-in Cinder Volume Provider has gone from deprecated to disabled. Any volumes that are still using it will have to be migrated, see Known Issues.
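
As an illustration of the RBAC change, existing Role/ClusterRole and RoleBinding/ClusterRoleBinding objects only need their apiVersion field updated; the Role below is a hypothetical example, not something shipped with the cluster:

apiVersion: rbac.authorization.k8s.io/v1   # was: rbac.authorization.k8s.io/v1alpha1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]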

Deprecations

  • The Ingress API extensions/v1beta1 will be removed in Kubernetes 1.22.
  • Kubernetes beta labels on nodes are deprecated and will be removed in a future release. The list below shows which label replaces each old one (see the example after this list):
    • beta.kubernetes.io/instance-type -> node.kubernetes.io/instance-type
    • beta.kubernetes.io/arch -> kubernetes.io/arch
    • beta.kubernetes.io/os -> kubernetes.io/os
    • failure-domain.beta.kubernetes.io/region -> topology.kubernetes.io/region
    • failure-domain.beta.kubernetes.io/zone -> topology.kubernetes.io/zone
  • Certmanager api v1alpha2, v1alpha3 and v1beta1 will be removed in a future release. We strongly recommend that you upgrade to the new v1 api.
  • RBAC api rbac.authorization.k8s.io/v1beta1 will be removed in an upcoming release. The apis are replaced with rbac.authorization.k8s.io/v1.
  • Pod Security Policies will be removed in Kubernetes 1.25 in all clusters having the feature enabled. Instead we recommend OPA Gatekeeper.
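
For example, a workload that pins Pods to a zone via a nodeSelector only needs the label key updated; the fragment below is a hypothetical example and my-zone is a placeholder:

# old
nodeSelector:
  failure-domain.beta.kubernetes.io/zone: my-zone

# new
nodeSelector:
  topology.kubernetes.io/zone: my-zone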

Is downtime expected?

The upgrade drains one node at a time (moving all workloads off it), patches that node, and brings it back into the cluster. Only after all Deployments and StatefulSets are running again do we continue with the next node.
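
If brief disruption of individual Pods during a drain is a concern, a PodDisruptionBudget can keep a minimum number of replicas available while nodes are patched. A minimal sketch, assuming a Deployment labelled app: my-app (the name and threshold are placeholders; policy/v1beta1 is used since policy/v1 is only available from Kubernetes 1.21):

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app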

Known issues

Custom node taints and labels lost during upgrade

All custom taints and labels on nodes are lost during upgrade.

Custom changes to non-customer security groups will be lost

All changes to security groups not suffixed with “-customer” will be lost during the upgrade.

Snapshots are not working

There is currently a limitation in the snapshot controller: it is not topology aware.

Volumes using built-in Cinder Volume Provider will be converted

During the upgrade to 1.20, Elastx staff will convert any volumes still managed by the built-in Cinder Volume Provider. No action is needed on the customer side, but the conversion will produce events, and possibly log entries, that may raise concern.

To get a list of Persistent Volumes that are affected you can run this command before the upgrade:

$ kubectl get pv -o json | jq -r '.items[] | select (.spec.cinder != null) | .metadata.name'

Volumes that have been converted will show an event on the Persistent Volume Claim object claiming that data has been lost. This is a false alarm: the underlying Persistent Volume was only disconnected for a brief moment while it was being attached to the new CSI-based Cinder Volume Provider.
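
To inspect those events for a specific claim after the upgrade, something along these lines can be used (the claim name and namespace are placeholders):

    kubectl describe pvc my-claim -n my-namespace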

Bitnami (and possibly other) images and runAsGroup

Some Bitnami images silently assume they are run with the equivalent of runAsGroup: 0, which was the Kubernetes default until 1.20.x. The result is strange-looking permission errors on startup, which can cause workloads to fail.

At least the Bitnami PostgreSQL and RabbitMQ images have been confirmed as having these issues.

To find out if there are problematic workloads in your cluster you can run the following command:

    kubectl get pods -A -o yaml | grep image: | sort | uniq | grep bitnami

If any images turn up, there may be issues. NB: other images may have been built using Bitnami images as a base; these will not show up using the above command.
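
If you also want to see which Pods the images belong to, a variation like the one below lists namespace, Pod name and container images (a convenience sketch only; it does not cover initContainers):

    kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' | grep bitnami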

Solution without PSP

On clusters not running PSP, it should suffice to add:

    runAsGroup: 0

to the securityContext of the affected containers.
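
For example, a container spec fragment could look like this (the container name and image are placeholders):

containers:
- name: postgresql
  image: bitnami/postgresql
  securityContext:
    runAsGroup: 0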

Solution with PSP

On clusters running PSP, some more actions need to be taken. The restricted PSP forbids running as group 0, so a new one needs to be created, such as:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default,runtime/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: restricted-runasgroup0
spec:
  allowPrivilegeEscalation: false
  fsGroup:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  requiredDropCapabilities:
  - ALL
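  # Unlike the default restricted PSP, group ID 0 is allowed here so affected containers can set runAsGroup: 0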
  runAsGroup:
    ranges:
    - max: 65535
      min: 0
    rule: MustRunAs
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    ranges:
    - max: 65535
      min: 1
    rule: MustRunAs
  volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim

Furthermore, a ClusterRole allowing the use of said PSP is needed:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: psp:restricted-runasgroup0
rules:
- apiGroups:
  - policy
  resourceNames:
  - restricted-runasgroup0
  resources:
  - podsecuritypolicies
  verbs:
  - use

And finally you need to bind the ServiceAccounts that need to run as group 0 to the ClusterRole with a ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp:restricted-runasgroup0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp:restricted-runasgroup0
subjects:
- kind: ServiceAccount
  name: default
  namespace: keycloak
- kind: ServiceAccount
  name: XXX
  namespace: YYY

Then it's just a matter of adding:

runAsGroup: 0

to the securityContext of the affected containers.
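
To double-check that a given ServiceAccount is allowed to use the new PSP, kubectl auth can-i may be helpful (the ServiceAccount and namespace placeholders match the ClusterRoleBinding above):

    kubectl auth can-i use podsecuritypolicy/restricted-runasgroup0 --as=system:serviceaccount:YYY:XXX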
