Vertical Pod Autoscaling in Kubernetes

Louise | 24 June 2019

In our last blog on autoscaling, we started off by looking at horizontal auto-scaling of Kubernetes pods and how we can allow HPAs to ingest metrics from Prometheus.

In times where additional capacity is needed, horizontal scaling gives us additional copies of the same computational unit. Instead of allowing a single unit to handle more requests, the load is reduced per unit as requests are distributed across a larger set.

When one first thinks about what vertical autoscaling might mean, one would assume that a vertical pod autoscaler would be analogous to vertical scaling of a host machine or VM – in other words, increasing the amount of resource available to that machine. If a VM has 4GB of memory and is using 3.8GB, then make an additional 2GB available to that machine.

This might make sense in, for example, the vSphere world where we can set resource pools for VMs - but in the Kubernetes world, this doesn’t quite make sense.

After all, Kubernetes is only a scheduler that sits on top of a host. The Kubelet determines and reports the amount of resource a host has installed, and using these values and the reported resource required by running workloads, the scheduler determines if a workload can fit onto a node.

So what form could vertical pod autoscaling take in theory? Well, since the scheduler can only work effectively if workloads report requests and limits, setting requests that are true to real-life application usage allows the scheduler to guarantee that amount of resource, from the pool available in the cluster, for that application on a specific node. This may prevent workloads from scheduling if there isn’t a hole on any of the cluster nodes big enough to fit that application, but again, this guarantees that we can run existing workloads on the cluster without overwhelming available resource and bringing nodes down.
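
As a concrete illustration, here’s a minimal sketch of a pod spec with requests set (the image, names and values are illustrative): the scheduler will only place this pod on a node with that much unreserved capacity, and once placed, the pod is guaranteed it.

apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: app
    image: nginx  # illustrative image
    resources:
      requests:
        cpu: 250m      # scheduler reserves a quarter of a core on the chosen node
        memory: 256Mi  # pod stays Pending if no node has 256Mi unreserved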

While it’s possible to have VPA and HPA target the same workload, because VPA works exclusively with CPU and memory resources, you shouldn’t use those same metrics for horizontal scaling. Since the VPA and HPA controllers are not aware of each other (at the moment), both controllers may try to apply incompatible changes to the workload. VPA can be complementary to horizontal autoscaling, but you must assign non-computational metrics to the HPA.
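
For example, an HPA driven by a custom metric can coexist with a VPA managing the same deployment’s requests. A sketch, assuming a custom metrics adapter (such as the Prometheus adapter from our last post) is serving a hypothetical http_requests_per_second metric:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: dev
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second  # hypothetical custom metric
      target:
        type: AverageValue
        averageValue: "100"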

With that out of the way, let’s take a dive into how the VPA works.

VPA Modes

VPA has various operational modes to suit how aggressively you would like pods to be updated with new request values:

Auto

Will assign request values both at pod startup and while the pod is live, using the specified update mechanism. At the moment, this is equivalent to Recreate, as there isn’t currently an “in-place” mechanism for updating request values on live pods.

Recreate

Will assign request values at pod startup and, if the recommended values differ significantly from the current request values, evict the pod so that a new pod is created with the updated requests.

Initial

Will only assign request values when the pod is initially created.

Off

The VPA will continue to generate recommended request values for pods but will defer the application of these values to the cluster operator.
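
If you just want recommendations without any disruption, Off mode is the place to start. A minimal sketch (note that “Off” should be quoted, otherwise YAML parses it as the boolean false):

apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: test-vpa
  namespace: dev
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  updatePolicy:
    updateMode: "Off"  # recommendations are generated but never applied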

Components of VPA

Unlike the HPA controller, the components which realise vertical pod autoscaling aren’t installed in Kube by default, so we will need to install them ourselves. There are three controllers that implement vertical pod autoscaling:

Recommender

The initial task required for autoscaling is to ingest metrics and determine the current usage of the workload. Based on current and past metrics for resources, the recommender will determine a “recommended” set of CPU and memory values for each container.

By default, the metrics source will be metrics-server. As metrics-server is designed to store metrics in-memory, it only provides metrics for the last 10 minutes. To provide the recommender with a broader history of the service it is monitoring, you can also plug in historical metrics from a time-series database like Prometheus.
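
Wiring the recommender up to Prometheus is done with arguments on the recommender deployment. A sketch of the relevant excerpt of the container spec; the flag names below are those exposed by the recommender at the time of writing, so check your version’s --help output:

containers:
- name: recommender
  image: k8s.gcr.io/vpa-recommender:0.5.0
  args:
  - --storage=prometheus
  - --prometheus-address=http://prometheus.monitoring.svc:9090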

Updater

As detection of what the “correct” requests should be is delegated to the recommender, the updater controller compares the current and recommended requests of each deployment targeted by a VerticalPodAutoscaler object. If the update mode of the deployment’s corresponding VPA object is set to Auto or Recreate, the updater facilitates the creation of new pods that contain the new recommended requests. It doesn’t directly update pods with new values; instead it instructs the Kube API that a particular pod should be evicted from the cluster, and relies on other controllers in the Kube control plane to take care of creating the new pod and making sure it has the desired request values.
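
Because the updater works by eviction, its activity shows up as ordinary Kube events (the EvictedByVPA reason also appears in the updater logs later in this post), so you can audit it with a field selector in the workload’s namespace:

kubectl get events -n dev --field-selector reason=EvictedByVPA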

Admission Controller

If you’ve already brushed up on what admission controllers are and their purpose, it won’t surprise you that the VPA also includes a mutating admission webhook. If a VPA object’s mode is set to Auto, Recreate or Initial, this webhook will inject the current request values generated by the recommender at the time a pod is admitted to the cluster.
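
The webhook registers itself with the API server at startup, so once the admission controller is running you should be able to see it listed:

kubectl get mutatingwebhookconfigurations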

VPA Object

Let’s take a look at a VPA object:


apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  annotations:
  name: test-vpa
  namespace: dev
spec:
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      maxAllowed:
        memory: 1Gi
      minAllowed:
        memory: 500Mi
  targetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: test
  updatePolicy:
    updateMode: Recreate

  

There are a couple of interesting things here.

Using resourcePolicy, we can put boundaries on how widely the recommender can vary the CPU and memory resources for the pod by assigning minimum and maximum values allowable by the recommender. How these resources should be bounded depends on how many containers are running in a single pod.

If a pod only has a single container, then setting a wildcard value using containerName: '*' should be fine. If you want to apply this to pods with sidecars, then, of course, you will need to set boundary values on a per-container basis. It doesn’t make sense to apply a wildcard value, as this will apply the same memory bounds to every container in the pod - and I’m sure you don’t need 500-1000MB for every single container in your pod!
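
A sketch of per-container policies for a pod with a sidecar (the container names here are hypothetical; v1beta2 also supports a per-container mode of "Off" to exclude a container from autoscaling entirely, if you don’t want the recommender touching it at all):

apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: test-vpa
  namespace: dev
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  resourcePolicy:
    containerPolicies:
    - containerName: app            # hypothetical main container
      minAllowed:
        memory: 500Mi
      maxAllowed:
        memory: 1Gi
    - containerName: log-forwarder  # hypothetical sidecar
      mode: "Off"                   # leave the sidecar's requests untouched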

If the recommender controller is able to ingest metrics, at roughly five-minute intervals it will generate and then write recommendations into the status block of the VPA object. Four different bounds are generated:

  • lowerBound: the minimum CPU and memory requests for a container. It’s not recommended to use this as a baseline for requests

  • target: the baseline recommended CPU and memory requests for that container

  • upperBound: the maximum recommended CPU and memory requests.

  • uncappedTarget: the recommended CPU and memory requests, but without taking the restrictions defined in ContainerResourcePolicy into consideration.

For example:


status:
  conditions:
  - lastTransitionTime: 2019-06-12T14:29:00Z
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: test
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 644m
        memory: 1Gi

  

Installing

In order to use VPA, you should be running a version of Kube which supports mutating admission webhooks on the API server. This means a cluster that is at least version 1.9. In addition, mutating webhooks need to be enabled on the API server by including MutatingAdmissionWebhook as a value in the --admission-control flag. The order in which admission controllers are listed in this flag matters, so check the Using Admission Controllers doc.
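
As a sketch, the flag takes an ordered, comma-separated list. One common ordering from the 1.9-era documentation is shown below, but consult the doc for the recommended set for your version:

kube-apiserver [...] \
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota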

There are also a couple of VPA-specific requirements:

While VPA version 0.3 requires Kube version 1.9 or over, VPA versions 0.4 and 0.5 require a cluster that is version 1.11 or over.

For the recommender to be able to ingest pod metrics, metrics-server also needs to be running on the cluster; as noted above, it can be supplemented with historical metrics from a time-series database like Prometheus.

Metrics-server isn’t typically installed on the cluster by default, but the easiest way to install it is using Helm:


helm install --name metrics-server --namespace kube-system stable/metrics-server

  

Alternatively, Minikube includes it as a bundled add-on:


minikube addons enable metrics-server

  

You can confirm metrics-server is operating as expected when you are able to view current resource consumption using kubectl top:


kubectl top nodes
NAME       CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%   
master0   411m         13%       980Mi           17%
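
If you’re curious about the raw payload the recommender will consume, you can also query the resource metrics API directly:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"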

  

You can then install the VPA controllers. The official VPA repo has a bash script you can run to install the Kube resources for each controller. Alternatively, you can clone the Helm chart that we’ve made available as part of this blog post:


git clone git@github.com:livewyer-ops/verticalpodautoscaler.git

  

The only thing you may need to configure in the Helm chart is the location of your Prometheus instance. This is applied using the prometheus.url value in values.yaml:


prometheus:
 url: http://prometheus.monitoring.svc

  

Now you can install the Helm chart:


helm install --name autoscale --namespace kube-system .

  

When you install the helm chart, you will see pods for the three VPA controllers:


kubectl get pods -n kube-system
NAME                                                 READY     STATUS    RESTARTS   AGE
autoscale-vpa-admissioncontroller-74d489d767-hnp9c   1/1       Running   0          26m
autoscale-vpa-recommender-5944df6c7f-4zht4           1/1       Running   0          26m
autoscale-vpa-updater-cd668b489-jqc6b                1/1       Running   0          26m
metrics-server-77fddcc57b-c2mzc                      1/1       Running   3          7d21h

  

There is a secret called vpa-tls-certs which is mounted into the admission controller and contains a cert bundle.

In the install script, this bundle is generated using a shell script, but in the Helm chart we have the luxury of the Sprig library, and so this process is scripted using template functions:


{{- $altNames := list ( printf "%s.%s" (include "vpa.name" .) .Release.Namespace ) ( printf "%s.%s.svc" (include "vpa.name" .) .Release.Namespace ) -}}
{{- $ca := genCA "vpa-ca" 3650 -}}
{{- $server := genSignedCert ( include "vpa.name" . ) nil $altNames 3650 $ca -}}

  

The ca object generated by genCA will contain a CA cert and key, so we can embed these in place in the Helm template:


 caCert.pem: {{ b64enc $ca.Cert }}
 caKey.pem: {{ b64enc $ca.Key }}

  

The server object will contain a certificate and key signed by the CA we just created, and these we can also embed in the Helm template:


 serverCert.pem: {{ b64enc $server.Cert }}
 serverKey.pem: {{ b64enc $server.Key }}

  

We can also confirm that the admission controller is registered as a mutating webhook by checking its log:


I0617 13:36:44.847929       7 v1beta1_fetcher.go:84] Initial VPA v1beta1 synced successfully
I0617 13:36:44.851884       7 config.go:62] client-ca-file=-----BEGIN CERTIFICATE-----
[...]
-----END CERTIFICATE-----
I0617 13:36:54.877379       7 config.go:131] Self registration as MutatingWebhook succeeded.

  

Using VPAs

Once the three controllers are deployed, we are ready to start using VPA.

For this demonstration, we will use the modified php-apache container from the Horizontal Pod Autoscaler Walkthrough. If you’re not familiar with that example, this apache image is modified to create some additional computational load when its index page is accessed.
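
For reference, the index.php baked into that image looks roughly like this (reproduced from the Kubernetes HPA walkthrough):

<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>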

First of all, deploy the php-apache image and a service for the php-apache pods. A VPA object is also deployed:


cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
 name: php-apache-vpa
 namespace: dev
spec:
 targetRef:
   apiVersion: apps/v1
   kind: Deployment
   name: php-apache

---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: php-apache
  name: php-apache
  namespace: dev
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: php-apache
  type: ClusterIP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: php-apache
  name: php-apache
  namespace: dev
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      run: php-apache
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - image: gcr.io/google_containers/hpa-example
        imagePullPolicy: Always
        name: php-apache
        resources:
          requests:
            cpu: 1m
EOF

This should generate a deployment, service, and VPA object for you:


verticalpodautoscaler.autoscaling.k8s.io/php-apache-vpa created
service/php-apache created
deployment.apps/php-apache created

kubectl get pods -n dev
NAME                          READY     STATUS    RESTARTS   AGE
php-apache-59759c4b98-rczhn   1/1       Running   0          37s
php-apache-59759c4b98-swvcv   1/1       Running   0          37s

In order to use VPA, it seems to be a requirement that a targeted workload runs at least two replicas. In my testing, the updater was unable to evict pods in cases where only a single replica was deployed, as indicated by this log:


I0617 13:39:48.819919       6 pods_eviction_restriction.go:209] too few replicas for ReplicaSet dev/php-apache-7bfdf49c69. Found 1 live pods
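
If you do need to vertically autoscale a single-replica workload, the updater exposes a flag for this minimum (at the time of writing, --min-replicas, which defaults to 2). A hedged sketch of adding it to the updater deployment installed earlier, assuming the deployment is named autoscale-vpa-updater and its container already defines an args list; bear in mind that evicting the only replica means a brief outage while the replacement starts:

kubectl patch deployment autoscale-vpa-updater -n kube-system --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--min-replicas=1"}]'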

  

In order to demonstrate the usage of the VPA and make it more likely that pod eviction would take place, the pods in this test deployment are assigned a deliberately low default CPU request value of 1 millicore:


kubectl get pod -n dev -o=custom-columns=NAME:.metadata.name,PHASE:.status.phase,CPU-REQUEST:.spec.containers[0].resources.requests.cpu

NAME                          PHASE     CPU-REQUEST
php-apache-7bfdf49c69-gf2jd   Running   1m
php-apache-7bfdf49c69-pjb98   Running   1m

Once the workload is deployed on the cluster, the recommender will detect the new VPA object and fetch the available metrics for the pods and containers it targets via the metrics API. When this is complete, the recommender will update the status block of the VPA object with its recommendations:


kubectl describe vpa php-apache-vpa -n dev

Name:         php-apache-vpa
Namespace:    dev
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling.k8s.io/v1beta2","kind":"VerticalPodAutoscaler","metadata":{"annotations":{},"name":"php-apache-vpa","namespace":"dev"},"spec...
API Version:  autoscaling.k8s.io/v1beta2
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2019-06-17T11:29:25Z
  Generation:          4
  Resource Version:    288396
  Self Link:           /apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa
  UID:                 2954b910-90f3-11e9-aae4-080027655ff0
Spec:
  Resource Policy:
    Container Policies:
      Container Name:  *
      Max Allowed:
        Memory:  1Gi
  Target Ref:
    API Version:  extensions/v1beta1
    Kind:         Deployment
    Name:         php-apache
  Update Policy:
    Update Mode:  Recreate
Status:
  Conditions:
    Last Transition Time:  2019-06-17T11:30:04Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  php-apache
      Lower Bound:
        Cpu:     25m
        Memory:  262144k
      Target:
        Cpu:     25m
        Memory:  262144k
      Uncapped Target:
        Cpu:     25m
        Memory:  262144k
      Upper Bound:
        Cpu:     5291m
        Memory:  1Gi

Because we have sent no load to the php-apache replicas yet, the metrics reported for these pods are minimal, and so the recommended resources are the minimum that can be set. For CPU this seems to be 25m, and 262144k (roughly 262MB) for memory.

Now let’s generate some load on php-apache. Open a new terminal window and run a BusyBox container in the same namespace:


kubectl run -i --tty load-generator -n dev --image=busybox:1.27 /bin/sh

  

When you have an interactive shell in the BusyBox container, run a looped wget targeted at the php-apache service. If successful, you should see OK! flooding the output:


while true; do wget -q -O- http://php-apache.dev.svc.cluster.local; done

  

Wait a few minutes for the recommender to run again, and perform another kubectl describe to check the current recommendation. After a minute or so of load, the recommended CPU target had increased to 587m:


kubectl describe vpa php-apache-vpa -n dev

Name:         php-apache-vpa
Namespace:    dev
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling.k8s.io/v1beta2","kind":"VerticalPodAutoscaler","metadata":{"annotations":{},"name":"php-apache-vpa","namespace":"dev"},"spec...
API Version:  autoscaling.k8s.io/v1beta2
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2019-06-17T15:55:53Z
  Generation:          4
  Resource Version:    329553
  Self Link:           /apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa
  UID:                 62a76e03-9118-11e9-aae4-080027655ff0
Spec:
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         php-apache
Status:
  Conditions:
    Last Transition Time:  2019-06-17T15:56:08Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  php-apache
      Lower Bound:
        Cpu:     25m
        Memory:  262144k
      Target:
        Cpu:     587m
        Memory:  262144k
      Uncapped Target:
        Cpu:     587m
        Memory:  262144k
      Upper Bound:
        Cpu:     17662m
        Memory:  664103245
Events:  <none>

We can see from the recommender’s logs that when a recommendation is generated, the result is written to the VPA object on the cluster in the form of a patch request:


I0617 15:45:08.894873       1 metrics_client.go:69] 30 podMetrics retrieved for all namespaces
I0617 15:45:08.897170       1 cluster_feeder.go:376] ClusterSpec fed with #60 ContainerUsageSamples for #30 containers
I0617 15:45:08.897298       1 recommender.go:183] ClusterState is tracking 30 PodStates and 1 VPAs
I0617 15:45:08.898760       1 request.go:897] Request Body: [{"op":"add","path":"/status","value":{"recommendation":{"containerRecommendations":[{"containerName":"php-apache","target":{"cpu":"627m","memory":"262144k"},"lowerBound":{"cpu":"187m","memory":"262144k"},"upperBound":{"cpu":"46399m","memory":"993517772"},"uncappedTarget":{"cpu":"627m","memory":"262144k"}}]},"conditions":[{"type":"RecommendationProvided","status":"True","lastTransitionTime":"2019-06-17T15:37:08Z"}]}}]
I0617 15:45:08.924479       1 round_trippers.go:405] PATCH https://10.96.0.1:443/apis/autoscaling.k8s.io/v1beta2/namespaces/dev/verticalpodautoscalers/php-apache-vpa 200 OK in 24 milliseconds
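
If you just want the current target without wading through the full describe output, a jsonpath query against the status block works too:

kubectl get vpa php-apache-vpa -n dev \
  -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'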

  

After a couple of minutes, you should see the pods get recreated. Check the CPU requests on these new pods; they should match the target recommendation:


kubectl get pod -n dev -o=custom-columns=NAME:.metadata.name,PHASE:.status.phase,CPU-REQUEST:.spec.containers[0].resources.requests.cpu

NAME                              PHASE     CPU-REQUEST
load-generator-66fb94857f-d4q2b   Running   <none>
php-apache-7bfdf49c69-p5klm       Running   587m
php-apache-7bfdf49c69-xp56w       Running   587m

If we check the logs for the updater, we can see that it has noticed the pods targeted by the VPA and evicted them from the cluster:


I0617 13:36:48.818914       6 api.go:99] Initial VPA synced successfully
I0617 13:36:48.819475       6 reflector.go:131] Starting reflector *v1.Pod (1h0m0s) from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/updater/logic/updater.go:196
I0617 13:36:48.819532       6 reflector.go:169] Listing and watching *v1.Pod from k8s.io/autoscaler/vertical-pod-autoscaler/pkg/updater/logic/updater.go:196
I0617 13:40:48.819590       6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-p86nb with priority 2.62144001099e+11
I0617 13:40:48.819673       6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-8b9m9 with priority 2.62144001099e+11
I0617 13:40:48.819692       6 updater.go:147] evicting pod php-apache-7bfdf49c69-p86nb
I0617 13:40:48.844718       6 updater.go:205] Event(v1.ObjectReference{Kind:"Pod", Namespace:"dev", Name:"php-apache-7bfdf49c69-p86nb", UID:"bbb1f53c-9103-11e9-aae4-080027655ff0", APIVersion:"v1", ResourceVersion:"306480", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
I0617 13:41:48.819643       6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-8b9m9 with priority 2.62144001099e+11
I0617 13:41:48.819682       6 update_priority_calculator.go:118] pod accepted for update php-apache-7bfdf49c69-j9zl4 with priority 2.62144001099e+11
I0617 13:41:48.819689       6 updater.go:147] evicting pod php-apache-7bfdf49c69-8b9m9
I0617 13:41:48.833147       6 updater.go:205] Event(v1.ObjectReference{Kind:"Pod", Namespace:"dev", Name:"php-apache-7bfdf49c69-8b9m9", UID:"618670f6-9105-11e9-aae4-080027655ff0", APIVersion:"v1", ResourceVersion:"308483", FieldPath:""}): type: 'Normal' reason: 'EvictedByVPA' Pod was evicted by VPA Updater to apply resource recommendation.
I0617 13:42:07.321571       6 reflector.go:357] k8s.io/autoscaler/vertical-pod-autoscaler/pkg/target/v1beta1_fetcher.go:80: Watch close - *v1beta1.VerticalPodAutoscaler total 4 items received

  

Once the pods have been evicted and the ReplicaSet that manages these php-apache pods notices they are missing, it will send a request to the kube-api to create two new pods. These requests will also be noticed by the mutating webhook admission controller. Because these two pods are managed by a VPA with recommendations set, and the VPA is set to allow the admission controller to mutate these pods, the admission controller will inject the recommended resources into the pod spec:


I0617 15:36:20.118442       6 server.go:62] Admitting pod {php-apache-59759c4b98-% php-apache-59759c4b98- dev    0 0001-01-01 00:00:00 +0000 UTC   map[pod-template-hash:59759c4b98 run:php-apache] map[] [{apps/v1 ReplicaSet php-apache-59759c4b98 a76f33fd-9115-11e9-aae4-080027655ff0 0xc0004d8127 0xc0004d8128}] nil [] }
I0617 15:36:20.118961       6 recommendation_provider.go:108] updating requirements for pod php-apache-59759c4b98-%.
I0617 15:36:20.119104       6 recommendation_provider.go:97] Let's choose from 1 configs for pod dev/php-apache-59759c4b98-%
I0617 15:36:20.119156       6 recommendation_provider.go:68] no matching recommendation found for container php-apache
I0617 15:36:20.119224       6 server.go:259] Sending patches: [{add /spec/containers/0/resources {map[] map[]}} {add /spec/containers/0/resources/requests map[]} {add /metadata/annotations map[vpaUpdates:Pod resources updated by php-apache-vpa: container 0: ]}]

  

Admittedly, I did have some trouble getting the admission controller to work. I initially suspected that the configmap containing the cert bundle had an invalid configuration, but when I checked the API server log, I noticed that the webhook wasn’t being called - I had a vpa-webhook service in the same namespace as the admission controller pod, but its selectors were misconfigured, so there was no endpoint:


W0617 14:06:49.864288       1 dispatcher.go:70] Failed calling webhook, failing open vpa.k8s.io: failed calling webhook "vpa.k8s.io": Post https://vpa-webhook.kube-system.svc:443/?timeout=30s: dial tcp 10.102.151.220:443: connect: connection refused
E0617 14:06:49.864322       1 dispatcher.go:71] failed calling webhook "vpa.k8s.io": Post https://vpa-webhook.kube-system.svc:443/?timeout=30s: dial tcp 10.102.151.220:443: connect: connection refused

  

Now inspect the pods altered by the admission controller to see what modifications have taken place. In addition to the CPU and memory requests, an annotation has been added to indicate that this pod was altered by VPA. Note as well that there are no resource limits set.


kubectl get pod php-apache-59759c4b98-7z86g -n dev -o yaml --export

apiVersion: v1
kind: Pod
metadata:
  annotations:
    vpaUpdates: 'Pod resources updated by php-apache-vpa: container 0: cpu request, memory request'
  creationTimestamp: null
  generateName: php-apache-59759c4b98-
  labels:
    pod-template-hash: 59759c4b98
    run: php-apache
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: php-apache-59759c4b98
    uid: a76f33fd-9115-11e9-aae4-080027655ff0
  selfLink: /api/v1/namespaces/dev/pods/php-apache-59759c4b98-7z86g
spec:
  containers:
  - image: gcr.io/google_containers/hpa-example
    imagePullPolicy: Always
    name: php-apache
    resources:
      requests:
        cpu: 627m
        memory: 262144k
[...]

Interestingly, these changes are not propagated to the deployment that manages these pods:


apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: null
  generation: 1
  labels:
    run: php-apache
  name: php-apache
  selfLink: /apis/extensions/v1beta1/namespaces/dev/deployments/php-apache
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      run: php-apache
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: php-apache
    spec:
      containers:
      - image: gcr.io/google_containers/hpa-example
        imagePullPolicy: Always
        name: php-apache
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status: {}

  

So these request injections from the admission controller aren’t meant to replace the default requests that you set in your deployment. If you redeploy an application to the cluster multiple times a day, because of the way changes are applied to resources in Kube, it makes sense not to set any requests in the deployment and to defer these to the VPA. This way, your current VPA-controlled requests will be preserved when the live deployment on the cluster is patched with changes using kubectl apply.

And that brings us to the end of our experimentation with VPAs. If you’ve made it this far, we think you deserve a pint! Check back soon for part 3 of our autoscaling series, which will delve into the world of…
