Configure default optimization settings

Define default settings at the workload, namespace, and cluster levels by using annotations

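Learn how to:

  • Automatically deploy recommendations
  • Set a schedule
  • Set deployment thresholds
  • Configure CPU and memory optimization goals
  • Specify what to optimize
  • Change the limit-to-request ratio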

Note: You can learn more about how Optimize Live generates recommendations in the Concepts topic.

Key Points

  • Use the Optimize Live UI to configure optimization settings for one workload at a time, which can be helpful in testing workloads.

  • Use annotations to configure optimization settings at scale at the workload, namespace, and cluster levels.

    This topic doesn’t discuss annotations in detail. If you plan to use annotations, take a few minutes to read the Configure by using annotations topic. It lists the supported annotations and helps you understand where to place annotations to configure settings at the workload, namespace, and cluster levels.

  • Workloads whose settings are managed in any way by annotations become view-only in the UI.

    To edit the workload in the UI, you must remove the cluster-defaults ConfigMap, namespace-level annotations, and workload-level annotations that pertain to the workload.

Automatically deploy recommendations

By default, Optimize Live doesn’t apply (deploy) recommendations automatically. See the best practices below.

Best practices:

  • Review and selectively apply the first few recommendations to see how they affect your workloads. Then, as you trust recommendations more, you can enable automatic deployment on more workloads.
  • To reduce pod churn from having Optimize Live apply recommendations that make only small changes, you can set deployment thresholds. Optimize Live will apply a recommendation only if the threshold is met.

Steps

Check whether the StormForge Applier is installed. On the command line, run:

helm ls -a -n stormforge-system

If you don’t see the stormforge-applier-2.*.* chart listed in the CHART column of the output, install it.
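
If you want to script this check, the following is a minimal sketch rather than an official tool. It assumes you capture the output of the same command with Helm's JSON output option (helm ls -a -n stormforge-system -o json) and that each release entry carries a "chart" field. The function name applier_installed and the sample release data are hypothetical.

```python
import json
from fnmatch import fnmatch


def applier_installed(helm_json: str,
                      pattern: str = "stormforge-applier-2.*.*") -> bool:
    """Return True if any release in the Helm JSON output uses a matching chart."""
    releases = json.loads(helm_json)
    return any(fnmatch(release.get("chart", ""), pattern) for release in releases)


# Sample data shaped like `helm ls -a -n stormforge-system -o json`.
# The release names and chart versions are made up for illustration.
sample = json.dumps([
    {"name": "stormforge-agent", "chart": "stormforge-agent-2.10.0"},
    {"name": "stormforge-applier", "chart": "stormforge-applier-2.3.1"},
])

print(applier_installed(sample))
```

In practice you would pass the captured command output to applier_installed instead of the sample string.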

Next, configure auto-deployment using one of these methods:
UI:

  1. In the left navigation, click Optimize Live > Workloads, then find and click the name of the workload you want to work with.
  2. On the workload details page, click Config, and set Automatic Deployment to On.
  3. Click Update to save your changes.

Annotations:
Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details):

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with the live.stormforge.io/auto-deploy annotation. Example: live.stormforge.io/auto-deploy: "true"
  • Cluster level: Add the autoDeploy: "VALUE" parameter:value pair to the clusterDefaultConfig values in a cluster-defaults.yaml file. Example: autoDeploy: "true".

Set a schedule

Specify how often you want Optimize Live to generate recommendations for a workload.

Best practices:

  • Generating recommendations once daily (default value) is considered a best practice in stable or production environments.

  • Review recommendations manually the first few times before enabling auto-deployment, so that you can see how recommendations affect your workloads.

    When you’re ready to have Optimize Live apply scheduled recommendations automatically, be sure to install the StormForge Applier and enable automatic deployment.

Key points:

  • Short intervals (such as hourly or every few hours) track utilization closely and produce quick-changing, short-lived recommendations. This is useful in automatic deployment mode.
  • Longer intervals (such as weekly) produce longer-lived recommendations, which is useful for integrating with slower-moving CI or applying recommendations manually.

Steps

To specify how often recommendations are generated and how long they’re valid for:
UI:

  1. In the left navigation, click Optimize Live > Workloads, then find and click the name of the workload you want to work with.
  2. On the Workload page, click Config.
  3. In the Recommendation Schedule section, specify how often you want to receive recommendations.
  4. To save your changes, click Update.

Annotations:
Use macros, ISO 8601 Duration strings like "P1D", or Cron format to specify a schedule. Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details):

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with the live.stormforge.io/schedule annotation. Example: live.stormforge.io/schedule: "@daily"
  • Cluster level: Add the schedule: "VALUE" parameter:value pair to the clusterDefaultConfig values in a cluster-defaults.yaml file. Example: schedule: "P1D".

Set thresholds to skip applying recommendations that propose small changes (set deployment thresholds)

Optional. Set the smallest change to requests that makes it worthwhile to deploy a new recommendation. Setting thresholds can help to reduce pod churn.

You can specify a percent-change threshold, a unit-change threshold, or both.

  • Percent-change is the smallest percent change to requests that makes it worthwhile to deploy the recommendation.
  • Unit-change is the smallest amount of change (in the units you specify) that makes it worthwhile to deploy the recommendation.

Key points:

  • Available only to workloads that have automatic deployment enabled (auto-deploy=true).

  • Important: Thresholds are not preserved when you disable automatic deployment on a workload. If you need to enforce standard thresholds across workloads in your organization, consider creating a YAML file that defines default values.

  • If you enable a threshold, you must specify a positive integer value for both CPU and memory.

  • You can set these for individual workloads, but setting them at the cluster level provides the most value.

  • Unit changes work well for small workloads, and percent changes work well for larger workloads.

  • A recommendation is applied to a workload if, for at least one container, the recommended CPU change or the recommended memory change (or both) satisfies all enabled thresholds for that resource.

    Example: Suppose you enable both the percent-change and unit-change thresholds for a workload. A recommendation will be applied if either (or both) of the following is true:

    • The recommendation satisfies both the CPU percent- and unit-change threshold values.
    • The recommendation satisfies both the memory percent- and unit-change threshold values.
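
The rule above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not Optimize Live's implementation. The type and function names are hypothetical, and the sketch assumes that a change is measured as an absolute difference, so increases and decreases count equally.

```python
from typing import NamedTuple, Optional, Sequence


class Change(NamedTuple):
    current: float       # current request value
    recommended: float   # recommended request value


class ContainerChange(NamedTuple):
    cpu: Change      # values in millicores
    memory: Change   # values in Mi


def meets(change: Change,
          min_percent: Optional[float],
          min_unit: Optional[float]) -> bool:
    """True if the change satisfies every enabled threshold for one resource."""
    delta = abs(change.recommended - change.current)
    if min_percent is not None:
        # Assumption: a zero current value counts as an unbounded percent change.
        percent = delta / change.current * 100 if change.current else float("inf")
        if percent < min_percent:
            return False
    if min_unit is not None and delta < min_unit:
        return False
    return True


def should_apply(containers: Sequence[ContainerChange],
                 cpu_percent: Optional[float] = None,
                 cpu_unit: Optional[float] = None,
                 mem_percent: Optional[float] = None,
                 mem_unit: Optional[float] = None) -> bool:
    """Apply if CPU or memory passes its thresholds for at least one container."""
    return any(
        meets(c.cpu, cpu_percent, cpu_unit) or meets(c.memory, mem_percent, mem_unit)
        for c in containers
    )


# CPU changes by 4% / 4m (fails); memory changes by 15% / 30Mi (passes).
memory_passes = [ContainerChange(Change(100, 104), Change(200, 230))]
# CPU changes by 8% / 8m (fails the 10m unit threshold); memory by 2.5% / 5Mi (fails).
nothing_passes = [ContainerChange(Change(100, 108), Change(200, 205))]

limits = dict(cpu_percent=5, cpu_unit=10, mem_percent=10, mem_unit=10)
print(should_apply(memory_passes, **limits))
print(should_apply(nothing_passes, **limits))
```

The first workload qualifies because its memory change clears both memory thresholds. The second does not qualify because neither resource clears all of its enabled thresholds.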
Steps

To configure thresholds:
UI:

  1. In the left navigation, click Optimize Live > Workloads, then find and click the name of the workload you want to work with.
  2. On the Workload page, click Config, and set Automatic Deployment to On if it isn’t already set.
  3. Enable either the percent-change or unit-change threshold, or both, based on the amount of change you want to enforce.
  4. For CPU and Memory, set the smallest amount of change to requests that would make it worthwhile to deploy the recommendation.
  5. To save your changes, click Update.

To remove a threshold, click its toggle to the off position and then click Update.

Annotations:
Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details). Remember to include the live.stormforge.io/auto-deploy: "true" annotation.

More detailed examples are shown in the Configure by using annotations topic.

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with one or both of the following pairs of annotations. Replace the sample values with the values you need.

    Percent-change:

    live.stormforge.io/auto-deploy.thresholds.cpu.percent: "5"
    live.stormforge.io/auto-deploy.thresholds.memory.percent: "10"
    

    Unit-change:

    live.stormforge.io/auto-deploy.thresholds.cpu.unit: "10m"
    live.stormforge.io/auto-deploy.thresholds.memory.unit: "10Mi"
    
  • Cluster level: Add either or both of the following parameter:value pairs to the clusterDefaultConfig values in a cluster-defaults.yaml file. Remember to include the autoDeploy: "true" pair. Replace the sample values with the values you need.

    Percent-change:

    autoDeployThresholdsCpuPercent: "5"
    autoDeployThresholdsMemoryPercent: "10"
    

    Unit-change:

    autoDeployThresholdsCpuUnit: "10m"
    autoDeployThresholdsMemoryUnit: "10Mi"  
    

CLI:
Run stormforge edit workload handlers with either or both pairs of arguments. Remember to include the --auto-deploy (or equivalent --auto-deploy=true) argument. Replace VALUE and VALUE_AND_UNIT with the values you need.

Percent-change arguments:

  --auto-deploy-thresholds-min-percent-change-cpu=VALUE
  --auto-deploy-thresholds-min-percent-change-memory=VALUE

Unit-change arguments:

  --auto-deploy-thresholds-min-unit-change-cpu=VALUE_AND_UNIT
  --auto-deploy-thresholds-min-unit-change-memory=VALUE_AND_UNIT

Examples:

  • Apply only recommendations that change CPU requests by at least 5% for at least one container. Even if memory requests don’t matter to you in this example, you must still set the memory threshold to a positive integer value.

    stormforge edit workload WORKLOAD_NAME handlers \
    --auto-deploy=true \
    --auto-deploy-thresholds-min-percent-change-cpu=5 \
    --auto-deploy-thresholds-min-percent-change-memory=1
    
  • Apply only recommendations that satisfy either (or both) of the following conditions for at least one container:

    • CPU requests change by at least 5% and 10m

    • Memory requests change by at least 10% and 10Mi

      stormforge edit workload WORKLOAD_NAME handlers \
      --auto-deploy=true \
      --auto-deploy-thresholds-min-percent-change-cpu=5 \
      --auto-deploy-thresholds-min-percent-change-memory=10 \
      --auto-deploy-thresholds-min-unit-change-cpu=10m \
      --auto-deploy-thresholds-min-unit-change-memory=10Mi 
      

To remove a threshold, run stormforge edit workload handlers with the appropriate threshold values set to 0. For example, to remove the unit-change threshold, run the following command:

stormforge edit workload WORKLOAD_NAME handlers \
  --auto-deploy=true \
  --auto-deploy-thresholds-min-unit-change-cpu=0 \
  --auto-deploy-thresholds-min-unit-change-memory=0 

To see additional arguments (for example, for applying these settings to all workloads in a cluster or in a specific workspace), see stormforge edit workloads handlers or run stormforge edit workloads handlers --help.

Configure CPU and memory optimization goals

Based on your risk profile, choose a value for the CPU optimization goal and memory optimization goal:

  • savings: Provides recommendations that are closer to actual CPU or memory usage. Consider this option when you want to maximize your resource savings.
  • balanced: Default value. Provides a balanced approach to achieving cost savings and increased reliability.
  • reliability: Minimizes the risk of hitting CPU or memory limits. Consider this option for business-critical applications.

You can set different CPU and memory optimization goals. For example, if your organization can tolerate throttling when containers exceed CPU limits but cannot tolerate restarts when containers exceed memory limits, you can set a savings goal for CPU and a reliability goal for memory.

Steps

To configure optimization goals:
UI:

  1. In the left navigation, click Optimize Live > Workloads, then find and click the name of the workload you want to work with.
  2. On the Workload page, click Config and choose your optimization goal.
  3. To save your changes, click Update.

Annotations:
Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details):

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with one or both of the following annotations. Example:
    live.stormforge.io/cpu.optimization-goal: "reliability"
    live.stormforge.io/memory.optimization-goal: "savings"
    
  • Cluster level: Add either or both of the following parameter:value pairs to the clusterDefaultConfig values in a cluster-defaults.yaml file. Example:
    cpuOptimizationGoal: "reliability"
    memoryOptimizationGoal: "savings"
    

Specify what to optimize (set an optimization policy): requests and limits, requests, or nothing

By default, only CPU and memory requests (not limits) are optimized for each container in a workload. You can adjust this value to optimize requests and limits; you can also exclude containers (such as sidecar containers) from optimization.

Steps

To specify what to optimize:
UI:

  1. From the left navigation, click Optimize Live > Workloads, then find and click the name of the workload you want to work with.
  2. On the Workload page, click Config.
  3. In the Containers section, expand the container you want to work with.
  4. In the Configure CPU and Configure Memory sections, select what you want to optimize.
    • To exclude a container (such as a sidecar container) from optimization: In both the Configure CPU and Configure Memory sections, select Don’t optimize.
  5. Repeat as needed for containers in the workload.
  6. To save your changes, click Update.

Annotations:
Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details):

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with one or both of the following annotations. Example:

    live.stormforge.io/containers.cpu.optimization-policy: "RequestsAndLimits"
    live.stormforge.io/containers.memory.optimization-policy: "RequestsAndLimits"
    
  • Cluster level: Add either or both of the following parameter:value pairs to the clusterDefaultConfig values in a cluster-defaults.yaml file. Example:

    containersCpuOptimizationPolicy: "RequestsAndLimits"
    containersMemoryOptimizationPolicy: "RequestsAndLimits"
    

Examples: Setting container-specific defaults

Suppose you want to set container-specific defaults, as in these examples:

  • Default to optimizing CPU and memory requests only (and not limits). Several named containers are given exceptions: for the server and api containers, optimize requests and limits; for the sidecar container, do not optimize anything.

    • Workload or namespace level: In the workload YAML manifest or the namespace YAML manifest, add:
      live.stormforge.io/containers.cpu.optimization-policy: "RequestsOnly,server=RequestsAndLimits,api=RequestsAndLimits,sidecar=DoNotOptimize"
      live.stormforge.io/containers.memory.optimization-policy: "RequestsOnly,server=RequestsAndLimits,api=RequestsAndLimits,sidecar=DoNotOptimize"
      
    • Cluster level: In the clusterDefaultConfig values in the cluster-defaults.yaml file, add:
      containersCpuOptimizationPolicy: "RequestsOnly,server=RequestsAndLimits,api=RequestsAndLimits,sidecar=DoNotOptimize"
      containersMemoryOptimizationPolicy: "RequestsOnly,server=RequestsAndLimits,api=RequestsAndLimits,sidecar=DoNotOptimize"
      
  • Assume all containers are set to optimize RequestsOnly (the default value). Suppose you want to optimize the server container CPU and memory requests and limits, and not optimize the sidecar container:

    • In the workload YAML manifest or the namespace YAML manifest, add:
      live.stormforge.io/containers.cpu.optimization-policy: "server=RequestsAndLimits,sidecar=DoNotOptimize"
      live.stormforge.io/containers.memory.optimization-policy: "server=RequestsAndLimits,sidecar=DoNotOptimize"
      
    • In the clusterDefaultConfig values in a cluster-defaults.yaml file, add:
      containersCpuOptimizationPolicy: "server=RequestsAndLimits,sidecar=DoNotOptimize"
      containersMemoryOptimizationPolicy: "server=RequestsAndLimits,sidecar=DoNotOptimize"
      

    In this example, the optimization policies are updated for the server and sidecar containers only.
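
To make the value format easier to read, here is a small Python sketch that splits a policy string into a default and per-container exceptions. It only illustrates how the examples above are structured. The function names are hypothetical, and the handling of a missing default (falling back to RequestsOnly) is an assumption based on the documented default, not a description of how Optimize Live parses annotations.

```python
from typing import Dict, Tuple

VALID_POLICIES = {"RequestsOnly", "RequestsAndLimits", "DoNotOptimize"}


def check(policy: str) -> str:
    """Reject anything that is not one of the three documented policy values."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown optimization policy: {policy!r}")
    return policy


def parse_policy(value: str,
                 inherited_default: str = "RequestsOnly") -> Tuple[str, Dict[str, str]]:
    """Split a policy string into (default policy, {container name: policy})."""
    default = inherited_default
    overrides: Dict[str, str] = {}
    for part in (p.strip() for p in value.split(",")):
        if not part:
            continue
        if "=" in part:
            # A name=policy item is an exception for one named container.
            name, policy = (s.strip() for s in part.split("=", 1))
            overrides[name] = check(policy)
        else:
            # A bare value sets the default for all other containers.
            default = check(part)
    return default, overrides


def policy_for(container: str, default: str, overrides: Dict[str, str]) -> str:
    """Look up the effective policy for one container."""
    return overrides.get(container, default)


default, overrides = parse_policy(
    "RequestsOnly,server=RequestsAndLimits,api=RequestsAndLimits,sidecar=DoNotOptimize"
)
for name in ("server", "api", "sidecar", "worker"):
    print(name, policy_for(name, default, overrides))
```

Here the hypothetical worker container has no exception, so it falls back to the default policy.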

Change the limit-to-request ratio (limitRequestRatio)

Important: Change this container-level setting only if you have a specific limit-to-request ratio requirement to ensure that resource consumption doesn’t exceed requests.

Concepts and examples

Optimize Live uses this ratio to calculate recommended CPU and memory limits:
Recommended limits = recommended requests * limitRequestRatio

The default limitRequestRatio is 2.0, which means that the recommended limit will be double the recommended requests. For example, if the recommended CPU requests value is 100m, then the recommended limit would be 200m.

Optimize Live always calculates recommended CPU and memory limits values, but doesn’t apply them if a container is configured to optimize requests only.

You can configure the limitRequestRatio based on your container needs:

  • 1.0 provides Guaranteed Quality of Service
  • 2.0 is the default value
  • A custom value equal to or greater than 1.0 (to two decimal places) provides more room for spikes or changes in consumption

Using limitRequestRatio in conjunction with resource limits

You can use the limitRequestRatio and resource limits values together; they are not mutually exclusive. Optimize Live calculates the recommended limits value and then adjusts it if needed.

Suppose:

  • You want to ensure a workload has at least 2 cores for startup requirements.
  • You want a limitRequestRatio of 1.33.

Your container settings might look something like this:

containerSettings: 
  - cpu:
      requests:
        min: 20m
        max: 2000m
      limits:
        min: 2000m
        max: 16000m
        limitRequestRatio: 1.33

If Optimize Live recommends a cpu.requests value of 100m, then the calculated cpu.limits value is 133m (100 x 1.33), which is lower than cpu.limits.min of 2000m. Optimize Live would adjust the recommended cpu.limits value to 2000m to respect the cpu.limits.min=2000m setting.
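
The arithmetic in this section can be reproduced with a short Python sketch. It is a worked example under stated assumptions, not Optimize Live's implementation. The function name is hypothetical, values are in millicores, rounding to the nearest whole millicore is an assumption, and so is applying limits.max as an upper bound in the same way that limits.min is applied as a lower bound.

```python
from typing import Optional


def recommended_limit(requests_m: int,
                      ratio: float = 2.0,
                      limit_min_m: Optional[int] = None,
                      limit_max_m: Optional[int] = None) -> int:
    """Recommended limit = recommended requests * limitRequestRatio, then bounded."""
    if ratio < 1.0:
        raise ValueError("limitRequestRatio must be 1.0 or greater")
    limit = round(requests_m * ratio)
    if limit_min_m is not None:
        limit = max(limit, limit_min_m)   # raise to limits.min if below it
    if limit_max_m is not None:
        limit = min(limit, limit_max_m)   # assumption: cap at limits.max
    return limit


# Default ratio of 2.0: 100m requests give a 200m limit.
print(recommended_limit(100))
# Ratio of 1.33 with no bounds: 100m requests give a 133m limit.
print(recommended_limit(100, ratio=1.33))
# Same ratio with limits.min=2000m and limits.max=16000m: adjusted up to 2000m.
print(recommended_limit(100, ratio=1.33, limit_min_m=2000, limit_max_m=16000))
```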

Steps

To change the limitRequestRatio setting:
UI:

  1. Navigate to the workload details page and click Config.
  2. Expand the container you want to work with.
  3. In both the CPU and Memory sections, set the Limit Request Ratio to a value equal to or greater than 1.0 (up to two decimal places).
    Tip:
    • 1.0 provides Guaranteed Quality of Service
    • 2.0 is the default value
  4. Repeat as needed for other containers.
  5. To save your changes, click Update.

Annotations:
Decide at which level to set the value and annotate accordingly (refer to Configure by using annotations for details):

  • Workload or namespace level: Annotate the workload YAML manifest or the namespace YAML manifest with one or both of the following annotations. Example:
    live.stormforge.io/containers.cpu.limits.limit-request-ratio: "1.33"
    live.stormforge.io/containers.memory.limits.limit-request-ratio: "1.33"
    
  • Cluster level: Add either or both of the following parameter:value pairs to the clusterDefaultConfig values in a cluster-defaults.yaml file. Example:
    containersCpuLimitsLimitRequestRatio: "1.33"
    containersMemoryLimitsLimitRequestRatio: "1.33"
    
Examples: Setting container-specific defaults

Suppose you want to set container-specific defaults, as in these examples:

  • Set a ratio of 1.33 for all containers (overriding the default value of 2.0); specify exceptions for the server and api containers:

    • Workload or namespace level: In the workload YAML manifest or the namespace YAML manifest, add:
      live.stormforge.io/containers.cpu.limits.limit-request-ratio: "1.33,server=1.4,api=2.0"
      live.stormforge.io/containers.memory.limits.limit-request-ratio: "1.33,server=1.4,api=2.0"
      
    • Cluster level: In the clusterDefaultConfig values in the cluster-defaults.yaml file, add:
      containersCpuLimitsLimitRequestRatio: "1.33,server=1.4,api=2.0"
      containersMemoryLimitsLimitRequestRatio: "1.33,server=1.4,api=2.0"
      
  • Override the current value for the server container only:

    • In the workload YAML manifest or the namespace YAML manifest, add:

      live.stormforge.io/containers.cpu.limits.limit-request-ratio: "server=1.4"
      live.stormforge.io/containers.memory.limits.limit-request-ratio: "server=1.4"
      
    • In the clusterDefaultConfig values in the cluster-defaults.yaml file, add:

      containersCpuLimitsLimitRequestRatio: "server=1.4"
      containersMemoryLimitsLimitRequestRatio: "server=1.4"
      

Last modified March 12, 2024