Optimize Live release history

Version 0.7.8

  • Permissions issue during upgrade

    This release fixes a permissions issue that sometimes caused the TSDB to crash when upgrading an existing Optimize Live installation.

Version 0.7.7

  • All components now run as non-root

    Individual components (TSDB, Applier, Recommender, Grafana) now run with runAsNonRoot: true set in their PodSecurityContext. The Controller continues to run as non-root by default. This feature is helpful if you deploy Optimize Live in clusters that have security policies that require all containers to run as non-root.

  • Improved handling of Datadog rate limit errors

    The TSDB now gracefully handles HTTP 429 responses from the Datadog API. If Datadog is your metrics provider, you’ll see better performance when the Datadog rate limit is reached.
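One common way to handle HTTP 429 responses gracefully is to retry with a growing delay. The sketch below is illustrative only and is not the TSDB's actual implementation. The function name `call_with_backoff`, the `send` callable, and the use of an `X-RateLimit-Reset` header as a wait hint are assumptions made for the example.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    reset_header: str = "X-RateLimit-Reset",  # assumed header name
) -> Tuple[int, str]:
    """Call send() and retry while the server answers HTTP 429."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Prefer the server's wait hint; otherwise back off exponentially.
        fallback = base_delay * 2 ** attempt
        hint = headers.get(reset_header)
        try:
            delay = float(hint) if hint is not None else fallback
        except ValueError:
            delay = fallback
        sleep(delay)
    raise RuntimeError("rate limit still exceeded after retries")
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.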

Version 0.7.6


  • Support for DaemonSet optimization

    Optimize Live can now optimize DaemonSet workloads, resulting in additional resource savings.

  • You can now specify any Grafana image or version

    The Controller can now install Grafana using the image repository and tag that you specify in the Helm chart values.yaml file. Previously, the Controller installed the latest version of Grafana from the official registry only.

    In the values.yaml file, use this format:

            repository: docker.io/grafana/grafana
            pullPolicy: IfNotPresent
            tag: 8.2.0

Version 0.7.5


  • Support for workloads that scale based on custom metrics in the HorizontalPodAutoscaler

    Optimize Live now produces a recommendation to size the workload to best align with the currently configured HorizontalPodAutoscaler custom metric. Previously, CPU utilization metrics were the only supported HorizontalPodAutoscaler metric.

Version 0.7.4

Recommender and Controller

  • Show recommendations even if some workloads in an application fail

    Optimize Live now, by default, shows recommendations even if it couldn’t generate recommendations for all discovered workloads (for example, when workloads crash or fail, or when new workloads don’t yet have enough metrics data).

    Previously, recommendations were shown only if they were computed for all discovered workloads. To preserve this behavior, set FF_ONLY_COMPLETE_RECOMMENDATIONS=true in the extraEnvVars section of the Helm chart.
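The difference between the two behaviors can be sketched as follows. This is a simplified illustration, not the Recommender's actual code. The function name `recommendations_to_show` and the shape of `results` (workload name mapped to a recommendation, or `None` when none could be generated) are assumptions. Only the flag name comes from the release note.

```python
import os
from typing import Dict, Mapping, Optional


def recommendations_to_show(
    results: Mapping[str, Optional[dict]],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, dict]:
    """Return the recommendations that would be displayed."""
    only_complete = env.get("FF_ONLY_COMPLETE_RECOMMENDATIONS", "false").lower() == "true"
    # Keep only workloads for which a recommendation was generated.
    ready = {name: rec for name, rec in results.items() if rec is not None}
    if only_complete and len(ready) < len(results):
        # Earlier behavior: show nothing unless every workload has a result.
        return {}
    return ready
```

With the flag unset, a failed workload is simply left out of the result. With the flag set to true, one failed workload hides the whole set.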

UI enhancements

  • Launch from the left navigation

    Launch or switch between Optimize Live and Optimize Pro from the left navigation rather than from the tabs within an application. This update takes you to your applications and recommendations faster.

Version 0.7.3


  • Deleting a Live object now deletes the corresponding application

    When you delete a Live object from your cluster, Optimize Live now also deletes the application from the UI and the API. To restore the original behavior (in which the application isn’t deleted from the UI and API), label the Live object by running this command:

    kubectl label -n stormforge-system live/my-app live.optimize.stormforge.io/skipSync=skip

  • Grafana cleanup when uninstalling Optimize Live

    When you uninstall Optimize Live, we now ensure all Grafana processes are also deleted.


  • Backfill duration of 0s now kicks off metrics collection

    We now start collecting metrics when you configure the TSDB to skip backfilling (TSDB_BACKFILL_DURATION=0s). In previous releases, this setting didn’t kick off metrics collection.

Version 0.7.2


  • Expose recommendation count, recommendation tx/timestamp metrics

    The following Optimize Live metrics are now available via the /metrics endpoint:

    • optimize_live_recommendation_count, which displays the number of recommendations most recently received
    • optimize_live_recommendation_timestamp, which displays a timestamp of when the last set of recommendations was made
    • optimize_live_tsdb_series_timestamp, which displays a timestamp for each top-level metric we ingest (limits, requests, usage, and so on)

  • Limit Datadog query length when querying HPA metrics

    We now ensure that queries sent via the Datadog API don’t exceed Datadog’s maximum query length of 8000 characters. This check was missing when we first added support for HPA recommendations.
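One way to stay under a query length limit is to split a long list of query fragments into several shorter queries. The sketch below shows that idea under stated assumptions: the function name `batch_queries`, the comma separator, and the notion of joinable "parts" are hypothetical, and the release note does not describe how Optimize Live actually enforces the limit. Only the 8000-character figure comes from the note above.

```python
from typing import List, Sequence

MAX_QUERY_LENGTH = 8000  # maximum query length stated in the release note


def batch_queries(
    parts: Sequence[str],
    separator: str = ",",  # assumed joiner, for illustration only
    limit: int = MAX_QUERY_LENGTH,
) -> List[str]:
    """Join parts into as few queries as possible, each at most `limit` characters."""
    batches: List[str] = []
    current: List[str] = []
    current_len = 0
    for part in parts:
        if len(part) > limit:
            raise ValueError("a single query part exceeds the length limit")
        # Account for the separator only when the batch already has content.
        added = len(part) + (len(separator) if current else 0)
        if current and current_len + added > limit:
            batches.append(separator.join(current))
            current, current_len = [part], len(part)
        else:
            current.append(part)
            current_len += added
    if current:
        batches.append(separator.join(current))
    return batches
```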


  • Support for PVC-less TSDB

    You can now configure the TSDB to run without a PV/PVC by setting TSDB_NO_PVC="true". Because this makes the TSDB data ephemeral, you should do it only in specific situations. The TSDB_PVC_SIZE setting can still be used to set a size limit when there is no PVC.

  • Support for limitRequestRatio configuration parameter

    You can now configure how much headroom the limit recommendation adds on top of the request recommendation. As the name suggests, limitRequestRatio is the ratio of the limit to the request. The default is 1.2, which means that the limit recommendation is the request recommendation plus 20%.
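The arithmetic behind the ratio can be illustrated with a short sketch. The function name `limit_from_request` and the integer units are assumptions for the example, and the Recommender may round or clamp values differently.

```python
def limit_from_request(request: int, limit_request_ratio: float = 1.2) -> int:
    """Derive a limit from a request, for example in millicores or MiB."""
    if limit_request_ratio < 1:
        # Kubernetes does not allow a limit below its request.
        raise ValueError("limitRequestRatio must be at least 1")
    return round(request * limit_request_ratio)


cpu_limit = limit_from_request(500)          # 500m request with the default ratio
memory_limit = limit_from_request(256, 1.5)  # 256 MiB request with a custom ratio
print(cpu_limit, memory_limit)
```

With the default ratio of 1.2, a 500m CPU request yields a 600m limit. A ratio of 1.5 on a 256 MiB request yields 384 MiB.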

  • Reducing the number of reconciliations via a feature flag

    In large environments, you might choose to reduce the number of watches on the API server. To configure the Controller to stop watching the components that it owns, set the FF_NO_OWNS environment variable. When this variable is set, the Controller no longer watches for events from the TSDB, Recommender, or Applier resources.

  • Add diff on TSDB and Recommender ConfigMaps when debug mode is enabled

    When DEBUG is enabled, you’ll see a diff of the TSDB and Recommender ConfigMaps in the logs, making it easier to discover what was changed during a reconcile.

  • Sort discovered HPAs

    When multiple HPAs are configured for a target, we now sort this list to prevent unnecessary configuration churn.

  • One lookup for CPU and Memory targets

    We now do one lookup for both CPU and memory targets. Previously, we did separate lookups for CPU and memory targets, which could result in different sets of targets being matched for CPU and memory recommendations.

  • Change the log level to error when no targets found

    For easier troubleshooting, we now set the log level to error when no targets are found. Previously, because we would query again, we would log this at the info level.

  • Use Interval instead of deprecated UpdateInterval

    We now use Interval only.

Version 0.7.1

Controller and Applier

  • Support for reducing the resources used by a cluster

    If you have many applications (for example, upwards of a couple hundred) and apply recommendations conservatively (for example, every few days), you can set FF_PATCHER="true" in the extraEnvVars section of your Helm chart. This consolidates and simplifies the cluster component stack and does not negatively affect cluster performance.

  • Support for persisting patches to ConfigMaps

    When you set FF_PATCHER="true", you can now have the Controller write a patch to a ConfigMap by setting FF_PERSIST_PATCH="true". Writing a patch to a ConfigMap is useful for troubleshooting cluster resource use.

  • Improved HPA logging

    If no HPA targets are discovered, this information is now logged at the info level. Previously, it was logged at the error level.

  • No recommendations in an HPA setup

    In some HPA setups, the Recommender might not discover HPA targets and therefore cannot generate recommendations. Sometimes this scenario occurs because the Kubernetes version and kube-state-metrics version are not compatible.

    Workaround: Try downgrading your kube-state-metrics version.

Version 0.7.0

  • Updated the Helm chart default to DEBUG=false and changed DEBUG to a Boolean in the values schema
  • Grafana updates simplify the information you see on dashboards

UI enhancements

On the Configure Recommendations page:

  • In the Optional Settings section, you can now specify the CPU target utilization of the HPA recommendation.
  • In the Advanced Settings section, you can choose either of the following:
    • Enable Guaranteed Quality of Service.
    • Exclude Memory Limits, CPU limits, or both from recommendations.

To access the Advanced Settings, contact your StormForge sales representative.

Version 0.6.0

  • Added support for HPA constraints for min and max target CPU utilization
  • Added support for collecting min and max replica metrics to provide better recommendations

Version 0.5.0 (HPA support)


  • Support for jsonpath custom patches
  • Support for generating HPA patches


  • Support for bidimensional autoscaling
  • Support for providing target utilization recommendations alongside CPU and memory
  • Enabled HPA lookup by default
  • Added recommendation labels to the dashboard to better filter results
  • We now correctly look up existing Live resources when syncing from the API

UI enhancements

  • Added a progress bar that shows the progress of the TSDB backfill
  • Added support for maximum CPU and memory limits
  • Added a clusters list page

Version 0.4.0


  • Support for custom patches:
    • You can now create a Live custom resource definition (CRD) to provision and configure a new Optimize Live instance
    • You can now apply recommendations via a Live object
  • Support for pods with multiple containers
  • Support for arm64 architecture
  • Updated Grafana dashboards:
    • You can choose low, medium, or high risk tolerance for both CPU and memory when viewing recommendation summaries
    • You can now see HPA-related data
  • The Recommender now provides recommendations that honor the maximum limits that you specify
  • You can now specify the following values in a Live object:
    • Maximum bound for CPU and memory requests
    • Minimum and maximum CPU and memory limits


  • Significantly reduced the TSDB ConfigMap size, which now allows 700+ targets per Live object (previously 100+). For testing and troubleshooting, you can still add raw queries to the Controller’s configuration file, but you must add them manually.

Version 0.3.0

UI enhancements

  • You can now set CPU and memory minimum limits when you configure recommendations
  • Deleting an application in the UI now also deletes the application from the cluster
  • A progress bar now displays data backfilling progress
  • New search capability helps you to find your applications faster

Version 0.2.2

  • Qualify Datadog metrics with cluster name
  • Suppress log messages during backfill of data
  • Fixed a bug that could cause the Recommender to stall

Version 0.2.1

  • Added support for non-standard ReplicaSet owners (for example, rollouts)
  • The Grafana dashboard has been updated to highlight the containers’ maximum usage
  • The Recommender now supports a varying number of replicas
  • The TSDB allows for customization of the persistent volume
  • Added a DEBUG log level for all components
  • The Controller supports proxies
  • Beta support for Datadog as a metrics provider

Version 0.1.6

Optimize Live Launch

  • The Controller deploys the TSDB, the Recommender, the Applier, and Grafana
  • Support for Prometheus as a metrics store