Experiment Generation Tutorial

Learn how to generate an experiment so you can get up and running more quickly

Prerequisites

This example deploys the Voting Web App, which requires more resources than the quick start example, so you will need something larger than a typical minikube cluster. We recommend using a cluster with 4 nodes, 16 vCPUs (4 per node), and 32 GB of memory (8 GB per node).
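
As one example, on Google Kubernetes Engine a suitably sized cluster could be provisioned along these lines (the cluster name here is illustrative, and e2-standard-4 nodes provide 4 vCPUs and 16 GB each, which meets or exceeds the per-node recommendation):

$ gcloud container clusters create optimize-tutorial \
    --num-nodes=4 \
    --machine-type=e2-standard-4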

Experiment Creation

Guided walkthrough

Once the StormForge Optimize controller is installed into your cluster, it can scan your cluster resources and help generate the experiment file by identifying all of the tunable parameters (e.g., CPU, memory, and replicas) within your application.

If you use Locust or StormForge as your load test, you can simply start the experiment generation process by running the following command:

$ redskyctl run

You can follow along with the prompts to generate your experiment. You will be asked to do the following:

  1. Select a load test. When using a StormForge performance test, all you need to do is download and authenticate the Forge CLI. Once it is installed, the experiment generation flow will detect all of your load tests so you can select one. Alternatively, you can use a Locust load test by specifying the path where it is located.
  2. Select the Kubernetes namespace. Identify which namespaces your application resources are in. You may select multiple namespaces.
  3. Select specific application resources to be optimized. You can use label selectors to filter down which resources to scan within your selected namespaces. By default, all Deployments and StatefulSets in the selected namespaces are scanned.
  4. Determine the parameters to discover. Next, identify which application resources to use as parameters: CPU, memory, and replicas. Scanning for parameters allows the controller to generate the patches needed to update the application parameters during the optimize experiment lifecycle. It also detects the current values for these parameters to use as the baseline, and automatically sets the range within which the machine learning will test each parameter (see the sketch after this list).
  5. Select optimization objectives. This allows you to choose from a common set of metrics for optimization, such as cost and latency. When using the free tier, you can select up to two metrics to optimize for. We recommend choosing one cost and one performance metric to find the configuration that best balances performance and cost.
  6. Choose whether to run your experiment. At this point, you can immediately kick off your experiment, or you can exit and run it at a later time. Pressing Ctrl+T at this point will show the full experiment file that was generated.
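
For reference, the parameters section of a generated experiment file looks roughly like the sketch below. The schema shown (redskyops.dev/v1beta1) and the parameter names and ranges are illustrative assumptions; the authoritative file is whatever redskyctl run produces for your application:

apiVersion: redskyops.dev/v1beta1   # assumed API version, for illustration only
kind: Experiment
metadata:
  name: default-bd8fe4c2
spec:
  parameters:
    # hypothetical parameter discovered by scanning; baseline is the
    # current value, min/max bound what the machine learning will try
    - name: cpu
      baseline: 500
      min: 250
      max: 2000
    - name: replicas
      baseline: 2
      min: 1
      max: 5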

Congrats! Now that you have run your first experiment, visit StormForge to review the results. Want to use a different load test or custom metrics? Use the app.yaml file outlined in the next section.

Remove the Experiment

Once your experiment is finished, you can remove the resources generated in your cluster by running the following command, replacing default in application=default with the name of your application:

$ kubectl delete experiments -l redskyops.dev/application=default

followed by:

$ kubectl delete secrets,serviceaccounts,clusterroles,clusterrolebindings -l redskyops.dev/application=default

Note: The application name is autogenerated from the namespaces scanned and will be the prefix of the experiment name. For example, the application of the experiment named default-bd8fe4c2 is default.
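
To check which application each experiment belongs to before deleting anything, you can list the experiments with the label value shown as a column:

$ kubectl get experiments -L redskyops.dev/application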

Advanced

Application File

If you use a load test other than StormForge or Locust, you can start with a basic declarative application configuration file, as we have done in our Locust metrics example. Experiment generation will scan the manifests provided to produce an experiment file including the necessary patches for resources and replicas parameters, metric queries, and load test configuration. You can optionally customize your experiment file further after generating it.

To write your own application configuration file, you will need to define:

  • resources: The location of the manifests for the application you would like to tune. You can use in-cluster resources, Helm, files, directories, URLs, or Git repositories.
  • scenarios: The name and location of your load test file. You can use a StormForge Performance Test, Locust test, or a custom trial pod.
  • objectives: The metrics you want to optimize for. See GitHub for a full list of available Locust metrics and StormForge performance test metrics. Alternatively, you can use your own Prometheus or Datadog metrics and write custom queries. You can optionally define metric constraints and add non-optimized metrics.
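
As a rough illustration, an app.yaml combining these three sections might look like the sketch below. The apiVersion, resource path, scenario, and objective names are assumptions for illustration only; the commented template generated by the command that follows is the authoritative reference:

apiVersion: redskyops.dev/v1alpha1   # assumed API version; check the generated template
kind: Application
metadata:
  name: voting-webapp
resources:
  # manifests to scan: in-cluster resources, Helm charts, files, URLs, or Git repos
  - github.com/thestormforge/examples   # hypothetical Git location
scenarios:
  # the load test to run for each trial
  - name: 100-clients
    locust:
      locustfile: locustfile.py   # hypothetical path to the Locust test
objectives:
  # metrics to optimize for
  - name: cost
  - name: p95-latency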

Read more about the available fields in the application reference, or generate a fully commented app.yaml template that shows how to use all of the above options with the following command:

$ redskyctl generate application

Generate an Experiment File

After cloning the thestormforge/examples repo, we can generate the experiment file for our Locust metrics example and save it locally by running:

$ redskyctl generate experiment -f app.yaml > experiment.yaml

Next, generate and apply the RBAC permissions to allow the controller to patch the deployment:

$ redskyctl generate rbac -f experiment.yaml | kubectl apply -f -
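
The generated RBAC is essentially a role granting the controller permission to patch the tunable resources, bound to its service account. A simplified sketch of the role portion (the name and exact scope here are assumptions) looks like:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: redsky-patching-role   # hypothetical name; the real output may differ
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "patch"]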

Finally, run your experiment by applying the voting web app resources and experiment file to your cluster:

$ kubectl apply -f <(kustomize build) -f experiment.yaml

Export a Configuration

After your experiment completes, you can export the manifests patched with the parameter values of your chosen configuration. Select the trial you would like to export and run the following command, replacing votingapp-100-clients-020 with the name of that trial:

$ redskyctl export -f app.yaml votingapp-100-clients-020
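
Because the export output is ordinary Kubernetes manifests, you can also apply the chosen configuration directly to your cluster by piping it to kubectl:

$ redskyctl export -f app.yaml votingapp-100-clients-020 | kubectl apply -f -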

Questions? Reach out to us via Slack or email.

Last modified May 3, 2021