Advanced Experiments
Prerequisites
- Kubernetes Cluster
- kubectl properly configured
- stormforge CLI Tool
- StormForge Optimize Pro Controller installed
This example deploys Elasticsearch and requires more resources than the Quick Start example, so you will need something larger than a typical minikube cluster. A four-node cluster with 32 total vCPUs (8 on each node) and 64GB total memory (16GB on each node) is typically sufficient.
Experiment Lifecycle
Creating a StormForge Optimize Pro experiment stores the experiment state in your cluster. When using the platform, the experiment definition is also synchronized to the Optimize Pro API for access to the machine learning capabilities. No additional objects are created until trial assignments have been suggested (either manually or using the API — see the next section on adding manual trials).
After assignments have been suggested, a trial run will start generating workloads for your cluster. The creation of a trial object populated with assignments will initiate the following work:
- If the experiment contains setup tasks, a new job will be created for that work.
- The patches defined in the experiment are applied to the cluster.
- The status of all patched objects is monitored; the trial run waits for them to stabilize.
- The trial job specified in the experiment is created (the default behavior simply executes a timed sleep).
- Upon completion of the trial job, metric values are collected.
- If the experiment contains setup tasks, another job will be created to clean up the state created by the initial setup task job.
Tutorial Manifests
You can find the manifests for this tutorial in the elasticsearch directory of the examples repository.
service-account.yaml
- This experiment will use StormForge Optimize Pro “setup tasks”. Setup tasks are a simplified way to apply bulk state changes to a cluster (e.g., installing and uninstalling an application or its components) before and after a trial run. To use setup tasks, we will create a separate service account with the additional privileges necessary to make these modifications.
For this example, the RBAC manifests have already been generated and are provided for you in this file, so you don’t need to generate them. Typically, you generate them by running stormforge rbac <EXPERIMENT_FILE>.yaml > <EXPERIMENT_FILE>-rbac.yaml and applying that file when you run your experiment.
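For experiments where the RBAC manifests are not provided, the generate-and-apply step described above might be wrapped as follows. This is a minimal sketch: generate_rbac is a hypothetical helper name, and it assumes the experiment file name ends in .yaml and that both stormforge and kubectl are on your PATH.

```shell
# Sketch: generate RBAC manifests for an experiment file and apply them.
# generate_rbac is a hypothetical helper, not part of the stormforge CLI.
generate_rbac() {
  experiment_file="$1"
  # Derive <EXPERIMENT_FILE>-rbac.yaml from <EXPERIMENT_FILE>.yaml.
  rbac_file="${experiment_file%.yaml}-rbac.yaml"
  # Write the generated RBAC manifests next to the experiment file.
  stormforge rbac "$experiment_file" > "$rbac_file" || return 1
  # Apply the generated manifests to the current cluster context.
  kubectl apply -f "$rbac_file"
}
```

Calling generate_rbac experiment.yaml would write experiment-rbac.yaml and then apply it.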
experiment.yaml
- The actual experiment object manifest. This includes the definition of the experiment itself (in terms of assignable parameters and observable metrics), updated cluster role-based access control (RBAC), and the instructions for carrying out the experiment (in terms of patches and metric queries). Consider editing the parameter ranges and changing the experiment name to avoid conflicting with other experiments in the cluster.
rally-config.yaml
- This experiment uses rally to test Elasticsearch. This file contains the configuration for rally.
Running the Experiment
You’ll need to apply the manifests listed above for our experiment.
$ kubectl apply -f https://raw.githubusercontent.com/thestormforge/examples/main/elasticsearch/service-account.yaml
serviceaccount/stormforge created
clusterrolebinding.rbac.authorization.k8s.io/stormforge-cluster-admin created
$ kubectl apply -f https://raw.githubusercontent.com/thestormforge/examples/main/elasticsearch/rally-config.yaml
configmap/rally-ini created
$ kubectl apply -f https://raw.githubusercontent.com/thestormforge/examples/main/elasticsearch/experiment.yaml
experiment.optimize.stormforge.io/elasticsearch-example created
Verify all resources are present:
$ kubectl get experiment,sa,cm
NAME STATUS
experiment.optimize.stormforge.io/elasticsearch-example Running
NAME SECRETS AGE
serviceaccount/default 1 4h7m
serviceaccount/stormforge 1 36s
NAME DATA AGE
configmap/rally-ini 1 23s
As soon as the experiment is created, StormForge machine learning will begin creating and running trials automatically. You can view trial status by searching for trial objects:
$ kubectl get trial -l stormforge.io/experiment=elasticsearch-example
NAME STATUS ASSIGNMENTS VALUES
elasticsearch-example-kzzph Setting up memory=1500, cpu=750, replicas=3, heap_percent=50
Monitoring the Experiment
Both experiments and trials are created as custom Kubernetes objects.
You can see a summary of the objects using kubectl get trials,experiments. On compatible clusters, trial objects will also display their parameter assignments and (upon completion) observed values.
The experiment objects themselves are not modified over the course of a trial run; once created, they represent a generally static state.
Trial objects will undergo a number of state progressions over the course of a trial run.
These progressions can be monitored by watching the “status” portion of the trial object (for example, when viewing kubectl get trials -o yaml <TRIAL NAME>).
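As a rough sketch of how these state progressions could be followed from a terminal, the helpers below wrap two standard kubectl invocations. The function names are hypothetical; the label key is the one used earlier in this tutorial, and the jsonpath expression only assumes that the trial object has a top-level status field.

```shell
# Sketch: follow trial state changes for one experiment.
# watch_trials and show_trial_status are hypothetical helper names.
watch_trials() {
  # $1 is the experiment name; --watch streams updates as trials change state.
  kubectl get trials -l "stormforge.io/experiment=$1" --watch
}

show_trial_status() {
  # $1 is the trial name; prints only the "status" portion of the object.
  kubectl get trial "$1" -o jsonpath='{.status}'
}
```

For example, watch_trials elasticsearch-example streams one line per state change until interrupted.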
The trial object will also own several (one to three) job objects depending on the experiment. Those jobs will be labeled using the trial name (for example, trial=<name>) and are typically named using the trial name as a prefix.
The -create and -delete suffixes on job names indicate setup tasks (also labeled with role=trialSetup).
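The labels described above can be used as selectors to find the jobs belonging to a trial. The following is a minimal sketch with hypothetical helper names; it assumes the jobs live in the current namespace.

```shell
# Sketch: list the jobs owned by a trial using the labels described above.
# trial_jobs and trial_setup_jobs are hypothetical helper names.
trial_jobs() {
  # $1 is the trial name; selects every job labeled trial=<name>.
  kubectl get jobs -l "trial=$1"
}

trial_setup_jobs() {
  # Narrows the selection to setup task jobs (the -create and -delete jobs).
  kubectl get jobs -l "trial=$1,role=trialSetup"
}
```

For example, trial_setup_jobs elasticsearch-example-kzzph would list only the setup task jobs for that trial.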
Collecting Experiment Output
Once an experiment is underway and some trials have completed, you can get the trial results via kubectl:
kubectl get trials -l stormforge.io/experiment=elasticsearch-example
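If you want to keep the results for later analysis, one option is to save the full trial objects as JSON using kubectl's standard -o json output. This is only a sketch: export_trials is a hypothetical helper name, and the output path is whatever you choose.

```shell
# Sketch: save all trial objects for an experiment to a JSON file.
# export_trials is a hypothetical helper name.
export_trials() {
  # $1 is the experiment name, $2 is the output file path.
  kubectl get trials -l "stormforge.io/experiment=$1" -o json > "$2"
}
```

For example, export_trials elasticsearch-example trials.json writes the current trial list to trials.json.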
Rerunning the Experiment
After a trial run is complete, a new trial will be generated automatically until the number of trials created reaches the configured experimentBudget.