Load Injection (Load Scenarios)

MockServer can drive outbound API traffic on demand using Load Scenarios — a declarative, bounded load generator built into the server. A load scenario is an ordered list of templated request steps driven through a sequence of stages (a load profile), with per-iteration variable data. Each stage either holds or ramps the number of concurrent virtual users (VU, closed model), holds or ramps an arrival rate in iterations per second (RATE, open model), or pauses. Results feed both the metrics histograms and the SLO sample store so a load run can be asserted with PUT /mockserver/verifySLO.

This makes MockServer useful for resilience verification: inject a chaos fault (see Chaos Testing), apply load, then assert that your SLOs held — all from a single control plane, with no external load tool required.

Registry model: load, then trigger. Scenarios are organised as a registry of named scenarios. You first load (register) a scenario by name with PUT /mockserver/loadScenario, which does not run it, then trigger one or many by name with PUT /mockserver/loadScenario/start to run them concurrently, each with its own optional start delay. Loading is always allowed, but triggering a run is off by default: start returns 403 Forbidden until loadGenerationEnabled=true. Hard caps on concurrent scenarios, virtual users, in-flight requests, RPS, duration, and step count keep the load generator from overloading MockServer itself. Scenarios can be preloaded at startup from a JSON file via mockserver.loadScenarioInitializationJsonPath.

Quickstart

Enable load generation, load a 10-VU ramp-then-hold scenario, then trigger it:

# 1. Enable load generation (set once at startup or via the config endpoint)
docker run -d --rm -p 1080:1080 -e MOCKSERVER_LOAD_GENERATION_ENABLED=true mockserver/mockserver

# 2. LOAD (register) a scenario — this does NOT run it
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "checkout-load",
    "templateType": "VELOCITY",
    "startDelayMillis": 0,
    "profile": {
      "stages": [
        { "type": "VU", "startVus": 1, "endVus": 10, "durationMillis": 30000, "curve": "LINEAR" },
        { "type": "VU", "vus": 10, "durationMillis": 60000 }
      ]
    },
    "steps": [
      {
        "request": {
          "method": "GET",
          "path": "/api/orders/$iteration.index",
          "headers": { "Host": ["orders.svc:8080"] },
          "socketAddress": { "host": "orders.svc", "port": 8080, "scheme": "HTTP" }
        }
      }
    ]
  }'

# 3. TRIGGER it to run (one or many names at once)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
  -H "Content-Type: application/json" \
  -d '{ "name": "checkout-load" }'

# 4. Poll progress (one scenario, or list all)
curl -s http://localhost:1080/mockserver/loadScenario/checkout-load
curl -s http://localhost:1080/mockserver/loadScenario

# 5. Stop it (stays registered, STOPPED — can be re-triggered)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/stop \
  -d '{ "name": "checkout-load" }'
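
Step 4 of the quickstart can be automated with a small polling loop. The sketch below uses only the Python standard library. It assumes the single-scenario response is a JSON object with a top-level `state` field holding one of the states listed under Control-Plane API; the helper names `fetch_state` and `wait_until_finished` are illustrative and not part of any MockServer client.

```python
import json
import time
import urllib.request

# States after which a scenario is no longer generating load.
TERMINAL_STATES = {"COMPLETED", "STOPPED"}


def fetch_state(base_url, name):
    """GET /mockserver/loadScenario/{name} and return its state (assumed top-level field)."""
    with urllib.request.urlopen(f"{base_url}/mockserver/loadScenario/{name}") as response:
        return json.load(response)["state"]


def wait_until_finished(get_state, interval_seconds=1.0, max_polls=120, sleep=time.sleep):
    """Poll get_state() until a terminal state is seen; return the last state observed."""
    state = None
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            break
        sleep(interval_seconds)
    return state
```

A typical call would be `wait_until_finished(lambda: fetch_state("http://localhost:1080", "checkout-load"))`. Passing the state reader as a function keeps the loop testable without a running server.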

Control-Plane API

All endpoints are control-plane endpoints — they respect control-plane authentication (mTLS / JWT) when configured. The flow is load (register) → trigger (run) by name.

Verb	Path	Behaviour
`PUT`	`/mockserver/loadScenario`	Load (register) a scenario by `name` — it is staged in the `LOADED` state but does not run. Allowed even when `loadGenerationEnabled=false` (no traffic). `400` with a JSON error when invalid or a cap is exceeded; `200 {status:"loaded", name, state:"LOADED"}` otherwise. Loading the same name replaces the prior definition.
`GET`	`/mockserver/loadScenario`	List all registered scenarios: `{ scenarios: [ { name, state, startDelayMillis, definition, …live status fields } ] }`. `state` ∈ `LOADED` / `PENDING` / `RUNNING` / `COMPLETED` / `STOPPED`. Live fields (when active/run): `stageIndex`, `stageType`, `currentTarget`, `currentVus`, `requestsSent`, `succeeded`, `failed`, `p50/p95/p99Millis`, `runId`, `startedAt`, `endedAt`.
`GET`	`/mockserver/loadScenario/{name}`	Return one registered scenario (definition + state + status). `404` if not registered.
`PUT`	`/mockserver/loadScenario/start`	Trigger one or more registered scenarios to run concurrently. Body `{"names":["a","b"]}` or `{"name":"a"}`. Each honours its own `startDelayMillis`. Requires `loadGenerationEnabled=true` (else `403`); `404` if a name isn't registered; `400` if it would exceed `loadGenerationMaxConcurrentScenarios`. Returns the triggered names and resulting states (`PENDING` / `RUNNING`).
`PUT`	`/mockserver/loadScenario/stop`	Stop running scenario(s). Body `{"names":[…]}`, `{"all":true}`, or empty (stop all). Stopped scenarios stay registered (`STOPPED`) and can be re-triggered.
`DELETE`	`/mockserver/loadScenario/{name}`	Remove one scenario from the registry (stops it first if running).
`DELETE`	`/mockserver/loadScenario`	Clear the whole registry (stops all running). Idempotent.
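
As a rough illustration of how the listing might be consumed, the sketch below reduces a GET /mockserver/loadScenario body to a state and failure ratio per scenario. It assumes the live fields (`succeeded`, `failed`) sit directly on each scenario object and are absent until a run has started; the sample body is hypothetical.

```python
def failure_ratio(status):
    """Fraction of completed requests that failed, or 0.0 before any have completed."""
    completed = status.get("succeeded", 0) + status.get("failed", 0)
    return status.get("failed", 0) / completed if completed else 0.0


def summarise(listing):
    """Map scenario name -> (state, failure ratio) for a scenario listing body."""
    return {s["name"]: (s["state"], failure_ratio(s)) for s in listing.get("scenarios", [])}


# Hypothetical response body, shaped like the listing described in the table above.
sample = {"scenarios": [
    {"name": "checkout-load", "state": "RUNNING", "requestsSent": 200, "succeeded": 190, "failed": 10},
    {"name": "background-poller", "state": "LOADED"},
]}
print(summarise(sample))
```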

Trigger several at once — each begins at its own offset via startDelayMillis:

curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
  -d '{ "names": ["checkout-load", "background-poller"] }'
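
The two body shapes accepted by the start endpoint, and the error codes listed for it, can be wrapped in a pair of small helpers. This is a sketch with illustrative names; it assumes any 2xx status means the scenarios were triggered, which the table above does not state explicitly.

```python
import json


def start_body(*names):
    """Build the JSON body for PUT /mockserver/loadScenario/start."""
    if not names:
        raise ValueError("at least one scenario name is required")
    payload = {"name": names[0]} if len(names) == 1 else {"names": list(names)}
    return json.dumps(payload)


# Reasons for the error statuses documented for the start endpoint.
START_ERRORS = {
    400: "would exceed loadGenerationMaxConcurrentScenarios",
    403: "load generation is disabled (loadGenerationEnabled=false)",
    404: "a named scenario is not registered",
}


def explain_start_status(code):
    """Translate a start response status into a short human-readable reason."""
    if 200 <= code < 300:  # assumption: success is reported with a 2xx status
        return "triggered"
    return START_ERRORS.get(code, f"unexpected status {code}")
```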

Load Scenario Model

A load scenario is a JSON object with the following top-level fields:

Field	Type	Required	Description
`name`	string	Yes	The scenario name — the unique registry key. Loading the same name replaces the prior definition. Appears in status, metric labels, and logs.
`startDelayMillis`	integer	No	Delay in milliseconds, applied after the scenario is triggered, before its iterations begin (default `0`). A positive value means the scenario is `PENDING` until the delay elapses, then `RUNNING`. Lets several triggered scenarios begin at staggered offsets.
`steps`	array	Yes	Ordered list of request steps (see Steps below). Maximum 50 steps.
`profile`	object	Yes	The load profile — an ordered list of `stages` run in sequence (see Load Profile below).
`templateType`	string	No	Template engine for rendering step fields: `VELOCITY` (default) or `MUSTACHE`. `JAVASCRIPT` is not supported for load steps and is rejected with a `400`.
`maxRequests`	integer	No	Stop the scenario once this many requests have been dispatched, even if the duration has not elapsed. Useful for budget-bounded runs.
`labels`	object	No	Scenario-level custom metric labels (string key/value pairs). Keys must be listed in `mockserver.loadGenerationMetricLabels` to appear in Prometheus; all keys appear in OpenTelemetry automatically. See Observability & Metrics.
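
Because an invalid scenario is rejected with a 400, it can be convenient to check the top-level rules before sending it. The sketch below mirrors only the constraints stated in the table above (required fields, the 50-step maximum, supported template types); it is not the server's validator, and `top_level_problems` is an illustrative name.

```python
SUPPORTED_TEMPLATE_TYPES = {"VELOCITY", "MUSTACHE"}
MAX_STEPS = 50


def top_level_problems(scenario):
    """Return a list of problems found in a scenario dict (empty list means none found)."""
    problems = []
    if not scenario.get("name"):
        problems.append("name is required")
    steps = scenario.get("steps") or []
    if not steps:
        problems.append("steps is required")
    elif len(steps) > MAX_STEPS:
        problems.append(f"too many steps: {len(steps)} > {MAX_STEPS}")
    if not scenario.get("profile"):
        problems.append("profile is required")
    template_type = scenario.get("templateType", "VELOCITY")
    if template_type not in SUPPORTED_TEMPLATE_TYPES:
        problems.append(f"unsupported templateType: {template_type}")
    return problems
```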

Steps

Each step defines a request to fire and an optional pause after it:

Field	Type	Description
`request`	HttpRequest	The request to fire. Reuses the same `HttpRequest` model as expectations. Template expressions live in the `path` and `body` fields. Set `socketAddress` to point at the target service.
`thinkTime`	Delay	Optional pause between this step and the next (a `{ "timeUnit": "MILLISECONDS", "value": 100 }` object). Implemented with a non-blocking scheduler — no thread is blocked during think time.
`name`	string	Optional step name. When set, it is used as the `route` metric label for this step instead of the auto-templatized request path. Use this to group requests to multiple paths under one label, or to give a step a human-readable name in metric charts.
`labels`	object	Optional step-level custom metric labels (string key/value pairs). Override the scenario-level `labels` for this step. Same Prometheus allowlist requirement applies.
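
Think time puts a floor under how long one iteration can take, which matters when sizing a profile. The sketch below sums the think time of a step list. It assumes unit names follow Java's `TimeUnit` constants and that every step's pause is applied, including the last one; both are assumptions, not statements from this page.

```python
# Milliseconds per unit; unit names are assumed to follow java.util.concurrent.TimeUnit.
MILLIS_PER_UNIT = {"MILLISECONDS": 1, "SECONDS": 1_000, "MINUTES": 60_000}


def think_time_millis(step):
    """Think time of one step in milliseconds (0 when the step has no thinkTime)."""
    delay = step.get("thinkTime")
    if not delay:
        return 0
    return delay["value"] * MILLIS_PER_UNIT[delay["timeUnit"]]


def iteration_think_millis(steps):
    """Total think time of one pass through all steps."""
    return sum(think_time_millis(step) for step in steps)
```

The result excludes response times, so a real iteration takes at least this long.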

Load Profile

The profile is an ordered list of stages (maximum 20) run one after another. The total run length is the sum of the stage durations. Each stage is one of three types:

VU (closed model) — hold or ramp the number of concurrent virtual users. Each VU loops the steps back-to-back; throughput is whatever the target can sustain. Answers "how does my service behave with N concurrent clients?"
RATE (open model) — hold or ramp an arrival rate in iterations per second. MockServer starts new iterations on schedule (auto-scaling the virtual-user pool to run them) regardless of how fast the target responds. Answers "how does my service behave at R requests/second?" — this is the model that exposes queue build-up and tail latency.
PAUSE — drive no load for the duration (virtual users drain). Use it to separate stages, e.g. let the target recover between a spike and a soak.
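
The difference between the two models shows up in a little arithmetic. In the closed model, Little's law gives throughput as the number of virtual users divided by the time one iteration takes, so throughput falls when the target slows down. The sketch below is a worked example with made-up iteration times, not a description of MockServer internals.

```python
def closed_model_rate(vus, iteration_seconds):
    """Iterations per second sustained by `vus` users looping back-to-back."""
    return vus / iteration_seconds


# Ten virtual users against a target that slows from 0.25 s to 0.5 s per iteration.
healthy = closed_model_rate(10, 0.25)   # 40.0 iterations/second
degraded = closed_model_rate(10, 0.5)   # 20.0 iterations/second
print(healthy, degraded)
```

A RATE stage would instead keep starting iterations at the configured rate while the target degrades, which is why it exposes queue build-up that a VU stage can hide.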

A stage holds a value (set vus / rate) or ramps between two values (set startVus+endVus / startRate+endRate). A ramp follows a curve:

LINEAR (default) — straight line.
QUADRATIC — ease-in: slow at first, then accelerating.
EXPONENTIAL — a steeper ease-in (handles ramps starting from zero correctly).
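
To get a feel for the three shapes, the sketch below interpolates a ramp using common easing formulas. The exact formulas MockServer uses are not given here, so these are assumptions chosen only to match the descriptions above; in particular the `steepness` parameter and the normalised exponential (which stays well defined when the ramp starts at zero) are illustrative.

```python
import math


def ramp_value(start, end, fraction, curve="LINEAR", steepness=4.0):
    """Interpolate between start and end at `fraction` (0.0 to 1.0) of the stage."""
    t = min(max(fraction, 0.0), 1.0)
    if curve == "LINEAR":
        eased = t
    elif curve == "QUADRATIC":
        eased = t * t
    elif curve == "EXPONENTIAL":
        # Normalised so eased runs from 0 to 1; works even when start is 0.
        eased = math.expm1(steepness * t) / math.expm1(steepness)
    else:
        raise ValueError(f"unknown curve: {curve}")
    return start + (end - start) * eased


# Halfway through a 0 -> 10 ramp, each curve has reached a different level.
for curve in ("LINEAR", "QUADRATIC", "EXPONENTIAL"):
    print(curve, round(ramp_value(0, 10, 0.5, curve), 2))
```

Under these assumed formulas the exponential curve is the lowest of the three at the midpoint, so most of its increase lands late in the stage.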

Field	Type	Description
`type`	string	`VU`, `RATE` or `PAUSE`.
`durationMillis`	integer	How long this stage runs (milliseconds, > 0). The total across all stages must not exceed 3 600 000 ms (1 hour).
`curve`	string	Ramp shape for a ramping stage: `LINEAR` (default), `QUADRATIC` or `EXPONENTIAL`. Ignored for holds and pauses.
`vus`	integer	`VU` hold — number of concurrent virtual users to hold. Maximum 50.
`startVus` / `endVus`	integer	`VU` ramp — virtual users at the start and end of the ramp. Maximum 50.
`rate`	number	`RATE` hold — arrival rate in iterations per second to hold. Maximum 5000.
`startRate` / `endRate`	number	`RATE` ramp — arrival rate (iterations/second) at the start and end of the ramp. Maximum 5000.
`maxVus`	integer	`RATE` only — optional cap on the auto-scaling virtual-user pool used to run the started iterations (defaults to the global virtual-user cap of 50). If the rate cannot be met within this cap, the shortfall is recorded as a `rate_limit` throttle.
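
The stage limits above lend themselves to a pre-flight check. The sketch below applies the maxima quoted on this page (20 stages, 1 hour total, 50 virtual users, 5000 iterations/second). It is not the server's own validation and the function name is illustrative.

```python
MAX_STAGES = 20
MAX_TOTAL_MILLIS = 3_600_000
# Per stage type: the fields that carry a target value, and the documented maximum.
LIMITS = {
    "VU": (("vus", "startVus", "endVus"), 50),
    "RATE": (("rate", "startRate", "endRate"), 5000),
}


def profile_problems(profile):
    """Return the documented limits that a profile dict violates (empty list if none)."""
    stages = profile.get("stages", [])
    problems = []
    if len(stages) > MAX_STAGES:
        problems.append(f"{len(stages)} stages exceeds the maximum of {MAX_STAGES}")
    total_millis = 0
    for index, stage in enumerate(stages):
        duration = stage.get("durationMillis", 0)
        if duration <= 0:
            problems.append(f"stage {index}: durationMillis must be > 0")
        total_millis += max(duration, 0)
        kind = stage.get("type")
        if kind == "PAUSE":
            continue
        if kind not in LIMITS:
            problems.append(f"stage {index}: unknown type {kind!r}")
            continue
        fields, limit = LIMITS[kind]
        values = [stage[field] for field in fields if field in stage]
        if not values:
            problems.append(f"stage {index}: {kind} stage sets no target value")
        elif max(values) > limit:
            problems.append(f"stage {index}: {max(values)} exceeds the {kind} maximum of {limit}")
    if total_millis > MAX_TOTAL_MILLIS:
        problems.append(f"total duration {total_millis} ms exceeds {MAX_TOTAL_MILLIS} ms")
    return problems
```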

Per-Iteration Template Variables

Each iteration gets a fresh iteration context injected alongside the standard request variable. Use it to vary data across iterations and virtual users without external data files:

Variable	Meaning	Velocity
`index`	Global iteration index across all VUs (0-based)	`$iteration.index`
`vuId`	ID of the virtual user running this iteration (0-based)	`$iteration.vuId`
`vuIteration`	Iteration count within this specific VU (0-based)	`$iteration.vuIteration`
`elapsedMillis`	Milliseconds since the scenario started	`$iteration.elapsedMillis`
`count`	Total requests dispatched so far (across all VUs)	`$iteration.count`

What fields are rendered. In v1, only the request path and body fields are rendered through the template engine. These are the most commonly templated fields — use them to vary the URL path or the request payload across iterations.
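
To make the substitution concrete, the sketch below mimics how an iteration reference in a path or body resolves. It is a regex illustration only, not a Velocity engine, and `render` is a hypothetical helper. It also shows the difference between `$iteration.index` and the quiet form `$!iteration.index` used in the examples: in Velocity, an unresolved quiet reference renders as an empty string rather than as its literal text.

```python
import re

# Matches $iteration.name and the quiet form $!iteration.name.
REFERENCE = re.compile(r"\$(!?)iteration\.(\w+)")


def render(template, iteration):
    """Substitute iteration variables into a path or body template (illustration only)."""
    def replace(match):
        quiet, name = match.group(1), match.group(2)
        if name in iteration:
            return str(iteration[name])
        # An unresolved reference stays as written, or becomes empty in quiet form.
        return "" if quiet else match.group(0)
    return REFERENCE.sub(replace, template)


# Three consecutive iterations produce three different paths.
for index in range(3):
    print(render("/api/orders/$iteration.index", {"index": index, "vuId": 0}))
```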

Examples

The examples below drive load scenarios through MockServer using each client library and the plain REST API, one block per language. They all follow the same registry workflow: register → start → read live status → stop. Load generation must be enabled (loadGenerationEnabled=true): registering is always allowed, but starting a run returns 403 when it is off.

A realistic multi-stage scenario: a linear RATE ramp (5 → 50 iterations/s, capped at 50 virtual users), then a 25-VU hold, then a PAUSE. Two Velocity-templated steps drive each iteration, startDelayMillis defers load briefly after start, and custom labels tag the metric series. The full lifecycle is exercised: register (does not run), start, list / read live status, stop, then clear the registry.
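
Before running a profile like this one, it helps to estimate how long it lasts and how many requests the ramp alone will send. The sketch below does that arithmetic. It assumes the rate ramp is actually met (the virtual-user cap is not the bottleneck) and that every iteration runs both steps; the helper names are illustrative.

```python
def profile_millis(stages, start_delay_millis=0):
    """Wall-clock length of a run: the start delay plus every stage duration."""
    return start_delay_millis + sum(stage["durationMillis"] for stage in stages)


def linear_ramp_iterations(start_rate, end_rate, duration_millis):
    """Iterations started by a LINEAR rate ramp: the area under the rate line."""
    return (start_rate + end_rate) / 2 * duration_millis / 1000


stages = [
    {"type": "RATE", "startRate": 5, "endRate": 50, "durationMillis": 30000},
    {"type": "VU", "vus": 25, "durationMillis": 60000},
    {"type": "PAUSE", "durationMillis": 10000},
]
total_millis = profile_millis(stages, start_delay_millis=500)   # 100500 ms
ramp_iterations = linear_ramp_iterations(5, 50, 30000)          # 825.0 iterations
ramp_requests = ramp_iterations * 2                             # two steps per iteration
print(total_millis, ramp_iterations, ramp_requests)
```

The ramp therefore contributes about 1650 requests, far below the maxRequests budget of 100000 used in the example.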

Java

import org.mockserver.client.MockServerClient;
import org.mockserver.load.*;
import org.mockserver.model.Delay;
import org.mockserver.model.HttpTemplate;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import static org.mockserver.model.HttpRequest.request;

MockServerClient client = new MockServerClient("localhost", 1080);

LoadScenario scenario = LoadScenario.loadScenario("checkout-load")
    .withTemplateType(HttpTemplate.TemplateType.VELOCITY)
    .withMaxRequests(100000)
    .withStartDelayMillis(500)
    .withLabels(Map.of("team", "payments", "env", "staging"))
    .withProfile(LoadProfile.of(
        LoadStage.rampRate(5, 50, 30000, RampCurve.LINEAR).withMaxVus(50),
        LoadStage.constantVus(25, 60000),
        LoadStage.pause(10000)
    ))
    .withSteps(
        LoadStep.loadStep(request().withMethod("GET").withPath("/products/$!iteration.index"))
            .withName("browse")
            .withThinkTime(new Delay(TimeUnit.MILLISECONDS, 500)),
        LoadStep.loadStep(request().withMethod("POST").withPath("/cart/checkout")
                .withBody("{\"item\":\"$!iteration.index\",\"qty\":1}"))
            .withName("checkout")
            .withLabels(Map.of("critical", "true"))
    );

client.loadScenario(scenario);                  // 1. register (does NOT start it yet)
client.startLoadScenarios("checkout-load");     // 2. start (requires loadGenerationEnabled=true)
String listing = client.loadScenarios();        // 3. list all registered scenarios
String status = client.getLoadScenario("checkout-load"); // live throughput / latency status
client.stopLoadScenarios("checkout-load");      // 4. stop (no args stops ALL running scenarios)
client.clearLoadScenarios();                     //    tidy up the registry

JavaScript

var mockServerClient = require('mockserver-client').mockServerClient;
var client = mockServerClient("localhost", 1080);

// The per-step field is `request` (a full HttpRequest), not `httpRequest`.
var scenario = {
    name: 'checkout-load',
    templateType: 'VELOCITY',
    maxRequests: 100000,
    startDelayMillis: 500,
    labels: { team: 'payments', env: 'staging' },
    profile: {
        stages: [
            { type: 'RATE', startRate: 5, endRate: 50, durationMillis: 30000, curve: 'LINEAR', maxVus: 50 },
            { type: 'VU', vus: 25, durationMillis: 60000 },
            { type: 'PAUSE', durationMillis: 10000 }
        ]
    },
    steps: [
        { name: 'browse', request: { method: 'GET', path: '/products/$!iteration.index' },
          thinkTime: { timeUnit: 'MILLISECONDS', value: 500 } },
        { name: 'checkout', labels: { critical: 'true' },
          request: { method: 'POST', path: '/cart/checkout',
                     headers: { 'Content-Type': ['application/json'] },
                     body: '{"item":"$!iteration.index","qty":1}' } }
    ]
};

(async function () {
    await client.loadScenario(scenario);                 // 1. register (does NOT start it yet)
    await client.startLoadScenarios('checkout-load');    // 2. start (requires loadGenerationEnabled=true)
    var listing = await client.loadScenarios();          // 3. list all registered scenarios
    var status = await client.getLoadScenario('checkout-load'); // live status
    await client.stopLoadScenarios('checkout-load');     // 4. stop (no arg stops ALL running scenarios)
    await client.clearLoadScenarios();                    //    tidy up the registry
})();

Python

from mockserver import (Delay, HttpRequest, LoadProfile, LoadScenario,
                        LoadStage, LoadStep, MockServerClient)

scenario = LoadScenario(
    name="checkout-load",
    template_type="VELOCITY",
    max_requests=100000,
    start_delay_millis=500,
    labels={"team": "payments", "env": "staging"},
    profile=LoadProfile(stages=[
        LoadStage.rate_stage(30000, start_rate=5, end_rate=50, max_vus=50, curve="LINEAR"),
        LoadStage.vu_stage(60000, vus=25),
        LoadStage.pause_stage(10000),
    ]),
    steps=[
        LoadStep(name="browse",
                 request=HttpRequest(method="GET", path="/products/$!iteration.index"),
                 think_time=Delay(time_unit="MILLISECONDS", value=500)),
        LoadStep(name="checkout", labels={"critical": "true"},
                 request=HttpRequest(method="POST", path="/cart/checkout",
                                     body='{"item":"$!iteration.index","qty":1}')),
    ],
)

with MockServerClient("localhost", 1080) as client:
    client.load_scenario(scenario)                # 1. register (does NOT start it yet)
    client.start_load_scenarios("checkout-load")  # 2. start (requires loadGenerationEnabled=true)
    listing = client.load_scenarios()             # 3. list all registered scenarios
    status = client.get_load_scenario("checkout-load")  # live status
    client.stop_load_scenarios("checkout-load")   # 4. stop (None stops ALL running scenarios)
    client.clear_load_scenarios()                  #    tidy up the registry

Ruby

require 'mockserver-client'
include MockServer

client = Client.new('localhost', 1080)

scenario = LoadScenario.new(
  name: 'checkout-load',
  template_type: 'VELOCITY',
  max_requests: 100_000,
  start_delay_millis: 500,
  labels: { 'team' => 'payments', 'env' => 'staging' },
  profile: LoadProfile.new(stages: [
    LoadStage.rate(30_000, start_rate: 5, end_rate: 50, max_vus: 50, curve: 'LINEAR'),
    LoadStage.vu(60_000, vus: 25),
    LoadStage.pause(10_000)
  ]),
  steps: [
    LoadStep.new(name: 'browse',
                 request: HttpRequest.new(method: 'GET', path: '/products/$!iteration.index'),
                 think_time: Delay.new(time_unit: 'MILLISECONDS', value: 500)),
    LoadStep.new(name: 'checkout', labels: { 'critical' => 'true' },
                 request: HttpRequest.new(method: 'POST', path: '/cart/checkout',
                                          body: '{"item":"$!iteration.index","qty":1}'))
  ]
)

client.load_scenario(scenario)               # 1. register (does NOT start it yet)
client.start_load_scenarios('checkout-load') # 2. start (requires loadGenerationEnabled=true)
client.load_scenarios                        # 3. list all registered scenarios
client.get_load_scenario('checkout-load')    # live status
client.stop_load_scenarios('checkout-load')  # 4. stop (nil stops ALL running scenarios)
client.clear_load_scenarios                  #    tidy up the registry
client.close

Go

package main

import (
    mockserver "github.com/mock-server/mockserver-monorepo/mockserver-client-go"
)

func main() {
    client := mockserver.New("localhost", 1080)

    browse := mockserver.Request().Method("GET").Path("/products/$!iteration.index").Build()
    checkout := mockserver.Request().Method("POST").Path("/cart/checkout").
        Body(`{"item":"$!iteration.index","qty":1}`).Build()

    // MaxVus is an optional *int field on a RATE stage.
    rampStage := mockserver.RampRateStage(5, 50, 30000, mockserver.RampLinear)
    maxVus := 50
    rampStage.MaxVus = &maxVus

    scenario := mockserver.LoadScenario{
        Name:             "checkout-load",
        TemplateType:     "VELOCITY",
        MaxRequests:      100000,
        StartDelayMillis: 500,
        Labels:           map[string]string{"team": "payments", "env": "staging"},
        Profile: &mockserver.LoadProfile{
            Stages: []mockserver.LoadStage{
                rampStage,
                mockserver.ConstantVusStage(25, 60000),
                mockserver.PauseStage(10000),
            },
        },
        Steps: []mockserver.LoadStep{
            {Name: "browse", Request: &browse, ThinkTime: &mockserver.Delay{TimeUnit: "MILLISECONDS", Value: 500}},
            {Name: "checkout", Request: &checkout, Labels: map[string]string{"critical": "true"}},
        },
    }

    client.LoadScenario(scenario)              // 1. register (does NOT start it yet)
    client.StartLoadScenarios("checkout-load") // 2. start (requires loadGenerationEnabled=true)
    client.LoadScenarios()                     // 3. list all registered scenarios
    client.GetLoadScenario("checkout-load")    // live status
    client.StopLoadScenarios("checkout-load")  // 4. stop (no args stops ALL running scenarios)
    client.ClearLoadScenarios()                //    tidy up the registry
}

C#

using System.Collections.Generic;   // Dictionary and List used below
using MockServer.Client;
using MockServer.Client.Models;

using var client = new MockServerClient("localhost", 1080);

var scenario = new LoadScenario
{
    Name = "checkout-load",
    TemplateType = LoadTemplateType.VELOCITY,
    MaxRequests = 100000,
    StartDelayMillis = 500,
    Labels = new Dictionary<string, string> { ["team"] = "payments", ["env"] = "staging" },
    Profile = new LoadProfile
    {
        Stages =
        {
            new LoadStage { Type = LoadStageType.RATE, StartRate = 5, EndRate = 50,
                            DurationMillis = 30000, Curve = RampCurve.LINEAR, MaxVus = 50 },
            LoadStage.ConstantVus(25, 60000),
            LoadStage.Pause(10000)
        }
    },
    Steps = new List<LoadStep>
    {
        new() { Name = "browse",
                Request = HttpRequest.Request().WithMethod("GET").WithPath("/products/$!iteration.index"),
                ThinkTime = new Delay { TimeUnit = TimeUnit.MILLISECONDS, Value = 500 } },
        new() { Name = "checkout",
                Request = HttpRequest.Request().WithMethod("POST").WithPath("/cart/checkout")
                    .WithBody("{\"item\":\"$!iteration.index\",\"qty\":1}"),
                Labels = new Dictionary<string, string> { ["critical"] = "true" } }
    }
};

await client.LoadScenarioAsync(scenario);              // 1. register (does NOT start it yet)
await client.StartLoadScenariosAsync("checkout-load"); // 2. start (requires loadGenerationEnabled=true)
var listing = await client.LoadScenariosAsync();       // 3. list all registered scenarios
var status = await client.GetLoadScenarioAsync("checkout-load"); // live status
await client.StopLoadScenariosAsync("checkout-load");  // 4. stop (no args stops ALL running scenarios)
await client.ClearLoadScenariosAsync();                //    tidy up the registry

Rust

use mockserver_client::{
    ClientBuilder, Delay, HttpRequest, LoadProfile, LoadScenario, LoadStage, LoadStep, RampCurve,
};

let client = ClientBuilder::new("localhost", 1080).build().unwrap();

let profile = LoadProfile::of(vec![
    LoadStage::rate_ramp(5.0, 50.0, 30_000, RampCurve::Linear).max_vus(50),
    LoadStage::vu_hold(25, 60_000),
    LoadStage::pause(10_000),
]);
let steps = vec![
    LoadStep::new(HttpRequest::new().method("GET").path("/products/$!iteration.index"))
        .think_time(Delay::milliseconds(500)),
    LoadStep::new(HttpRequest::new().method("POST").path("/cart/checkout")
        .body(r#"{"item":"$!iteration.index","qty":1}"#)),
];
let scenario = LoadScenario::new("checkout-load", profile, steps)
    .template_type("VELOCITY")
    .max_requests(100_000)
    .start_delay_millis(500);

client.load_scenario(&scenario).unwrap();                // 1. register (does NOT start it yet)
client.start_load_scenarios(&["checkout-load"]).unwrap(); // 2. start (requires loadGenerationEnabled=true)
client.load_scenarios().unwrap();                         // 3. list all registered scenarios
client.get_load_scenario("checkout-load").unwrap();       // live status
client.stop_load_scenarios(&["checkout-load"]).unwrap();  // 4. stop (&[] stops ALL running scenarios)
client.clear_load_scenarios().unwrap();                   //    tidy up the registry

PHP

require_once 'vendor/autoload.php';

use MockServer\Delay;
use MockServer\HttpRequest;
use MockServer\LoadProfile;
use MockServer\LoadScenario;
use MockServer\LoadStage;
use MockServer\MockServerClient;

$client = new MockServerClient('localhost', 1080);

$scenario = LoadScenario::scenario('checkout-load')
    ->templateType('VELOCITY')
    ->maxRequests(100000)
    ->startDelayMillis(500)
    ->labels(['team' => 'payments', 'env' => 'staging'])
    ->profile(LoadProfile::of(
        LoadStage::rateRamp(5, 50, 30000, 'LINEAR')->maxVus(50),
        LoadStage::vuHold(25, 60000),
        LoadStage::pause(10000),
    ))
    ->addStep(
        HttpRequest::request()->method('GET')->path('/products/$iteration.index'),
        Delay::milliseconds(500),
        'browse',
    )
    ->addStep(
        HttpRequest::request()->method('POST')->path('/cart/checkout')
            ->body('{"item":"$iteration.index","qty":1}'),
        null,
        'checkout',
        ['critical' => 'true'],
    );

$client->loadScenario($scenario);              // 1. register (does NOT start it yet)
$client->startLoadScenarios('checkout-load');  // 2. start (requires loadGenerationEnabled=true)
$client->loadScenarios();                       // 3. list all registered scenarios
$client->getLoadScenario('checkout-load');      // live status
$client->stopLoadScenarios('checkout-load');    // 4. stop (null stops ALL running scenarios)
$client->clearLoadScenarios();                  //    tidy up the registry

REST API (curl)

# Start the server with load generation enabled:
#   docker run -d --rm -p 1080:1080 -e MOCKSERVER_LOAD_GENERATION_ENABLED=true mockserver/mockserver

# 1. REGISTER (does NOT run it) — PUT /mockserver/loadScenario
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "checkout-load",
    "templateType": "VELOCITY",
    "maxRequests": 100000,
    "startDelayMillis": 500,
    "labels": { "team": "payments", "env": "staging" },
    "profile": { "stages": [
      { "type": "RATE", "startRate": 5, "endRate": 50, "durationMillis": 30000, "curve": "LINEAR", "maxVus": 50 },
      { "type": "VU", "vus": 25, "durationMillis": 60000 },
      { "type": "PAUSE", "durationMillis": 10000 }
    ] },
    "steps": [
      { "name": "browse", "request": { "method": "GET", "path": "/products/$!iteration.index" },
        "thinkTime": { "timeUnit": "MILLISECONDS", "value": 500 } },
      { "name": "checkout", "labels": { "critical": "true" },
        "request": { "method": "POST", "path": "/cart/checkout",
                     "body": "{\"item\":\"$!iteration.index\",\"qty\":1}" } }
    ]
  }'

# 2. START it (requires loadGenerationEnabled=true; else 403)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
  -d '{ "name": "checkout-load" }'

# 3. LIST all registered scenarios, and read one scenario's live status
curl -s http://localhost:1080/mockserver/loadScenario
curl -s http://localhost:1080/mockserver/loadScenario/checkout-load

# 4. STOP it (stays registered, STOPPED — can be re-triggered)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/stop \
  -d '{ "name": "checkout-load" }'

Ramp from 1 to 10 concurrent virtual users over 30 seconds, then hold 10 VUs for a minute. Each iteration fetches a different order derived from the global iteration index. runLoadScenario registers and starts in a single call (so it still requires loadGenerationEnabled=true).

Java

LoadScenario scenario = LoadScenario.loadScenario("orders-ramp")
    .withProfile(LoadProfile.of(
        LoadStage.rampVus(1, 10, 30000, RampCurve.LINEAR),
        LoadStage.constantVus(10, 60000)
    ))
    .withSteps(
        LoadStep.loadStep(request().withMethod("GET").withPath("/api/orders/$!iteration.index"))
            .withThinkTime(new Delay(TimeUnit.MILLISECONDS, 20))
    );

client.runLoadScenario(scenario);          // register + start in one call
client.stopLoadScenarios("orders-ramp");   // stop when done

JavaScript

var scenario = {
    name: 'orders-ramp',
    profile: { stages: [
        { type: 'VU', startVus: 1, endVus: 10, durationMillis: 30000, curve: 'LINEAR' },
        { type: 'VU', vus: 10, durationMillis: 60000 }
    ] },
    steps: [
        { request: { method: 'GET', path: '/api/orders/$!iteration.index' },
          thinkTime: { timeUnit: 'MILLISECONDS', value: 20 } }
    ]
};

await client.runLoadScenario(scenario);     // register + start in one call
await client.stopLoadScenarios('orders-ramp');

Python

scenario = LoadScenario(
    name="orders-ramp",
    profile=LoadProfile(stages=[
        LoadStage.vu_stage(30000, start_vus=1, end_vus=10, curve="LINEAR"),
        LoadStage.vu_stage(60000, vus=10),
    ]),
    steps=[
        LoadStep(request=HttpRequest(method="GET", path="/api/orders/$!iteration.index"),
                 think_time=Delay(time_unit="MILLISECONDS", value=20)),
    ],
)

client.run_load_scenario(scenario)          # register + start in one call
client.stop_load_scenarios("orders-ramp")

Ruby

scenario = LoadScenario.new(
  name: 'orders-ramp',
  profile: LoadProfile.new(stages: [
    LoadStage.vu(30_000, start_vus: 1, end_vus: 10, curve: 'LINEAR'),
    LoadStage.vu(60_000, vus: 10)
  ]),
  steps: [
    LoadStep.new(request: HttpRequest.new(method: 'GET', path: '/api/orders/$!iteration.index'),
                 think_time: Delay.new(time_unit: 'MILLISECONDS', value: 20))
  ]
)

client.run_load_scenario(scenario)          # register + start in one call
client.stop_load_scenarios('orders-ramp')

Go

order := mockserver.Request().Method("GET").Path("/api/orders/$!iteration.index").Build()

scenario := mockserver.LoadScenario{
    Name: "orders-ramp",
    Profile: &mockserver.LoadProfile{
        Stages: []mockserver.LoadStage{
            mockserver.RampVusStage(1, 10, 30000, mockserver.RampLinear),
            mockserver.ConstantVusStage(10, 60000),
        },
    },
    Steps: []mockserver.LoadStep{
        {Request: &order, ThinkTime: &mockserver.Delay{TimeUnit: "MILLISECONDS", Value: 20}},
    },
}

client.RunLoadScenario(scenario)            // register + start in one call
client.StopLoadScenarios("orders-ramp")

C#

var scenario = new LoadScenario
{
    Name = "orders-ramp",
    Profile = new LoadProfile
    {
        Stages =
        {
            LoadStage.RampVus(1, 10, 30000, RampCurve.LINEAR),
            LoadStage.ConstantVus(10, 60000)
        }
    },
    Steps = new List<LoadStep>
    {
        new() { Request = HttpRequest.Request().WithMethod("GET").WithPath("/api/orders/$!iteration.index"),
                ThinkTime = new Delay { TimeUnit = TimeUnit.MILLISECONDS, Value = 20 } }
    }
};

await client.RunLoadScenarioAsync(scenario);     // register + start in one call
await client.StopLoadScenariosAsync("orders-ramp");

Rust

let profile = LoadProfile::of(vec![
    LoadStage::vu_ramp(1, 10, 30_000, RampCurve::Linear),
    LoadStage::vu_hold(10, 60_000),
]);
let steps = vec![
    LoadStep::new(HttpRequest::new().method("GET").path("/api/orders/$!iteration.index"))
        .think_time(Delay::milliseconds(20)),
];
let scenario = LoadScenario::new("orders-ramp", profile, steps);

client.run_load_scenario(&scenario).unwrap();      // register + start in one call
client.stop_load_scenarios(&["orders-ramp"]).unwrap();

PHP

$scenario = LoadScenario::scenario('orders-ramp')
    ->profile(LoadProfile::of(
        LoadStage::vuRamp(1, 10, 30000, 'LINEAR'),
        LoadStage::vuHold(10, 60000),
    ))
    ->addStep(
        HttpRequest::request()->method('GET')->path('/api/orders/$iteration.index'),
        Delay::milliseconds(20),
    );

$client->runLoadScenario($scenario);             // register + start in one call
$client->stopLoadScenarios('orders-ramp');

REST API (curl)

# Register the scenario...
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "orders-ramp",
    "profile": { "stages": [
      { "type": "VU", "startVus": 1, "endVus": 10, "durationMillis": 30000, "curve": "LINEAR" },
      { "type": "VU", "vus": 10, "durationMillis": 60000 }
    ] },
    "steps": [
      { "request": { "method": "GET", "path": "/api/orders/$!iteration.index" },
        "thinkTime": { "timeUnit": "MILLISECONDS", "value": 20 } }
    ]
  }'

# ...then start it (requires loadGenerationEnabled=true)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start -d '{ "name": "orders-ramp" }'

Hold 2 VUs to warm the target up, PAUSE to let it settle, then ramp the arrival rate from 10 to 200 iterations/second and hold it. The open model starts iterations on schedule regardless of how fast the target responds — this is what exposes queue build-up and tail latency. maxVus caps the auto-scaling virtual-user pool used to meet the rate.
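
The effect of maxVus on an open-model stage can be estimated before running it: a pool of N virtual users can sustain a rate R only while the average iteration finishes within N / R seconds. The sketch below works this out for the ramp stage used in this example (maxVus 40, peak rate 200). The iteration times are made up for illustration and the function names are not part of MockServer.

```python
def iteration_budget_seconds(target_rate, max_vus):
    """Longest average iteration that still lets max_vus sustain target_rate."""
    return max_vus / target_rate


def achievable_rate(target_rate, max_vus, iteration_seconds):
    """Iterations per second that can actually start when the VU pool is capped."""
    return min(target_rate, max_vus / iteration_seconds)


budget = iteration_budget_seconds(200, 40)        # 0.2 s per iteration
fast_target = achievable_rate(200, 40, 0.1)       # 200: the rate is met
slow_target = achievable_rate(200, 40, 0.25)      # 160.0: a shortfall of 40 per second
print(budget, fast_target, slow_target)
```

In the slow case the missing 40 iterations per second are what would be recorded as a `rate_limit` throttle. The hold stage sets no maxVus, so it falls back to the global cap of 50 and its budget is 0.25 s.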

Java

LoadScenario scenario = LoadScenario.loadScenario("rate-soak")
    .withProfile(LoadProfile.of(
        LoadStage.constantVus(2, 10000),
        LoadStage.pause(5000),
        LoadStage.rampRate(10, 200, 30000, RampCurve.EXPONENTIAL).withMaxVus(40),
        LoadStage.constantRate(200, 60000)
    ))
    .withSteps(
        LoadStep.loadStep(request().withMethod("GET").withPath("/health"))
    );

client.runLoadScenario(scenario);          // register + start in one call
client.stopLoadScenarios("rate-soak");

var scenario = {
    name: 'rate-soak',
    profile: { stages: [
        { type: 'VU', vus: 2, durationMillis: 10000 },
        { type: 'PAUSE', durationMillis: 5000 },
        { type: 'RATE', startRate: 10, endRate: 200, durationMillis: 30000, curve: 'EXPONENTIAL', maxVus: 40 },
        { type: 'RATE', rate: 200, durationMillis: 60000 }
    ] },
    steps: [ { request: { method: 'GET', path: '/health' } } ]
};

await client.runLoadScenario(scenario);     // register + start in one call
await client.stopLoadScenarios('rate-soak');

scenario = LoadScenario(
    name="rate-soak",
    profile=LoadProfile(stages=[
        LoadStage.vu_stage(10000, vus=2),
        LoadStage.pause_stage(5000),
        LoadStage.rate_stage(30000, start_rate=10, end_rate=200, max_vus=40, curve="EXPONENTIAL"),
        LoadStage.rate_stage(60000, rate=200),
    ]),
    steps=[LoadStep(request=HttpRequest(method="GET", path="/health"))],
)

client.run_load_scenario(scenario)          # register + start in one call
client.stop_load_scenarios("rate-soak")

scenario = LoadScenario.new(
  name: 'rate-soak',
  profile: LoadProfile.new(stages: [
    LoadStage.vu(10_000, vus: 2),
    LoadStage.pause(5_000),
    LoadStage.rate(30_000, start_rate: 10, end_rate: 200, max_vus: 40, curve: 'EXPONENTIAL'),
    LoadStage.rate(60_000, rate: 200)
  ]),
  steps: [LoadStep.new(request: HttpRequest.new(method: 'GET', path: '/health'))]
)

client.run_load_scenario(scenario)          # register + start in one call
client.stop_load_scenarios('rate-soak')

health := mockserver.Request().Method("GET").Path("/health").Build()

ramp := mockserver.RampRateStage(10, 200, 30000, mockserver.RampExponential)
maxVus := 40
ramp.MaxVus = &maxVus

scenario := mockserver.LoadScenario{
    Name: "rate-soak",
    Profile: &mockserver.LoadProfile{
        Stages: []mockserver.LoadStage{
            mockserver.ConstantVusStage(2, 10000),
            mockserver.PauseStage(5000),
            ramp,
            mockserver.ConstantRateStage(200, 60000),
        },
    },
    Steps: []mockserver.LoadStep{ {Request: &health} },
}

client.RunLoadScenario(scenario)            // register + start in one call
client.StopLoadScenarios("rate-soak")

var scenario = new LoadScenario
{
    Name = "rate-soak",
    Profile = new LoadProfile
    {
        Stages =
        {
            LoadStage.ConstantVus(2, 10000),
            LoadStage.Pause(5000),
            new LoadStage { Type = LoadStageType.RATE, StartRate = 10, EndRate = 200,
                            DurationMillis = 30000, Curve = RampCurve.EXPONENTIAL, MaxVus = 40 },
            LoadStage.ConstantRate(200, 60000)
        }
    },
    Steps = new List<LoadStep>
    {
        new() { Request = HttpRequest.Request().WithMethod("GET").WithPath("/health") }
    }
};

await client.RunLoadScenarioAsync(scenario);     // register + start in one call
await client.StopLoadScenariosAsync("rate-soak");

let profile = LoadProfile::of(vec![
    LoadStage::vu_hold(2, 10_000),
    LoadStage::pause(5_000),
    LoadStage::rate_ramp(10.0, 200.0, 30_000, RampCurve::Exponential).max_vus(40),
    LoadStage::rate_hold(200.0, 60_000),
]);
let steps = vec![
    LoadStep::new(HttpRequest::new().method("GET").path("/health")),
];
let scenario = LoadScenario::new("rate-soak", profile, steps);

client.run_load_scenario(&scenario).unwrap();      // register + start in one call
client.stop_load_scenarios(&["rate-soak"]).unwrap();

$scenario = LoadScenario::scenario('rate-soak')
    ->profile(LoadProfile::of(
        LoadStage::vuHold(2, 10000),
        LoadStage::pause(5000),
        LoadStage::rateRamp(10, 200, 30000, 'EXPONENTIAL')->maxVus(40),
        LoadStage::rateHold(200, 60000),
    ))
    ->addStep(HttpRequest::request()->method('GET')->path('/health'));

$client->runLoadScenario($scenario);             // register + start in one call
$client->stopLoadScenarios('rate-soak');

# Register the open-model soak...
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "rate-soak",
    "profile": { "stages": [
      { "type": "VU", "vus": 2, "durationMillis": 10000 },
      { "type": "PAUSE", "durationMillis": 5000 },
      { "type": "RATE", "startRate": 10, "endRate": 200, "durationMillis": 30000, "curve": "EXPONENTIAL", "maxVus": 40 },
      { "type": "RATE", "rate": 200, "durationMillis": 60000 }
    ] },
    "steps": [
      { "request": { "method": "GET", "path": "/health" } }
    ]
  }'

# ...then start it (requires loadGenerationEnabled=true)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start -d '{ "name": "rate-soak" }'
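
To judge whether maxVus is large enough for a RATE stage, Little's law gives a rough estimate: the number of iterations in flight is about the arrival rate multiplied by how long each iteration takes. The sketch below applies it to the rate-soak numbers above. The function names and the iteration durations are hypothetical, chosen only for illustration, and nothing here is part of the MockServer API.

```python
import math


def required_vus(rate_per_second, iteration_millis):
    """Little's law: concurrency ~= arrival rate x time each iteration is in flight."""
    return math.ceil(rate_per_second * iteration_millis / 1000)


def max_sustainable_rate(max_vus, iteration_millis):
    """Highest arrival rate a pool of max_vus can serve at the given iteration time."""
    return max_vus * 1000 / iteration_millis


if __name__ == "__main__":
    # rate-soak holds 200 iterations/second with maxVus = 40 on the ramp stage.
    for millis in (150, 250):  # assumed iteration durations, not measured values
        need = required_vus(200, millis)
        print(f"{millis} ms per iteration -> about {need} VUs needed (pool cap 40)")
    print("ceiling at 250 ms:", max_sustainable_rate(40, 250), "iterations/second")
```

If the estimate exceeds maxVus, the pool cannot cover the requested rate at that latency, so either raise maxVus (within loadGenerationMaxVirtualUsers) or expect the run to fall short of its setpoint.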

Safety Caps

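Load generation is bounded by configurable caps so that the feature cannot overwhelm the server itself. Requests that violate a validation cap (when a scenario is loaded or triggered) are rejected with 400 Bad Request and a JSON error message. The in-flight and requests-per-second caps are instead enforced live at dispatch: dispatches over the limit are skipped and counted in mock_server_load_throttled. All caps can be raised via the corresponding configuration property.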

Control	Property	Default	Enforced at
Feature flag (triggering runs)	`mockserver.loadGenerationEnabled`	`false`	`PUT /loadScenario/start` returns `403` when off (loading is always allowed)
Maximum concurrent scenarios	`mockserver.loadGenerationMaxConcurrentScenarios`	10	Trigger validation: `PUT /loadScenario/start` is rejected when it would push the number of active (PENDING + RUNNING) scenarios above the cap
Maximum virtual users	`mockserver.loadGenerationMaxVirtualUsers`	50	Scenario validation (rejected at PUT)
Maximum arrival rate (iterations/second)	`mockserver.loadGenerationMaxRate`	5000	Scenario validation (rejected at PUT) — applies to `RATE` stages
Maximum stages per profile	`mockserver.loadGenerationMaxStages`	20	Scenario validation (rejected at PUT)
Maximum in-flight requests	`mockserver.loadGenerationMaxInFlightRequests`	200	Live in-flight semaphore at dispatch
Maximum requests per second	`mockserver.loadGenerationMaxRequestsPerSecond`	500	Live token bucket at dispatch
Maximum total duration	`mockserver.loadGenerationMaxDurationMillis`	3 600 000 ms (1 hour)	Scenario validation — the sum of all stage durations (rejected at PUT)
Maximum steps per scenario	`mockserver.loadGenerationMaxSteps`	50	Scenario validation (rejected at PUT)
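
The validation caps in the table can be checked client-side before a scenario is sent. The following is a minimal sketch that assumes the default values listed above and the JSON field names used in the examples on this page. check_scenario is a hypothetical helper, not a MockServer API, and treating a stage's maxVus as counting towards the virtual-user cap is an assumption.

```python
# Defaults copied from the Safety Caps table; adjust if the server is configured differently.
DEFAULT_CAPS = {
    "max_virtual_users": 50,
    "max_rate": 5000,
    "max_stages": 20,
    "max_duration_millis": 3_600_000,
    "max_steps": 50,
}

# The rate-soak scenario from the example above.
RATE_SOAK = {
    "name": "rate-soak",
    "profile": {"stages": [
        {"type": "VU", "vus": 2, "durationMillis": 10000},
        {"type": "PAUSE", "durationMillis": 5000},
        {"type": "RATE", "startRate": 10, "endRate": 200, "durationMillis": 30000,
         "curve": "EXPONENTIAL", "maxVus": 40},
        {"type": "RATE", "rate": 200, "durationMillis": 60000},
    ]},
    "steps": [{"request": {"method": "GET", "path": "/health"}}],
}


def check_scenario(scenario, caps=DEFAULT_CAPS):
    """Return a list of human-readable cap violations (empty when none are found)."""
    problems = []
    stages = scenario.get("profile", {}).get("stages", [])
    steps = scenario.get("steps", [])
    if len(stages) > caps["max_stages"]:
        problems.append(f"{len(stages)} stages exceeds {caps['max_stages']}")
    if len(steps) > caps["max_steps"]:
        problems.append(f"{len(steps)} steps exceeds {caps['max_steps']}")
    total = sum(stage.get("durationMillis", 0) for stage in stages)
    if total > caps["max_duration_millis"]:
        problems.append(f"total duration {total} ms exceeds {caps['max_duration_millis']} ms")
    for index, stage in enumerate(stages):
        # Assumption: maxVus on a RATE stage is subject to the virtual-user cap too.
        vus = max(stage.get(key, 0) for key in ("vus", "startVus", "endVus", "maxVus"))
        rate = max(stage.get(key, 0) for key in ("rate", "startRate", "endRate"))
        if vus > caps["max_virtual_users"]:
            problems.append(f"stage {index}: {vus} VUs exceeds {caps['max_virtual_users']}")
        if rate > caps["max_rate"]:
            problems.append(f"stage {index}: rate {rate} exceeds {caps['max_rate']}")
    return problems


if __name__ == "__main__":
    print(check_scenario(RATE_SOAK) or "within default caps")
```

The server remains the source of truth; a pre-flight check like this only gives earlier, friendlier feedback in a test harness.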

Asserting SLOs over Load Results

Every completed load-scenario request is recorded into MockServer's SLO sample store (when sloTrackingEnabled=true), so you can drive load and then assert that latency and error-rate objectives held — all from the MockServer control plane.

# 1. Enable both features
docker run \
  -e MOCKSERVER_LOAD_GENERATION_ENABLED=true \
  -e MOCKSERVER_SLO_TRACKING_ENABLED=true \
  mockserver/mockserver

# 2. Load (register) then trigger a load scenario (e.g. 10 VUs for 30 s)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "slo-validation-run",
    "profile": { "stages": [ { "type": "VU", "vus": 10, "durationMillis": 30000 } ] },
    "steps": [
      {
        "request": {
          "method": "GET",
          "path": "/api/health",
          "socketAddress": { "host": "target.svc", "port": 8080, "scheme": "HTTP" }
        }
      }
    ]
  }'
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
  -d '{ "name": "slo-validation-run" }'

# 3. Wait for the scenario to complete (poll GET /loadScenario/slo-validation-run until state = "COMPLETED")

# 4. Assert SLOs held during the run
# Returns 200 (PASS) or 406 (FAIL)
curl -s -w "\n%{http_code}" -X PUT http://localhost:1080/mockserver/verifySLO \
  -H "Content-Type: application/json" \
  -d '{
    "name": "checkout-slo",
    "window": { "type": "LOOKBACK", "lookbackMillis": 60000 },
    "minimumSampleCount": 50,
    "upstreamHosts": ["target.svc"],
    "objectives": [
      { "sli": "LATENCY_P95", "comparator": "LESS_THAN", "threshold": 200.0 },
      { "sli": "ERROR_RATE",  "comparator": "LESS_THAN_OR_EQUAL", "threshold": 0.01 }
    ]
  }'

See SLO Resilience Verdicts for the full verifySLO reference, including window types, all SLI options, and the response body schema.

Scope note: the SLO sample store records load-scenario traffic and real proxied traffic under the same FORWARD scope, keyed by target host. When asserting SLOs over a load run, use a narrow window that covers only the load period to avoid mixing in unrelated traffic.
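
One way to keep the window narrow is to record when the run was triggered and derive lookbackMillis from that timestamp at verification time. The sketch below shows the idea with plain functions. The helper names are hypothetical, get_state stands in for whatever call you use to read the scenario state, and the request body mirrors the verifySLO example above.

```python
import time


def wait_until_completed(get_state, timeout_seconds=120.0, poll_seconds=1.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "COMPLETED"; return False on timeout."""
    deadline = clock() + timeout_seconds
    while True:
        if get_state() == "COMPLETED":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)


def build_verify_slo_body(name, host, run_started_ms, now_ms, objectives,
                          minimum_sample_count=50):
    """Build a verifySLO body whose LOOKBACK window reaches back to the run start only."""
    return {
        "name": name,
        "window": {"type": "LOOKBACK", "lookbackMillis": now_ms - run_started_ms},
        "minimumSampleCount": minimum_sample_count,
        "upstreamHosts": [host],
        "objectives": objectives,
    }


def verdict(status_code):
    """Map the verifySLO status code to a verdict: 200 is PASS, 406 is FAIL."""
    return {200: "PASS", 406: "FAIL"}.get(status_code, "ERROR")


if __name__ == "__main__":
    started = int(time.time() * 1000)          # record this just before triggering the run
    states = iter(["RUNNING", "COMPLETED"])    # stand-in for real polling results
    wait_until_completed(lambda: next(states), sleep=lambda _: None)
    body = build_verify_slo_body(
        "checkout-slo", "target.svc", started, int(time.time() * 1000),
        [{"sli": "LATENCY_P95", "comparator": "LESS_THAN", "threshold": 200.0}],
    )
    print(body["window"])
```

Sending the body and reading the status code is left to your HTTP client of choice; only the window arithmetic and the 200/406 mapping are shown here.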

Observability & Metrics

Every completed load-scenario dispatch is recorded into a dedicated mock_server_load_* Prometheus metric family and mirrored over OpenTelemetry (OTLP). This lets you chart the injector's latency, throughput, and error rate alongside your system-under-test in Grafana, Datadog, or any OTEL-compatible backend — without a separate load tool.

All per-request metrics carry six fixed labels, so you can filter and group series by any of them:

Label	Value
`scenario`	The scenario `name`
`run_id`	A UUID generated at scenario start; it is stable for one run and regenerated each time the scenario is started with `PUT /loadScenario/start`. Use it to filter metrics to exactly one execution.
`step`	Step index (0-based) or the step `name` when set
`route`	Auto-templatized path (numeric and UUID segments become `{id}`, e.g. `/api/orders/{id}`) or the step `name` when set
`method`	HTTP method (`GET`, `POST`, …)
`status_class`	Response status class: `2xx`, `3xx`, `4xx`, `5xx`, or `unknown`
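
To predict which `route` and `status_class` values a step will produce, the rules in the table can be approximated as follows. This is only a sketch of the documented behaviour (numeric and UUID path segments become `{id}`); the server's exact matching rules may differ, and mapping anything outside 200 to 599 to `unknown` is an assumption.

```python
import re

# A path segment that is entirely digits, or a canonical 8-4-4-4-12 hex UUID.
_NUMERIC = re.compile(r"\d+")
_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def templatize(path):
    """Replace numeric and UUID path segments with {id}."""
    segments = path.split("/")
    return "/".join(
        "{id}" if _NUMERIC.fullmatch(s) or _UUID.fullmatch(s) else s
        for s in segments
    )


def status_class(status):
    """Return 2xx, 3xx, 4xx, 5xx, or unknown for a missing or out-of-range status."""
    if status is None or not 200 <= status < 600:
        return "unknown"
    return f"{status // 100}xx"


if __name__ == "__main__":
    print(templatize("/api/orders/42"))
    print(templatize("/api/users/3f2b8c1e-9d4a-4b7c-8e21-0a1b2c3d4e5f/orders"))
    print(status_class(503), status_class(None))
```

Templatizing keeps label cardinality bounded: a path that embeds `$iteration.index` would otherwise create one series per iteration.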

The full metric catalogue:

Metric	Type	Description
`mock_server_load_request_duration_seconds`	Histogram	Round-trip latency per dispatch. Query any percentile with `histogram_quantile` in Prometheus or the equivalent in your OTEL backend.
`mock_server_load_requests`	Counter	Completed dispatches
`mock_server_load_request_bytes`	Counter	Outbound request bytes
`mock_server_load_response_bytes`	Counter	Inbound response bytes
`mock_server_load_iterations`	Counter	Full VU iteration completions (labelled by `scenario` + `run_id` only)
`mock_server_load_throttled`	Counter	Dispatches skipped by the self-load guard. Label `reason` = `inflight_cap` or `rate_limit`. A rising value means the scenario could not reach its setpoint.
`mock_server_load_errors`	Counter	Failed dispatches. Label `kind` = `render`, `connection`, `timeout`, `null_response`, or `http_5xx`.
`mock_server_load_active_vus`	Gauge	Virtual users currently running
`mock_server_load_inflight_requests`	Gauge	Dispatches currently in flight

Adding custom labels

Attach domain dimensions (environment, region, team) to metric series using the labels field on the scenario or step:

# Enable load generation with a custom label allowlist (Prometheus requires this at startup)
docker run \
  -e MOCKSERVER_LOAD_GENERATION_ENABLED=true \
  -e MOCKSERVER_LOAD_GENERATION_METRIC_LABELS=env,region \
  mockserver/mockserver

# Scenario with custom labels
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
  -H "Content-Type: application/json" \
  -d '{
    "name": "checkout-load",
    "labels": { "env": "staging", "region": "eu-west-1" },
    "profile": { "stages": [ { "type": "VU", "vus": 10, "durationMillis": 30000 } ] },
    "steps": [
      {
        "name": "get-order",
        "labels": { "team": "orders" },
        "request": { "method": "GET", "path": "/api/orders/$iteration.index",
                     "socketAddress": { "host": "orders.svc", "port": 8080 } }
      }
    ]
  }'

Prometheus: only label keys listed in mockserver.loadGenerationMetricLabels (set at startup) appear as Prometheus labels, because a Prometheus metric's label names are fixed when the metric is registered.
OpenTelemetry: All custom label keys are forwarded as OTEL attributes with no allowlist needed.
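
The effect of the allowlist on the example above can be sketched as a simple filter. The helper below is hypothetical and makes one assumption that this page does not state: when a scenario and a step define the same key, the step value wins.

```python
def parse_allowlist(value):
    """Split a comma-separated allowlist such as "env,region" into a set of keys."""
    return {key.strip() for key in value.split(",") if key.strip()}


def effective_labels(scenario_labels, step_labels, allowlist):
    """Return (prometheus_labels, otel_attributes) for one step."""
    merged = {**scenario_labels, **step_labels}  # assumption: step overrides scenario
    prometheus = {k: v for k, v in merged.items() if k in allowlist}
    return prometheus, merged


if __name__ == "__main__":
    allow = parse_allowlist("env,region")
    prom, otel = effective_labels(
        {"env": "staging", "region": "eu-west-1"},
        {"team": "orders"},
        allow,
    )
    print("Prometheus labels:", prom)
    print("OTEL attributes:  ", otel)
```

With the allowlist `env,region`, the step-level `team` label reaches OpenTelemetry but not Prometheus; add `team` to the allowlist at startup if you need it in both.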

Exemplars / trace pivoting

When your system under test propagates W3C Trace Context, the load-scenario latency histogram (mock_server_load_request_duration_seconds) attaches the upstream trace_id from the response's traceparent header as a Prometheus exemplar. In Grafana, this lets you click a latency spike and jump directly to the trace that caused it.
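
For reference, a W3C traceparent header has four dash-separated lowercase hex fields: version, trace-id (32 characters), parent-id (16 characters), and flags. The sketch below extracts the trace-id in the way the exemplar mechanism would need to. It illustrates the header format only and is not MockServer's implementation; it accepts the strict four-field form and nothing else.

```python
import re

# version - trace-id - parent-id - flags, all lowercase hex (W3C Trace Context).
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def trace_id_from_traceparent(header):
    """Return the 32-character trace-id, or None when the header is absent or invalid."""
    if not header:
        return None
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, _flags = match.groups()
    # The spec forbids version ff and all-zero trace-id / parent-id values.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id


if __name__ == "__main__":
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(trace_id_from_traceparent(header))
```

If your service does not echo a traceparent header on its responses, there is no trace_id to attach and the histogram samples simply carry no exemplar.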

Example PromQL

# p95 request latency for a specific run
histogram_quantile(0.95,
  sum by (le, scenario, run_id) (
    rate(mock_server_load_request_duration_seconds_bucket[1m])
  )
)

# Error rate by kind
sum by (kind) (rate(mock_server_load_errors_total[1m]))

# Is the scenario being throttled?
rate(mock_server_load_throttled_total[1m])

Metrics require metricsEnabled=true. See Observability for the full Prometheus and OTEL configuration reference.
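
When reading the p95 query above, it helps to know that a histogram percentile is an estimate interpolated within a bucket, so its accuracy depends on the bucket boundaries. The sketch below reproduces the usual linear-interpolation approach on a hand-made set of cumulative buckets. The bucket bounds and counts are invented for illustration and are not the buckets MockServer exports.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            return lower_bound + (upper - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper, cumulative
    return lower_bound


if __name__ == "__main__":
    # Invented data: 100 requests; 20 under 0.1 s, 60 under 0.2 s, all under 0.5 s.
    buckets = [(0.1, 20), (0.2, 60), (0.5, 100), (math.inf, 100)]
    print("p50 estimate:", histogram_quantile(0.5, buckets))
    print("p95 estimate:", histogram_quantile(0.95, buckets))
```

A percentile that lands in a wide bucket is therefore coarse; treat small differences between runs with caution when the value sits between two distant bucket bounds.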

Configuration

Property	Environment variable	Default	Description
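`mockserver.loadGenerationEnabled`	`MOCKSERVER_LOAD_GENERATION_ENABLED`	`false`	Master switch for triggering runs. Must be `true` for `PUT /mockserver/loadScenario/start` to succeed (it returns `403` otherwise); loading a scenario with `PUT /mockserver/loadScenario` is always allowed. Set at startup to avoid a restart.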
`mockserver.loadGenerationMaxVirtualUsers`	`MOCKSERVER_LOAD_GENERATION_MAX_VIRTUAL_USERS`	50	Maximum concurrent virtual users allowed in a scenario. Raise for higher-concurrency load runs.
`mockserver.loadGenerationMaxInFlightRequests`	`MOCKSERVER_LOAD_GENERATION_MAX_IN_FLIGHT_REQUESTS`	200	Maximum dispatches allowed in flight simultaneously. Acts as a semaphore — dispatches that would exceed this are counted in `mock_server_load_throttled` with reason `inflight_cap`.
`mockserver.loadGenerationMaxRequestsPerSecond`	`MOCKSERVER_LOAD_GENERATION_MAX_REQUESTS_PER_SECOND`	500	Maximum dispatches per second (token bucket). Dispatches that would exceed this are counted in `mock_server_load_throttled` with reason `rate_limit`.
`mockserver.loadGenerationMaxDurationMillis`	`MOCKSERVER_LOAD_GENERATION_MAX_DURATION_MILLIS`	3 600 000 (1 hour)	Maximum total scenario duration in milliseconds, measured as the sum of all stage durations. Scenarios with a longer total duration are rejected at PUT.
`mockserver.loadGenerationMaxSteps`	`MOCKSERVER_LOAD_GENERATION_MAX_STEPS`	50	Maximum number of steps per scenario. Scenarios with more steps are rejected at PUT.
`mockserver.loadGenerationMetricLabels`	`MOCKSERVER_LOAD_GENERATION_METRIC_LABELS`	`""`	Comma-separated list of custom label keys to register in Prometheus (e.g. `env,region,team`). Must be set before the first scenario runs. OpenTelemetry always receives all custom labels regardless of this setting.
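
The environment variable names in this table follow a visible pattern: the property name is split at each camel-case boundary, dots become underscores, and the result is upper-cased. The helper below is a small sketch of that mapping, inferred only from the rows shown here. For properties not listed in the table, treat its output as a guess to confirm against the configuration reference.

```python
import re


def env_var_for(property_name):
    """Convert e.g. mockserver.loadGenerationMaxSteps to MOCKSERVER_LOAD_GENERATION_MAX_STEPS."""
    # Insert an underscore wherever a lowercase letter or digit is followed by an uppercase letter.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", property_name)
    return snake.replace(".", "_").upper()


if __name__ == "__main__":
    for prop in (
        "mockserver.loadGenerationEnabled",
        "mockserver.loadGenerationMaxRequestsPerSecond",
        "mockserver.loadGenerationMetricLabels",
    ):
        print(prop, "->", env_var_for(prop))
```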

Chaos Testing & Fault Injection — inject faults alongside load to verify resilience under stress
SLO Resilience Verdicts — assert that latency and error-rate objectives held during the load run
Response Templates — the Velocity / Mustache template engines used for path and body rendering in steps
Observability — Prometheus and OpenTelemetry configuration; the mock_server_load_* family feeds into the same pipeline
Observability & Metrics — load-specific metrics, custom labels, exemplars, and PromQL examples (this page)