Load Injection (Load Scenarios)
MockServer can drive outbound API traffic on demand using Load Scenarios — a declarative, bounded load generator built into the server. A load scenario is an ordered list of templated request steps driven through a sequence of stages (a load profile), with per-iteration variable data. Each stage either holds or ramps the number of concurrent virtual users (VU, closed model), holds or ramps an arrival rate in iterations per second (RATE, open model), or pauses. Results feed both the metrics histograms and the SLO sample store so a load run can be asserted with PUT /mockserver/verifySLO.
This makes MockServer useful for resilience verification: inject a chaos fault (see Chaos Testing), apply load, then assert that your SLOs held — all from a single control plane, with no external load tool required.
Registry model: load, then trigger. Scenarios are organised as a registry of named scenarios. You first load (register) a scenario by name with PUT /mockserver/loadScenario — this does not run it — then trigger one or many by name with PUT /mockserver/loadScenario/start to run them concurrently, each with its own optional start delay. Loading is always allowed; triggering a run is off by default: start returns 403 Forbidden until loadGenerationEnabled=true. Hard caps on concurrent scenarios, virtual users, in-flight requests, RPS, duration, and step count prevent the feature from self-DoS-ing the server. Scenarios can be preloaded at startup from a JSON file via mockserver.loadScenarioInitializationJsonPath.
Quickstart
Enable load generation, load a 10-VU ramp-then-hold scenario, then trigger it:
# 1. Enable load generation (set once at startup or via the config endpoint)
docker run -p 1080:1080 -e MOCKSERVER_LOAD_GENERATION_ENABLED=true mockserver/mockserver
# 2. LOAD (register) a scenario — this does NOT run it
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "checkout-load",
"templateType": "VELOCITY",
"startDelayMillis": 0,
"profile": {
"stages": [
{ "type": "VU", "startVus": 1, "endVus": 10, "durationMillis": 30000, "curve": "LINEAR" },
{ "type": "VU", "vus": 10, "durationMillis": 60000 }
]
},
"steps": [
{
"request": {
"method": "GET",
"path": "/api/orders/$iteration.index",
"headers": { "Host": ["orders.svc:8080"] },
"socketAddress": { "host": "orders.svc", "port": 8080, "scheme": "HTTP" }
}
}
]
}'
# 3. TRIGGER it to run (one or many names at once)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
-H "Content-Type: application/json" \
-d '{ "name": "checkout-load" }'
# 4. Poll progress (one scenario, or list all)
curl -s http://localhost:1080/mockserver/loadScenario/checkout-load
curl -s http://localhost:1080/mockserver/loadScenario
# 5. Stop it (stays registered, STOPPED — can be re-triggered)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/stop \
-d '{ "name": "checkout-load" }'
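Step 4 is usually repeated until the run finishes. The following is a minimal Python sketch of such a poll loop, using only the standard library. It assumes the single-scenario response is a JSON object with a top-level `state` field (the listing endpoint documents `state`, but the exact single-scenario shape is an assumption here), and `fetch_scenario` and `wait_for_terminal_state` are hypothetical helper names, not part of any MockServer client.

```python
import json
import time
import urllib.parse
import urllib.request

# States after which a run produces no more traffic.
TERMINAL_STATES = {"COMPLETED", "STOPPED"}


def fetch_scenario(name, base_url="http://localhost:1080"):
    """GET /mockserver/loadScenario/{name} and decode the JSON body."""
    url = f"{base_url}/mockserver/loadScenario/{urllib.parse.quote(name)}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_terminal_state(name, fetch=fetch_scenario, interval_s=1.0, max_polls=120):
    """Poll until the scenario reports COMPLETED or STOPPED, or give up.

    Returns the last status object seen. Assumes a top-level "state" field.
    """
    status = {}
    for _ in range(max_polls):
        status = fetch(name)
        if status.get("state") in TERMINAL_STATES:
            break
        time.sleep(interval_s)
    return status
```

The `fetch` parameter is injectable so the loop can be exercised without a running server. A scenario that was loaded but never triggered stays in LOADED, so the loop simply returns after `max_polls` attempts.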
Control-Plane API
All endpoints are control-plane endpoints — they respect control-plane authentication (mTLS / JWT) when configured. The flow is load (register) → trigger (run) by name.
| Verb | Path | Behaviour |
|---|---|---|
| PUT | /mockserver/loadScenario | Load (register) a scenario by name; it is staged in the LOADED state but does not run. Allowed even when loadGenerationEnabled=false (no traffic). 400 with a JSON error when invalid or a cap is exceeded; 200 {status:"loaded", name, state:"LOADED"} otherwise. Loading the same name replaces the prior definition. |
| GET | /mockserver/loadScenario | List all registered scenarios: { scenarios: [ { name, state, startDelayMillis, definition, …live status fields } ] }. state ∈ LOADED / PENDING / RUNNING / COMPLETED / STOPPED. Live fields (when active/run): stageIndex, stageType, currentTarget, currentVus, requestsSent, succeeded, failed, p50/p95/p99Millis, runId, startedAt, endedAt. |
| GET | /mockserver/loadScenario/{name} | Return one registered scenario (definition + state + status). 404 if not registered. |
| PUT | /mockserver/loadScenario/start | Trigger one or more registered scenarios to run concurrently. Body {"names":["a","b"]} or {"name":"a"}. Each honours its own startDelayMillis. Requires loadGenerationEnabled=true (else 403); 404 if a name isn't registered; 400 if it would exceed loadGenerationMaxConcurrentScenarios. Returns the triggered names and resulting states (PENDING / RUNNING). |
| PUT | /mockserver/loadScenario/stop | Stop running scenario(s). Body {"names":[…]}, {"all":true}, or empty (stop all). Stopped scenarios stay registered (STOPPED) and can be re-triggered. |
| DELETE | /mockserver/loadScenario/{name} | Remove one scenario from the registry (stops it first if running). |
| DELETE | /mockserver/loadScenario | Clear the whole registry (stops all running). Idempotent. |
Trigger several at once — each begins at its own offset via startDelayMillis:
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
-d '{ "names": ["checkout-load", "background-poller"] }'
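The same endpoints can be driven from any HTTP client. As a rough sketch using only Python's standard library, the helpers below build the `start` and `stop` requests. The paths and body shapes come from the table above; the helper names, the base URL, and the explicit `Content-Type` header are illustrative choices rather than anything MockServer mandates.

```python
import json
import urllib.request

BASE_URL = "http://localhost:1080"  # assumed local MockServer address


def control_request(path, body=None, method="PUT"):
    """Build (but do not send) a control-plane request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def start_request(*names):
    """PUT /mockserver/loadScenario/start with {"names": [...]}."""
    return control_request("/mockserver/loadScenario/start", {"names": list(names)})


def stop_request(*names):
    """PUT /mockserver/loadScenario/stop; with no names the body is {"all": true}."""
    body = {"names": list(names)} if names else {"all": True}
    return control_request("/mockserver/loadScenario/stop", body)
```

Passing a built request to `urllib.request.urlopen(...)` sends it. Error statuses such as 403 (load generation disabled) or 404 (name not registered) surface as `urllib.error.HTTPError`.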
Load Scenario Model
A load scenario is a JSON object with the following top-level fields:
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | The scenario name, which is the unique registry key. Loading the same name replaces the prior definition. Appears in status, metric labels, and logs. |
| startDelayMillis | integer | No | Delay in milliseconds, applied after the scenario is triggered, before its iterations begin (default 0). A positive value means the scenario is PENDING until the delay elapses, then RUNNING. Lets several triggered scenarios begin at staggered offsets. |
| steps | array | Yes | Ordered list of request steps (see Steps below). Maximum 50 steps. |
| profile | object | Yes | The load profile: an ordered list of stages run in sequence (see Load Profile below). |
| templateType | string | No | Template engine for rendering step fields: VELOCITY (default) or MUSTACHE. JAVASCRIPT is not supported for load steps and is rejected with a 400. |
| maxRequests | integer | No | Stop the scenario once this many requests have been dispatched, even if the duration has not elapsed. Useful for budget-bounded runs. |
| labels | object | No | Scenario-level custom metric labels (string key/value pairs). Keys must be listed in mockserver.loadGenerationMetricLabels to appear in Prometheus; all keys appear in OpenTelemetry automatically. See Observability & Metrics. |
Steps
Each step defines a request to fire and an optional pause after it:
| Field | Type | Description |
|---|---|---|
| request | HttpRequest | The request to fire. Reuses the same HttpRequest model as expectations. Template expressions live in the path and body fields. Set socketAddress to point at the target service. |
| thinkTime | Delay | Optional pause between this step and the next (a { "timeUnit": "MILLISECONDS", "value": 100 } object). Implemented with a non-blocking scheduler; no thread is blocked during think time. |
| name | string | Optional step name. When set, it is used as the route metric label for this step instead of the auto-templatized request path. Use this to group requests to multiple paths under one label, or to give a step a human-readable name in metric charts. |
| labels | object | Optional step-level custom metric labels (string key/value pairs). Override the scenario-level labels for this step. Same Prometheus allowlist requirement applies. |
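How scenario-level and step-level labels combine can be pictured with a short Python sketch. It assumes step labels override scenario labels key by key, and that the Prometheus allowlist behaves like a plain set of permitted keys. Both are assumptions made for illustration, and the function names are hypothetical rather than part of MockServer.

```python
def effective_labels(scenario_labels, step_labels=None):
    """Overlay step labels on scenario labels (assumed key-by-key override)."""
    merged = dict(scenario_labels or {})
    merged.update(step_labels or {})
    return merged


def prometheus_labels(labels, allowlist):
    """Keep only keys on the allowlist; other keys would not reach Prometheus."""
    return {key: value for key, value in labels.items() if key in allowlist}


scenario = {"team": "payments", "env": "staging"}
checkout_step = {"critical": "true"}

# Labels attached to the "checkout" step under the assumed merge rule.
checkout_labels = effective_labels(scenario, checkout_step)

# Only allowlisted keys survive for Prometheus in this sketch.
exported = prometheus_labels(checkout_labels, {"team", "critical"})
```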
Load Profile
The profile is an ordered list of stages (maximum 20) run one after another. The total run length is the sum of the stage durations. Each stage is one of three types:
- VU (closed model): hold or ramp the number of concurrent virtual users. Each VU loops the steps back-to-back; throughput is whatever the target can sustain. Answers "how does my service behave with N concurrent clients?"
- RATE (open model): hold or ramp an arrival rate in iterations per second. MockServer starts new iterations on schedule (auto-scaling the virtual-user pool to run them) regardless of how fast the target responds. Answers "how does my service behave at R requests/second?" This is the model that exposes queue build-up and tail latency.
- PAUSE: drive no load for the duration (virtual users drain). Use it to separate stages, e.g. let the target recover between a spike and a soak.
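The practical difference between the two models shows up when the target slows down. As a back-of-the-envelope sketch (plain arithmetic based on Little's law, not MockServer code; the function names are illustrative), a closed VU stage loses throughput while an open RATE stage needs more iterations in flight:

```python
def closed_model_throughput(vus, iteration_seconds):
    """Iterations per second when `vus` users loop back-to-back with no idle time."""
    return vus / iteration_seconds


def open_model_concurrency(rate, iteration_seconds):
    """Average iterations in flight at a fixed arrival rate (Little's law: L = rate * W)."""
    return rate * iteration_seconds
```

With 10 VUs and a 0.25 s iteration, the closed model sustains 40 iterations/s; if the iteration slows to 0.5 s, throughput halves to 20. At a fixed 40 iterations/s, the same slowdown doubles the average number of in-flight iterations from 10 to 20, which is why the open model makes queue build-up visible.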
A stage holds a value (set vus / rate) or ramps between two values (set startVus+endVus / startRate+endRate). A ramp follows a curve:
- LINEAR (default): straight line.
- QUADRATIC: ease-in, slow at first, then accelerating.
- EXPONENTIAL: a steeper ease-in (handles ramps starting from zero correctly).
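To give a feel for the three shapes, here is a small Python sketch that computes a ramp's target value at a point in time. The formulas are assumptions chosen to match the descriptions above (the exact functions MockServer uses are not stated here), and the steepness constant `k` is arbitrary. The exponential form interpolates on progress rather than on a start/end ratio, so it stays well-defined when the ramp starts at zero.

```python
import math


def ramp_progress(fraction, curve="LINEAR"):
    """Map the elapsed fraction of a stage (0..1) to ramp progress (0..1)."""
    p = min(max(fraction, 0.0), 1.0)
    if curve == "LINEAR":
        return p
    if curve == "QUADRATIC":
        return p * p  # ease-in: slow start, then accelerating
    if curve == "EXPONENTIAL":
        k = 4.0  # assumed steepness, for illustration only
        return math.expm1(k * p) / math.expm1(k)
    raise ValueError(f"unknown curve: {curve}")


def stage_target(start, end, elapsed_ms, duration_ms, curve="LINEAR"):
    """Target VUs or rate at `elapsed_ms` into a ramp from `start` to `end`."""
    return start + (end - start) * ramp_progress(elapsed_ms / duration_ms, curve)
```

Halfway through a 1 to 10 ramp, this sketch gives 5.5 for LINEAR and 3.25 for QUADRATIC, while the assumed EXPONENTIAL curve is lower still before catching up near the end.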
| Field | Type | Description |
|---|---|---|
| type | string | VU, RATE or PAUSE. |
| durationMillis | integer | How long this stage runs (milliseconds, > 0). The total across all stages must not exceed 3 600 000 ms (1 hour). |
| curve | string | Ramp shape for a ramping stage: LINEAR (default), QUADRATIC or EXPONENTIAL. Ignored for holds and pauses. |
| vus | integer | VU hold: number of concurrent virtual users to hold. Maximum 50. |
| startVus / endVus | integer | VU ramp: virtual users at the start and end of the ramp. Maximum 50. |
| rate | number | RATE hold: arrival rate in iterations per second to hold. Maximum 5000. |
| startRate / endRate | number | RATE ramp: arrival rate (iterations/second) at the start and end of the ramp. Maximum 5000. |
| maxVus | integer | RATE only: optional cap on the auto-scaling virtual-user pool used to run the started iterations (defaults to the global virtual-user cap of 50). If the rate cannot be met within this cap, the shortfall is recorded as a rate_limit throttle. |
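Checking a profile against these limits before sending it can save a round trip. The Python sketch below is a hypothetical client-side pre-check, not MockServer's own validation: it mirrors the documented defaults (20 stages, 1 hour total, 50 VUs, 5000 iterations/second) as constants, and the server remains the authority, especially if the caps have been reconfigured.

```python
MAX_STAGES = 20
MAX_TOTAL_DURATION_MS = 3_600_000  # 1 hour
MAX_VUS = 50
MAX_RATE = 5000


def _check_hold_or_ramp(stage, hold, start, end, limit, label, problems):
    """A stage sets either the hold field or both ramp fields, within the limit."""
    if hold in stage:
        values = [stage[hold]]
    elif start in stage and end in stage:
        values = [stage[start], stage[end]]
    else:
        problems.append(f"{label}: set {hold} or {start}+{end}")
        return
    if any(value > limit for value in values):
        problems.append(f"{label}: value exceeds {limit}")


def validate_profile(profile):
    """Return a list of problems; an empty list means the profile passes."""
    problems = []
    stages = profile.get("stages", [])
    if len(stages) > MAX_STAGES:
        problems.append(f"too many stages: {len(stages)} > {MAX_STAGES}")
    total_ms = 0
    for index, stage in enumerate(stages):
        label = f"stage {index}"
        duration = stage.get("durationMillis", 0)
        if duration <= 0:
            problems.append(f"{label}: durationMillis must be > 0")
        total_ms += max(duration, 0)
        kind = stage.get("type")
        if kind == "VU":
            _check_hold_or_ramp(stage, "vus", "startVus", "endVus", MAX_VUS, label, problems)
        elif kind == "RATE":
            _check_hold_or_ramp(stage, "rate", "startRate", "endRate", MAX_RATE, label, problems)
        elif kind != "PAUSE":
            problems.append(f"{label}: unknown type {kind!r}")
    if total_ms > MAX_TOTAL_DURATION_MS:
        problems.append(f"total duration {total_ms} ms exceeds {MAX_TOTAL_DURATION_MS} ms")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a profile at once.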
Per-Iteration Template Variables
Each iteration gets a fresh iteration context injected alongside the standard request variable. Use it to vary data across iterations and virtual users without external data files:
| Variable | Meaning | Velocity | Mustache |
|---|---|---|---|
| index | Global iteration index across all VUs (0-based) | $iteration.index | |
| vuId | ID of the virtual user running this iteration (0-based) | $iteration.vuId | |
| vuIteration | Iteration count within this specific VU (0-based) | $iteration.vuIteration | |
| elapsedMillis | Milliseconds since the scenario started | $iteration.elapsedMillis | |
| count | Total requests dispatched so far (across all VUs) | $iteration.count | |
What fields are rendered. In v1, only the request path and body fields are rendered through the template engine. These are the most commonly templated fields — use them to vary the URL path or the request payload across iterations.
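To make the relationship between `index`, `vuId`, and `vuIteration` concrete, the toy Python simulation below walks two virtual users through two iterations each and renders a templated path. It is an illustration only: the string substitution is a stand-in for the real template engine, the strict round-robin ordering is an assumption (real virtual users interleave according to response times), and `elapsedMillis` and `count` are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class IterationContext:
    index: int        # global iteration index across all VUs
    vuId: int         # which virtual user runs this iteration
    vuIteration: int  # iteration count within that virtual user


def render(template, ctx):
    """Toy renderer: substitutes $iteration.<field> and $!iteration.<field> only."""
    for field in ("index", "vuId", "vuIteration"):
        value = str(getattr(ctx, field))
        template = template.replace("$!iteration." + field, value)
        template = template.replace("$iteration." + field, value)
    return template


def simulate(vus, iterations_per_vu):
    """Round-robin over VUs so the global index interleaves between them."""
    contexts = []
    index = 0
    for vu_iteration in range(iterations_per_vu):
        for vu_id in range(vus):
            contexts.append(IterationContext(index, vu_id, vu_iteration))
            index += 1
    return contexts


paths = [render("/api/orders/$iteration.index?vu=$iteration.vuId", ctx)
         for ctx in simulate(vus=2, iterations_per_vu=2)]
```

In this simulation the global `index` keeps counting across both virtual users, while `vuIteration` restarts at 0 for each of them.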
Examples
The following examples drive load scenarios through MockServer using each client library and the plain REST API. They all follow the same registry workflow — register → start → read live status → stop. Load generation must be enabled (loadGenerationEnabled=true): registering is always allowed, but starting a run returns 403 when it is off.
A realistic multi-stage scenario: a linear RATE ramp (5 → 50 iterations/s, capped at 50 virtual users), then a 25-VU hold, then a PAUSE. Each iteration runs two steps, so the request rate during the ramp is twice the iteration rate. Both steps use Velocity templating, startDelayMillis defers load briefly after start, and custom labels tag the metric series. The full lifecycle is exercised: register (does not run), start, list / read live status, stop, then clear the registry.
import org.mockserver.client.MockServerClient;
import org.mockserver.load.*;
import org.mockserver.model.Delay;
import org.mockserver.model.HttpTemplate;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import static org.mockserver.model.HttpRequest.request;
MockServerClient client = new MockServerClient("localhost", 1080);
LoadScenario scenario = LoadScenario.loadScenario("checkout-load")
.withTemplateType(HttpTemplate.TemplateType.VELOCITY)
.withMaxRequests(100000)
.withStartDelayMillis(500)
.withLabels(Map.of("team", "payments", "env", "staging"))
.withProfile(LoadProfile.of(
LoadStage.rampRate(5, 50, 30000, RampCurve.LINEAR).withMaxVus(50),
LoadStage.constantVus(25, 60000),
LoadStage.pause(10000)
))
.withSteps(
LoadStep.loadStep(request().withMethod("GET").withPath("/products/$!iteration.index"))
.withName("browse")
.withThinkTime(new Delay(TimeUnit.MILLISECONDS, 500)),
LoadStep.loadStep(request().withMethod("POST").withPath("/cart/checkout")
.withBody("{\"item\":\"$!iteration.index\",\"qty\":1}"))
.withName("checkout")
.withLabels(Map.of("critical", "true"))
);
client.loadScenario(scenario); // 1. register (does NOT start it yet)
client.startLoadScenarios("checkout-load"); // 2. start (requires loadGenerationEnabled=true)
String listing = client.loadScenarios(); // 3. list all registered scenarios
String status = client.getLoadScenario("checkout-load"); // live throughput / latency status
client.stopLoadScenarios("checkout-load"); // 4. stop (no args stops ALL running scenarios)
client.clearLoadScenarios(); // tidy up the registry
var mockServerClient = require('mockserver-client').mockServerClient;
var client = mockServerClient("localhost", 1080);
// The per-step field is `request` (a full HttpRequest), not `httpRequest`.
var scenario = {
name: 'checkout-load',
templateType: 'VELOCITY',
maxRequests: 100000,
startDelayMillis: 500,
labels: { team: 'payments', env: 'staging' },
profile: {
stages: [
{ type: 'RATE', startRate: 5, endRate: 50, durationMillis: 30000, curve: 'LINEAR', maxVus: 50 },
{ type: 'VU', vus: 25, durationMillis: 60000 },
{ type: 'PAUSE', durationMillis: 10000 }
]
},
steps: [
{ name: 'browse', request: { method: 'GET', path: '/products/$!iteration.index' },
thinkTime: { timeUnit: 'MILLISECONDS', value: 500 } },
{ name: 'checkout', labels: { critical: 'true' },
request: { method: 'POST', path: '/cart/checkout',
headers: { 'Content-Type': ['application/json'] },
body: '{"item":"$!iteration.index","qty":1}' } }
]
};
(async function () {
await client.loadScenario(scenario); // 1. register (does NOT start it yet)
await client.startLoadScenarios('checkout-load'); // 2. start (requires loadGenerationEnabled=true)
var listing = await client.loadScenarios(); // 3. list all registered scenarios
var status = await client.getLoadScenario('checkout-load'); // live status
await client.stopLoadScenarios('checkout-load'); // 4. stop (no arg stops ALL running scenarios)
await client.clearLoadScenarios(); // tidy up the registry
})();
from mockserver import (Delay, HttpRequest, LoadProfile, LoadScenario,
LoadStage, LoadStep, MockServerClient)
scenario = LoadScenario(
name="checkout-load",
template_type="VELOCITY",
max_requests=100000,
start_delay_millis=500,
labels={"team": "payments", "env": "staging"},
profile=LoadProfile(stages=[
LoadStage.rate_stage(30000, start_rate=5, end_rate=50, max_vus=50, curve="LINEAR"),
LoadStage.vu_stage(60000, vus=25),
LoadStage.pause_stage(10000),
]),
steps=[
LoadStep(name="browse",
request=HttpRequest(method="GET", path="/products/$!iteration.index"),
think_time=Delay(time_unit="MILLISECONDS", value=500)),
LoadStep(name="checkout", labels={"critical": "true"},
request=HttpRequest(method="POST", path="/cart/checkout",
body='{"item":"$!iteration.index","qty":1}')),
],
)
with MockServerClient("localhost", 1080) as client:
client.load_scenario(scenario) # 1. register (does NOT start it yet)
client.start_load_scenarios("checkout-load") # 2. start (requires loadGenerationEnabled=true)
listing = client.load_scenarios() # 3. list all registered scenarios
status = client.get_load_scenario("checkout-load") # live status
client.stop_load_scenarios("checkout-load") # 4. stop (None stops ALL running scenarios)
client.clear_load_scenarios() # tidy up the registry
require 'mockserver-client'
include MockServer
client = Client.new('localhost', 1080)
scenario = LoadScenario.new(
name: 'checkout-load',
template_type: 'VELOCITY',
max_requests: 100_000,
start_delay_millis: 500,
labels: { 'team' => 'payments', 'env' => 'staging' },
profile: LoadProfile.new(stages: [
LoadStage.rate(30_000, start_rate: 5, end_rate: 50, max_vus: 50, curve: 'LINEAR'),
LoadStage.vu(60_000, vus: 25),
LoadStage.pause(10_000)
]),
steps: [
LoadStep.new(name: 'browse',
request: HttpRequest.new(method: 'GET', path: '/products/$!iteration.index'),
think_time: Delay.new(time_unit: 'MILLISECONDS', value: 500)),
LoadStep.new(name: 'checkout', labels: { 'critical' => 'true' },
request: HttpRequest.new(method: 'POST', path: '/cart/checkout',
body: '{"item":"$!iteration.index","qty":1}'))
]
)
client.load_scenario(scenario) # 1. register (does NOT start it yet)
client.start_load_scenarios('checkout-load') # 2. start (requires loadGenerationEnabled=true)
client.load_scenarios # 3. list all registered scenarios
client.get_load_scenario('checkout-load') # live status
client.stop_load_scenarios('checkout-load') # 4. stop (nil stops ALL running scenarios)
client.clear_load_scenarios # tidy up the registry
client.close
package main
import (
mockserver "github.com/mock-server/mockserver-monorepo/mockserver-client-go"
)
func main() {
client := mockserver.New("localhost", 1080)
browse := mockserver.Request().Method("GET").Path("/products/$!iteration.index").Build()
checkout := mockserver.Request().Method("POST").Path("/cart/checkout").
Body(`{"item":"$!iteration.index","qty":1}`).Build()
// MaxVus is an optional *int field on a RATE stage.
rampStage := mockserver.RampRateStage(5, 50, 30000, mockserver.RampLinear)
maxVus := 50
rampStage.MaxVus = &maxVus
scenario := mockserver.LoadScenario{
Name: "checkout-load",
TemplateType: "VELOCITY",
MaxRequests: 100000,
StartDelayMillis: 500,
Labels: map[string]string{"team": "payments", "env": "staging"},
Profile: &mockserver.LoadProfile{
Stages: []mockserver.LoadStage{
rampStage,
mockserver.ConstantVusStage(25, 60000),
mockserver.PauseStage(10000),
},
},
Steps: []mockserver.LoadStep{
{Name: "browse", Request: &browse, ThinkTime: &mockserver.Delay{TimeUnit: "MILLISECONDS", Value: 500}},
{Name: "checkout", Request: &checkout, Labels: map[string]string{"critical": "true"}},
},
}
client.LoadScenario(scenario) // 1. register (does NOT start it yet)
client.StartLoadScenarios("checkout-load") // 2. start (requires loadGenerationEnabled=true)
client.LoadScenarios() // 3. list all registered scenarios
client.GetLoadScenario("checkout-load") // live status
client.StopLoadScenarios("checkout-load") // 4. stop (no args stops ALL running scenarios)
client.ClearLoadScenarios() // tidy up the registry
}
using System.Collections.Generic;
using MockServer.Client;
using MockServer.Client.Models;
using var client = new MockServerClient("localhost", 1080);
var scenario = new LoadScenario
{
Name = "checkout-load",
TemplateType = LoadTemplateType.VELOCITY,
MaxRequests = 100000,
StartDelayMillis = 500,
Labels = new Dictionary<string, string> { ["team"] = "payments", ["env"] = "staging" },
Profile = new LoadProfile
{
Stages =
{
new LoadStage { Type = LoadStageType.RATE, StartRate = 5, EndRate = 50,
DurationMillis = 30000, Curve = RampCurve.LINEAR, MaxVus = 50 },
LoadStage.ConstantVus(25, 60000),
LoadStage.Pause(10000)
}
},
Steps = new List<LoadStep>
{
new() { Name = "browse",
Request = HttpRequest.Request().WithMethod("GET").WithPath("/products/$!iteration.index"),
ThinkTime = new Delay { TimeUnit = TimeUnit.MILLISECONDS, Value = 500 } },
new() { Name = "checkout",
Request = HttpRequest.Request().WithMethod("POST").WithPath("/cart/checkout")
.WithBody("{\"item\":\"$!iteration.index\",\"qty\":1}"),
Labels = new Dictionary<string, string> { ["critical"] = "true" } }
}
};
await client.LoadScenarioAsync(scenario); // 1. register (does NOT start it yet)
await client.StartLoadScenariosAsync("checkout-load"); // 2. start (requires loadGenerationEnabled=true)
var listing = await client.LoadScenariosAsync(); // 3. list all registered scenarios
var status = await client.GetLoadScenarioAsync("checkout-load"); // live status
await client.StopLoadScenariosAsync("checkout-load"); // 4. stop (no args stops ALL running scenarios)
await client.ClearLoadScenariosAsync(); // tidy up the registry
use mockserver_client::{
ClientBuilder, Delay, HttpRequest, LoadProfile, LoadScenario, LoadStage, LoadStep, RampCurve,
};
let client = ClientBuilder::new("localhost", 1080).build().unwrap();
let profile = LoadProfile::of(vec![
LoadStage::rate_ramp(5.0, 50.0, 30_000, RampCurve::Linear).max_vus(50),
LoadStage::vu_hold(25, 60_000),
LoadStage::pause(10_000),
]);
let steps = vec![
LoadStep::new(HttpRequest::new().method("GET").path("/products/$!iteration.index"))
.think_time(Delay::milliseconds(500)),
LoadStep::new(HttpRequest::new().method("POST").path("/cart/checkout")
.body(r#"{"item":"$!iteration.index","qty":1}"#)),
];
let scenario = LoadScenario::new("checkout-load", profile, steps)
.template_type("VELOCITY")
.max_requests(100_000)
.start_delay_millis(500);
client.load_scenario(&scenario).unwrap(); // 1. register (does NOT start it yet)
client.start_load_scenarios(&["checkout-load"]).unwrap(); // 2. start (requires loadGenerationEnabled=true)
client.load_scenarios().unwrap(); // 3. list all registered scenarios
client.get_load_scenario("checkout-load").unwrap(); // live status
client.stop_load_scenarios(&["checkout-load"]).unwrap(); // 4. stop (&[] stops ALL running scenarios)
client.clear_load_scenarios().unwrap(); // tidy up the registry
require_once 'vendor/autoload.php';
use MockServer\Delay;
use MockServer\HttpRequest;
use MockServer\LoadProfile;
use MockServer\LoadScenario;
use MockServer\LoadStage;
use MockServer\MockServerClient;
$client = new MockServerClient('localhost', 1080);
$scenario = LoadScenario::scenario('checkout-load')
->templateType('VELOCITY')
->maxRequests(100000)
->startDelayMillis(500)
->labels(['team' => 'payments', 'env' => 'staging'])
->profile(LoadProfile::of(
LoadStage::rateRamp(5, 50, 30000, 'LINEAR')->maxVus(50),
LoadStage::vuHold(25, 60000),
LoadStage::pause(10000),
))
->addStep(
HttpRequest::request()->method('GET')->path('/products/$iteration.index'),
Delay::milliseconds(500),
'browse',
)
->addStep(
HttpRequest::request()->method('POST')->path('/cart/checkout')
->body('{"item":"$iteration.index","qty":1}'),
null,
'checkout',
['critical' => 'true'],
);
$client->loadScenario($scenario); // 1. register (does NOT start it yet)
$client->startLoadScenarios('checkout-load'); // 2. start (requires loadGenerationEnabled=true)
$client->loadScenarios(); // 3. list all registered scenarios
$client->getLoadScenario('checkout-load'); // live status
$client->stopLoadScenarios('checkout-load'); // 4. stop (null stops ALL running scenarios)
$client->clearLoadScenarios(); // tidy up the registry
# Start the server with load generation enabled:
# docker run -p 1080:1080 -e MOCKSERVER_LOAD_GENERATION_ENABLED=true mockserver/mockserver
# 1. REGISTER (does NOT run it) — PUT /mockserver/loadScenario
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "checkout-load",
"templateType": "VELOCITY",
"maxRequests": 100000,
"startDelayMillis": 500,
"labels": { "team": "payments", "env": "staging" },
"profile": { "stages": [
{ "type": "RATE", "startRate": 5, "endRate": 50, "durationMillis": 30000, "curve": "LINEAR", "maxVus": 50 },
{ "type": "VU", "vus": 25, "durationMillis": 60000 },
{ "type": "PAUSE", "durationMillis": 10000 }
] },
"steps": [
{ "name": "browse", "request": { "method": "GET", "path": "/products/$!iteration.index" },
"thinkTime": { "timeUnit": "MILLISECONDS", "value": 500 } },
{ "name": "checkout", "labels": { "critical": "true" },
"request": { "method": "POST", "path": "/cart/checkout",
"body": "{\"item\":\"$!iteration.index\",\"qty\":1}" } }
]
}'
# 2. START it (requires loadGenerationEnabled=true; else 403)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
-d '{ "name": "checkout-load" }'
# 3. LIST all registered scenarios, and read one scenario's live status
curl -s http://localhost:1080/mockserver/loadScenario
curl -s http://localhost:1080/mockserver/loadScenario/checkout-load
# 4. STOP it (stays registered, STOPPED — can be re-triggered)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/stop \
-d '{ "name": "checkout-load" }'
Ramp from 1 to 10 concurrent virtual users over 30 seconds, then hold 10 VUs for a minute. Each iteration fetches a different order derived from the global iteration index. runLoadScenario registers and starts in a single call (so it still requires loadGenerationEnabled=true).
LoadScenario scenario = LoadScenario.loadScenario("orders-ramp")
.withProfile(LoadProfile.of(
LoadStage.rampVus(1, 10, 30000, RampCurve.LINEAR),
LoadStage.constantVus(10, 60000)
))
.withSteps(
LoadStep.loadStep(request().withMethod("GET").withPath("/api/orders/$!iteration.index"))
.withThinkTime(new Delay(TimeUnit.MILLISECONDS, 20))
);
client.runLoadScenario(scenario); // register + start in one call
client.stopLoadScenarios("orders-ramp"); // stop when done
var scenario = {
name: 'orders-ramp',
profile: { stages: [
{ type: 'VU', startVus: 1, endVus: 10, durationMillis: 30000, curve: 'LINEAR' },
{ type: 'VU', vus: 10, durationMillis: 60000 }
] },
steps: [
{ request: { method: 'GET', path: '/api/orders/$!iteration.index' },
thinkTime: { timeUnit: 'MILLISECONDS', value: 20 } }
]
};
await client.runLoadScenario(scenario); // register + start in one call
await client.stopLoadScenarios('orders-ramp');
scenario = LoadScenario(
name="orders-ramp",
profile=LoadProfile(stages=[
LoadStage.vu_stage(30000, start_vus=1, end_vus=10, curve="LINEAR"),
LoadStage.vu_stage(60000, vus=10),
]),
steps=[
LoadStep(request=HttpRequest(method="GET", path="/api/orders/$!iteration.index"),
think_time=Delay(time_unit="MILLISECONDS", value=20)),
],
)
client.run_load_scenario(scenario) # register + start in one call
client.stop_load_scenarios("orders-ramp")
scenario = LoadScenario.new(
name: 'orders-ramp',
profile: LoadProfile.new(stages: [
LoadStage.vu(30_000, start_vus: 1, end_vus: 10, curve: 'LINEAR'),
LoadStage.vu(60_000, vus: 10)
]),
steps: [
LoadStep.new(request: HttpRequest.new(method: 'GET', path: '/api/orders/$!iteration.index'),
think_time: Delay.new(time_unit: 'MILLISECONDS', value: 20))
]
)
client.run_load_scenario(scenario) # register + start in one call
client.stop_load_scenarios('orders-ramp')
order := mockserver.Request().Method("GET").Path("/api/orders/$!iteration.index").Build()
scenario := mockserver.LoadScenario{
Name: "orders-ramp",
Profile: &mockserver.LoadProfile{
Stages: []mockserver.LoadStage{
mockserver.RampVusStage(1, 10, 30000, mockserver.RampLinear),
mockserver.ConstantVusStage(10, 60000),
},
},
Steps: []mockserver.LoadStep{
{Request: &order, ThinkTime: &mockserver.Delay{TimeUnit: "MILLISECONDS", Value: 20}},
},
}
client.RunLoadScenario(scenario) // register + start in one call
client.StopLoadScenarios("orders-ramp")
var scenario = new LoadScenario
{
Name = "orders-ramp",
Profile = new LoadProfile
{
Stages =
{
LoadStage.RampVus(1, 10, 30000, RampCurve.LINEAR),
LoadStage.ConstantVus(10, 60000)
}
},
Steps = new List<LoadStep>
{
new() { Request = HttpRequest.Request().WithMethod("GET").WithPath("/api/orders/$!iteration.index"),
ThinkTime = new Delay { TimeUnit = TimeUnit.MILLISECONDS, Value = 20 } }
}
};
await client.RunLoadScenarioAsync(scenario); // register + start in one call
await client.StopLoadScenariosAsync("orders-ramp");
let profile = LoadProfile::of(vec![
LoadStage::vu_ramp(1, 10, 30_000, RampCurve::Linear),
LoadStage::vu_hold(10, 60_000),
]);
let steps = vec![
LoadStep::new(HttpRequest::new().method("GET").path("/api/orders/$!iteration.index"))
.think_time(Delay::milliseconds(20)),
];
let scenario = LoadScenario::new("orders-ramp", profile, steps);
client.run_load_scenario(&scenario).unwrap(); // register + start in one call
client.stop_load_scenarios(&["orders-ramp"]).unwrap();
$scenario = LoadScenario::scenario('orders-ramp')
->profile(LoadProfile::of(
LoadStage::vuRamp(1, 10, 30000, 'LINEAR'),
LoadStage::vuHold(10, 60000),
))
->addStep(
HttpRequest::request()->method('GET')->path('/api/orders/$iteration.index'),
Delay::milliseconds(20),
);
$client->runLoadScenario($scenario); // register + start in one call
$client->stopLoadScenarios('orders-ramp');
# Register the scenario...
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "orders-ramp",
"profile": { "stages": [
{ "type": "VU", "startVus": 1, "endVus": 10, "durationMillis": 30000, "curve": "LINEAR" },
{ "type": "VU", "vus": 10, "durationMillis": 60000 }
] },
"steps": [
{ "request": { "method": "GET", "path": "/api/orders/$!iteration.index" },
"thinkTime": { "timeUnit": "MILLISECONDS", "value": 20 } }
]
}'
# ...then start it (requires loadGenerationEnabled=true)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start -d '{ "name": "orders-ramp" }'
Hold 2 VUs to warm the target up, PAUSE to let it settle, then ramp the arrival rate from 10 to 200 iterations/second and hold it. The open model starts iterations on schedule regardless of how fast the target responds — this is what exposes queue build-up and tail latency. maxVus caps the auto-scaling virtual-user pool used to meet the rate.
LoadScenario scenario = LoadScenario.loadScenario("rate-soak")
.withProfile(LoadProfile.of(
LoadStage.constantVus(2, 10000),
LoadStage.pause(5000),
LoadStage.rampRate(10, 200, 30000, RampCurve.EXPONENTIAL).withMaxVus(40),
LoadStage.constantRate(200, 60000)
))
.withSteps(
LoadStep.loadStep(request().withMethod("GET").withPath("/health"))
);
client.runLoadScenario(scenario); // register + start in one call
client.stopLoadScenarios("rate-soak");
var scenario = {
name: 'rate-soak',
profile: { stages: [
{ type: 'VU', vus: 2, durationMillis: 10000 },
{ type: 'PAUSE', durationMillis: 5000 },
{ type: 'RATE', startRate: 10, endRate: 200, durationMillis: 30000, curve: 'EXPONENTIAL', maxVus: 40 },
{ type: 'RATE', rate: 200, durationMillis: 60000 }
] },
steps: [ { request: { method: 'GET', path: '/health' } } ]
};
await client.runLoadScenario(scenario); // register + start in one call
await client.stopLoadScenarios('rate-soak');
scenario = LoadScenario(
name="rate-soak",
profile=LoadProfile(stages=[
LoadStage.vu_stage(10000, vus=2),
LoadStage.pause_stage(5000),
LoadStage.rate_stage(30000, start_rate=10, end_rate=200, max_vus=40, curve="EXPONENTIAL"),
LoadStage.rate_stage(60000, rate=200),
]),
steps=[LoadStep(request=HttpRequest(method="GET", path="/health"))],
)
client.run_load_scenario(scenario) # register + start in one call
client.stop_load_scenarios("rate-soak")
scenario = LoadScenario.new(
name: 'rate-soak',
profile: LoadProfile.new(stages: [
LoadStage.vu(10_000, vus: 2),
LoadStage.pause(5_000),
LoadStage.rate(30_000, start_rate: 10, end_rate: 200, max_vus: 40, curve: 'EXPONENTIAL'),
LoadStage.rate(60_000, rate: 200)
]),
steps: [LoadStep.new(request: HttpRequest.new(method: 'GET', path: '/health'))]
)
client.run_load_scenario(scenario) # register + start in one call
client.stop_load_scenarios('rate-soak')
health := mockserver.Request().Method("GET").Path("/health").Build()
ramp := mockserver.RampRateStage(10, 200, 30000, mockserver.RampExponential)
maxVus := 40
ramp.MaxVus = &maxVus
scenario := mockserver.LoadScenario{
Name: "rate-soak",
Profile: &mockserver.LoadProfile{
Stages: []mockserver.LoadStage{
mockserver.ConstantVusStage(2, 10000),
mockserver.PauseStage(5000),
ramp,
mockserver.ConstantRateStage(200, 60000),
},
},
Steps: []mockserver.LoadStep{ {Request: &health} },
}
client.RunLoadScenario(scenario) // register + start in one call
client.StopLoadScenarios("rate-soak")
var scenario = new LoadScenario
{
Name = "rate-soak",
Profile = new LoadProfile
{
Stages =
{
LoadStage.ConstantVus(2, 10000),
LoadStage.Pause(5000),
new LoadStage { Type = LoadStageType.RATE, StartRate = 10, EndRate = 200,
DurationMillis = 30000, Curve = RampCurve.EXPONENTIAL, MaxVus = 40 },
LoadStage.ConstantRate(200, 60000)
}
},
Steps = new List<LoadStep>
{
new() { Request = HttpRequest.Request().WithMethod("GET").WithPath("/health") }
}
};
await client.RunLoadScenarioAsync(scenario); // register + start in one call
await client.StopLoadScenariosAsync("rate-soak");
let profile = LoadProfile::of(vec![
LoadStage::vu_hold(2, 10_000),
LoadStage::pause(5_000),
LoadStage::rate_ramp(10.0, 200.0, 30_000, RampCurve::Exponential).max_vus(40),
LoadStage::rate_hold(200.0, 60_000),
]);
let steps = vec![
LoadStep::new(HttpRequest::new().method("GET").path("/health")),
];
let scenario = LoadScenario::new("rate-soak", profile, steps);
client.run_load_scenario(&scenario).unwrap(); // register + start in one call
client.stop_load_scenarios(&["rate-soak"]).unwrap();
$scenario = LoadScenario::scenario('rate-soak')
->profile(LoadProfile::of(
LoadStage::vuHold(2, 10000),
LoadStage::pause(5000),
LoadStage::rateRamp(10, 200, 30000, 'EXPONENTIAL')->maxVus(40),
LoadStage::rateHold(200, 60000),
))
->addStep(HttpRequest::request()->method('GET')->path('/health'));
$client->runLoadScenario($scenario); // register + start in one call
$client->stopLoadScenarios('rate-soak');
# Register the open-model soak...
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "rate-soak",
"profile": { "stages": [
{ "type": "VU", "vus": 2, "durationMillis": 10000 },
{ "type": "PAUSE", "durationMillis": 5000 },
{ "type": "RATE", "startRate": 10, "endRate": 200, "durationMillis": 30000, "curve": "EXPONENTIAL", "maxVus": 40 },
{ "type": "RATE", "rate": 200, "durationMillis": 60000 }
] },
"steps": [
{ "request": { "method": "GET", "path": "/health" } }
]
}'
# ...then start it (requires loadGenerationEnabled=true)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
-H "Content-Type: application/json" \
-d '{ "name": "rate-soak" }'
Safety Caps
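Load generation is bounded by configurable caps so that the feature cannot overload the server itself. A scenario or trigger that violates a validation cap is rejected with 400 Bad Request and a JSON error message. The in-flight and requests-per-second caps are enforced live at dispatch instead, where excess dispatches are throttled rather than rejected. All caps can be raised via the corresponding configuration property.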
| Control | Property | Default | Enforced at |
|---|---|---|---|
| Feature flag (triggering runs) | mockserver.loadGenerationEnabled | false | PUT /loadScenario/start returns 403 when off (loading is always allowed) |
| Maximum concurrent scenarios | mockserver.loadGenerationMaxConcurrentScenarios | 10 | Trigger validation: rejected at PUT /loadScenario/start when the trigger would push the number of active (PENDING + RUNNING) scenarios above the cap |
| Maximum virtual users | mockserver.loadGenerationMaxVirtualUsers | 50 | Scenario validation (rejected at PUT) |
| Maximum arrival rate (iterations/second) | mockserver.loadGenerationMaxRate | 5000 | Scenario validation (rejected at PUT); applies to RATE stages |
| Maximum stages per profile | mockserver.loadGenerationMaxStages | 20 | Scenario validation (rejected at PUT) |
| Maximum in-flight requests | mockserver.loadGenerationMaxInFlightRequests | 200 | Live in-flight semaphore at dispatch |
| Maximum requests per second | mockserver.loadGenerationMaxRequestsPerSecond | 500 | Live token bucket at dispatch |
| Maximum total duration | mockserver.loadGenerationMaxDurationMillis | 3 600 000 ms (1 hour) | Scenario validation: the sum of all stage durations (rejected at PUT) |
| Maximum steps per scenario | mockserver.loadGenerationMaxSteps | 50 | Scenario validation (rejected at PUT) |
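The validation caps can also be checked on the client before a scenario is sent, which gives faster feedback in CI. The following is a minimal sketch, not MockServer's own validator: `check_scenario` and `DEFAULT_CAPS` are hypothetical names, the limits are the defaults listed in the table, and treating both `vus` and `maxVus` as counting against the virtual-user cap is an assumption.

```python
DEFAULT_CAPS = {
    "max_virtual_users": 50,
    "max_rate": 5000,
    "max_stages": 20,
    "max_duration_millis": 3_600_000,
    "max_steps": 50,
}


def check_scenario(scenario, caps=DEFAULT_CAPS):
    """Return a list of cap violations; an empty list means none were found."""
    stages = scenario.get("profile", {}).get("stages", [])
    steps = scenario.get("steps", [])
    problems = []

    if len(stages) > caps["max_stages"]:
        problems.append(f"{len(stages)} stages > {caps['max_stages']}")
    if len(steps) > caps["max_steps"]:
        problems.append(f"{len(steps)} steps > {caps['max_steps']}")

    # The duration cap applies to the sum of all stage durations.
    total = sum(stage.get("durationMillis", 0) for stage in stages)
    if total > caps["max_duration_millis"]:
        problems.append(f"total duration {total} ms > {caps['max_duration_millis']} ms")

    for index, stage in enumerate(stages):
        # Assumption: both "vus" and "maxVus" count against the virtual-user cap.
        vus = max(stage.get("vus", 0), stage.get("maxVus", 0))
        if vus > caps["max_virtual_users"]:
            problems.append(f"stage {index}: {vus} VUs > {caps['max_virtual_users']}")
        # RATE stages use "rate" (hold) or "startRate"/"endRate" (ramp).
        peak = max(stage.get(key, 0) for key in ("rate", "startRate", "endRate"))
        if stage.get("type") == "RATE" and peak > caps["max_rate"]:
            problems.append(f"stage {index}: rate {peak}/s > {caps['max_rate']}/s")
    return problems
```

The live caps (in-flight requests and requests per second) are not covered here because they depend on runtime behaviour rather than on the scenario definition.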
Asserting SLOs over Load Results
Every completed load-scenario request is recorded into MockServer's SLO sample store (when sloTrackingEnabled=true), so you can drive load and then assert that latency and error-rate objectives held — all from the MockServer control plane.
# 1. Enable both features
docker run \
-e MOCKSERVER_LOAD_GENERATION_ENABLED=true \
-e MOCKSERVER_SLO_TRACKING_ENABLED=true \
mockserver/mockserver
# 2. Load (register) then trigger a load scenario (e.g. 10 VUs for 30 s)
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "slo-validation-run",
"profile": { "stages": [ { "type": "VU", "vus": 10, "durationMillis": 30000 } ] },
"steps": [
{
"request": {
"method": "GET",
"path": "/api/health",
"socketAddress": { "host": "target.svc", "port": 8080, "scheme": "HTTP" }
}
}
]
}'
curl -s -X PUT http://localhost:1080/mockserver/loadScenario/start \
-H "Content-Type: application/json" \
-d '{ "name": "slo-validation-run" }'
# 3. Wait for the scenario to complete (poll GET /loadScenario/slo-validation-run until state = "COMPLETED")
# 4. Assert SLOs held during the run
# Returns 200 (PASS) or 406 (FAIL)
curl -s -w "\n%{http_code}" -X PUT http://localhost:1080/mockserver/verifySLO \
-H "Content-Type: application/json" \
-d '{
"name": "checkout-slo",
"window": { "type": "LOOKBACK", "lookbackMillis": 60000 },
"minimumSampleCount": 50,
"upstreamHosts": ["target.svc"],
"objectives": [
{ "sli": "LATENCY_P95", "comparator": "LESS_THAN", "threshold": 200.0 },
{ "sli": "ERROR_RATE", "comparator": "LESS_THAN_OR_EQUAL", "threshold": 0.01 }
]
}'
See SLO Resilience Verdicts for the full verifySLO reference, including window types, all SLI options, and the response body schema.
Scope note: the SLO sample store records load-scenario traffic and real proxied traffic under the same FORWARD scope, keyed by target host. When asserting SLOs over a load run, use a narrow window that covers only the load period to avoid mixing in unrelated traffic.
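The wait-then-assert flow and the narrow-window advice can be combined in a small script. This is a sketch under stated assumptions: the helper names are hypothetical, the server is assumed to listen on localhost:1080, and the status response is assumed to be a JSON object with a `state` field, as the polling comment in the example suggests. The lookback window is derived from the scenario's own stage durations plus a small margin, so it covers the run and little else.

```python
import json
import time
import urllib.request

BASE = "http://localhost:1080/mockserver"  # assumed local MockServer address


def total_duration_millis(scenario):
    """Sum of all stage durations in the scenario's load profile."""
    return sum(s.get("durationMillis", 0) for s in scenario["profile"]["stages"])


def fetch_state(name):
    """Read the scenario state; the response shape ({"state": ...}) is assumed."""
    with urllib.request.urlopen(f"{BASE}/loadScenario/{name}") as resp:
        return json.load(resp).get("state")


def wait_for_completion(get_state, timeout_s=120.0, interval_s=1.0, sleep=time.sleep):
    """Poll get_state() until it returns "COMPLETED" or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_state() == "COMPLETED":
            return True
        sleep(interval_s)
    return False


def slo_request(name, host, scenario, margin_millis=5_000):
    """Build a verifySLO body whose lookback window only just covers the run."""
    return {
        "name": name,
        "window": {
            "type": "LOOKBACK",
            "lookbackMillis": total_duration_millis(scenario) + margin_millis,
        },
        "minimumSampleCount": 50,
        "upstreamHosts": [host],
        "objectives": [
            {"sli": "LATENCY_P95", "comparator": "LESS_THAN", "threshold": 200.0},
            {"sli": "ERROR_RATE", "comparator": "LESS_THAN_OR_EQUAL", "threshold": 0.01},
        ],
    }
```

A typical use would be `wait_for_completion(lambda: fetch_state("slo-validation-run"))` followed by a PUT of `slo_request(...)` to /mockserver/verifySLO. Passing the state getter and `sleep` as parameters keeps the polling loop testable without a running server.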
Observability & Metrics
Every completed load-scenario dispatch is recorded into a dedicated mock_server_load_* Prometheus metric family and mirrored over OpenTelemetry (OTLP). This lets you chart the injector's latency, throughput, and error rate alongside your system-under-test in Grafana, Datadog, or any OTEL-compatible backend — without a separate load tool.
All per-request metrics carry six fixed labels so you can slice and dice without extra queries:
| Label | Value |
|---|---|
| scenario | The scenario name |
| run_id | A UUID generated at scenario start; stable for one run, resets on each PUT. Use it to filter metrics to exactly one execution. |
| step | Step index (0-based) or the step name when set |
| route | Auto-templatized path (numeric and UUID segments become {id}, e.g. /api/orders/{id}) or the step name when set |
| method | HTTP method (GET, POST, …) |
| status_class | Response status class: 2xx, 3xx, 4xx, 5xx, or unknown |
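To make the `route` and `status_class` labels concrete, the sketch below approximates the two derivations described in the table. It is an illustration of the documented behaviour, not MockServer's implementation: the function names are hypothetical, and mapping anything outside 200 to 599 (including a missing status) to `unknown` is an assumption based on the listed values.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID form.
_UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
_NUMERIC = re.compile(r"[0-9]+")


def templatize_route(path):
    """Replace purely numeric and UUID path segments with the placeholder {id}."""
    segments = path.split("/")
    return "/".join(
        "{id}" if _NUMERIC.fullmatch(seg) or _UUID.fullmatch(seg) else seg
        for seg in segments
    )


def status_class(status):
    """Map an HTTP status code to 2xx, 3xx, 4xx, 5xx, or unknown."""
    if status is not None and 200 <= status <= 599:
        return f"{status // 100}xx"
    return "unknown"
```

Collapsing identifiers this way keeps label cardinality bounded: every order lookup shares one `route` series instead of creating a new series per order ID.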
The full metric catalogue:
| Metric | Type | Description |
|---|---|---|
| mock_server_load_request_duration_seconds | Histogram | Round-trip latency per dispatch. Query any percentile with histogram_quantile in Prometheus or the equivalent in your OTEL backend. |
| mock_server_load_requests | Counter | Completed dispatches |
| mock_server_load_request_bytes | Counter | Outbound request bytes |
| mock_server_load_response_bytes | Counter | Inbound response bytes |
| mock_server_load_iterations | Counter | Full VU iteration completions (labelled by scenario + run_id only) |
| mock_server_load_throttled | Counter | Dispatches skipped by the self-load guard. Label reason = inflight_cap or rate_limit. A rising value means the scenario could not reach its setpoint. |
| mock_server_load_errors | Counter | Failed dispatches. Label kind = render, connection, timeout, null_response, or http_5xx. |
| mock_server_load_active_vus | Gauge | Virtual users currently running |
| mock_server_load_inflight_requests | Gauge | Dispatches currently in flight |
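The two `reason` values on mock_server_load_throttled correspond to the two live caps: an in-flight limit and a requests-per-second token bucket. The sketch below shows how such a guard could count skipped dispatches. It is a simplified, single-threaded model with a hypothetical `LoadGuard` class; checking the in-flight cap before the rate cap and starting with a full bucket are both assumptions.

```python
from collections import Counter


class LoadGuard:
    """Token bucket (rate cap) combined with an in-flight counter (concurrency cap)."""

    def __init__(self, max_rps, max_in_flight, now=0.0):
        self.max_rps = max_rps
        self.max_in_flight = max_in_flight
        self.tokens = float(max_rps)  # assumption: the bucket starts full
        self.last_refill = now
        self.in_flight = 0
        self.throttled = Counter()  # reason -> number of skipped dispatches

    def try_acquire(self, now):
        """Return True if a dispatch may start at time `now` (in seconds)."""
        # Refill in proportion to elapsed time, capped at one second's worth.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(float(self.max_rps), self.tokens + elapsed * self.max_rps)
        self.last_refill = now

        if self.in_flight >= self.max_in_flight:
            self.throttled["inflight_cap"] += 1
            return False
        if self.tokens < 1.0:
            self.throttled["rate_limit"] += 1
            return False
        self.tokens -= 1.0
        self.in_flight += 1
        return True

    def release(self):
        """Mark one dispatch as finished."""
        self.in_flight = max(0, self.in_flight - 1)
```

In this model a steadily rising `inflight_cap` count points at slow responses from the target, while a rising `rate_limit` count means the profile asks for more dispatches per second than the cap allows.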
Adding custom labels
Attach domain dimensions (environment, region, team) to metric series using the labels field on the scenario or step:
# Enable load generation with a custom label allowlist (Prometheus requires this at startup)
docker run \
-e MOCKSERVER_LOAD_GENERATION_ENABLED=true \
-e MOCKSERVER_LOAD_GENERATION_METRIC_LABELS=env,region \
mockserver/mockserver
# Scenario with custom labels
curl -s -X PUT http://localhost:1080/mockserver/loadScenario \
-H "Content-Type: application/json" \
-d '{
"name": "checkout-load",
"labels": { "env": "staging", "region": "eu-west-1" },
"profile": { "stages": [ { "type": "VU", "vus": 10, "durationMillis": 30000 } ] },
"steps": [
{
"name": "get-order",
"labels": { "team": "orders" },
"request": { "method": "GET", "path": "/api/orders/$iteration.index",
"socketAddress": { "host": "orders.svc", "port": 8080 } }
}
]
}'
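Prometheus: Only label keys listed in mockserver.loadGenerationMetricLabels (set at startup) appear as Prometheus labels, because Prometheus fixes a metric's label schema at registration time. In the example above, team is not in the allowlist (env,region), so it does not appear on the Prometheus series.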
OpenTelemetry: All custom label keys are forwarded as OTEL attributes with no allowlist needed.
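The difference between the two exporters can be sketched as a simple filter over the merged labels. The function names below are hypothetical, and letting a step label override a scenario label with the same key is an assumption that the text above does not confirm.

```python
def parse_allowlist(value):
    """Parse a comma-separated allowlist such as "env,region" into a set of keys."""
    return {key.strip() for key in value.split(",") if key.strip()}


def split_labels(scenario_labels, step_labels, allowlist):
    """Return (prometheus_labels, otel_attributes) for one step.

    Assumption: step labels override scenario labels that share a key.
    """
    merged = {**scenario_labels, **step_labels}
    prometheus = {key: value for key, value in merged.items() if key in allowlist}
    return prometheus, merged
```

With the allowlist `env,region` and the scenario shown earlier, the Prometheus side keeps `env` and `region`, while the OTEL side also carries `team`.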
Exemplars / trace pivoting
When your system under test propagates W3C Trace Context, the load-scenario latency histogram (mock_server_load_request_duration_seconds) attaches the upstream trace_id from the response's traceparent header as a Prometheus exemplar. In Grafana, this lets you click a latency spike and jump directly to the trace that caused it.
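For reference, a W3C `traceparent` value has four dash-separated lowercase hexadecimal fields: version, trace ID (32 characters), parent ID (16 characters), and flags. The sketch below extracts the trace ID the way an exemplar would need it. The function name is hypothetical, and it only accepts the four-field layout of version 00, so it illustrates the header format rather than MockServer's own parser.

```python
import re

# version - trace-id - parent-id - trace-flags, all lowercase hex
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def trace_id_from_traceparent(header):
    """Return the 32-character trace ID, or None when the header is absent or invalid."""
    if not header:
        return None
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, _flags = match.groups()
    # The spec treats version ff and all-zero IDs as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id
```

Responses without a valid `traceparent` simply yield no trace ID, so the latency sample would be recorded without an exemplar.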
Example PromQL
# p95 request latency per scenario and run (add a run_id matcher to isolate one run)
histogram_quantile(0.95,
sum by (le, scenario, run_id) (
rate(mock_server_load_request_duration_seconds_bucket[1m])
)
)
# Error rate by kind
sum by (kind) (rate(mock_server_load_errors_total[1m]))
# Is the scenario being throttled?
rate(mock_server_load_throttled_total[1m])
Metrics require metricsEnabled=true. See Observability for the full Prometheus and OTEL configuration reference.
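When a pipeline needs the p95 for exactly one execution, the query can be narrowed with `scenario` and `run_id` matchers and sent to the Prometheus instant-query endpoint (/api/v1/query). The helpers below only build the query string and URL. Their names and the Prometheus address used in the usage note are assumptions, and label values are inserted without escaping, so they suit simple names and UUIDs only.

```python
from urllib.parse import urlencode


def p95_latency_query(scenario, run_id, window="1m"):
    """PromQL for p95 dispatch latency, restricted to one scenario run."""
    selector = f'scenario="{scenario}",run_id="{run_id}"'
    return (
        "histogram_quantile(0.95, sum by (le) ("
        f"rate(mock_server_load_request_duration_seconds_bucket{{{selector}}}[{window}])))"
    )


def instant_query_url(prometheus_base, promql):
    """URL for the Prometheus instant-query endpoint with the query URL-encoded."""
    query_string = urlencode({"query": promql})
    return prometheus_base.rstrip("/") + "/api/v1/query?" + query_string
```

For example, `instant_query_url("http://localhost:9090", p95_latency_query("checkout-load", run_id))` yields a URL that can be fetched with any HTTP client once the run has produced samples.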
Configuration
| Property | Environment variable | Default | Description |
|---|---|---|---|
| mockserver.loadGenerationEnabled | MOCKSERVER_LOAD_GENERATION_ENABLED | false | Master switch. Must be true for PUT /mockserver/loadScenario/start to succeed (loading a scenario is always allowed). Set at startup to avoid a restart. |
| mockserver.loadGenerationMaxVirtualUsers | MOCKSERVER_LOAD_GENERATION_MAX_VIRTUAL_USERS | 50 | Maximum concurrent virtual users allowed in a scenario. Raise for higher-concurrency load runs. |
| mockserver.loadGenerationMaxInFlightRequests | MOCKSERVER_LOAD_GENERATION_MAX_IN_FLIGHT_REQUESTS | 200 | Maximum dispatches allowed in flight simultaneously. Acts as a semaphore: dispatches that would exceed this are counted in mock_server_load_throttled with reason inflight_cap. |
| mockserver.loadGenerationMaxRequestsPerSecond | MOCKSERVER_LOAD_GENERATION_MAX_REQUESTS_PER_SECOND | 500 | Maximum dispatches per second (token bucket). Dispatches that would exceed this are counted in mock_server_load_throttled with reason rate_limit. |
| mockserver.loadGenerationMaxDurationMillis | MOCKSERVER_LOAD_GENERATION_MAX_DURATION_MILLIS | 3 600 000 (1 hour) | Maximum scenario duration in milliseconds, measured as the sum of all stage durations. Scenarios with a longer duration are rejected at PUT. |
| mockserver.loadGenerationMaxSteps | MOCKSERVER_LOAD_GENERATION_MAX_STEPS | 50 | Maximum number of steps per scenario. Scenarios with more steps are rejected at PUT. |
| mockserver.loadGenerationMetricLabels | MOCKSERVER_LOAD_GENERATION_METRIC_LABELS | "" | Comma-separated list of custom label keys to register in Prometheus (e.g. env,region,team). Must be set before the first scenario runs. OpenTelemetry always receives all custom labels regardless of this setting. |
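The rows above suggest a mechanical mapping from property name to environment variable: dots become underscores, camelCase word boundaries gain an underscore, and the result is upper-cased. The helper below is a hypothetical convenience that reproduces that pattern for the properties listed in this table. It is inferred from the table alone, so check any other property against the configuration reference.

```python
import re


def env_var_for(property_name):
    """Derive the environment variable name that matches a mockserver.* property."""
    with_underscores = property_name.replace(".", "_")
    # Insert "_" wherever a lowercase letter or digit is followed by an uppercase letter.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", with_underscores)
    return snake.upper()
```

This can be handy when generating `docker run -e ...` arguments from a properties file.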
Related Pages
- Chaos Testing & Fault Injection: inject faults alongside load to verify resilience under stress
- SLO Resilience Verdicts: assert that latency and error-rate objectives held during the load run
- Response Templates: the Velocity / Mustache template engines used for path and body rendering in steps
- Observability: Prometheus and OpenTelemetry configuration; the mock_server_load_* family feeds into the same pipeline
- Observability & Metrics: load-specific metrics, custom labels, exemplars, and PromQL examples (this page)