Chaos Testing & Fault Injection
MockServer's chaos testing feature lets you inject realistic faults into HTTP responses — both mocked responses and forwarded/proxied upstream responses — so you can verify that your application handles errors, latency, and outages gracefully.
Attach a declarative chaos profile to any expectation to control:
- Connection drop injection: drop the TCP connection without sending any response, simulating a hard network failure or connection reset
- Error injection: return an HTTP error status (e.g. 500, 503, 429) instead of the normal response, with an optional Retry-After header
- Latency injection: add artificial delay to responses to test timeout handling, deadline propagation, and slow-response UX
- Probabilistic or deterministic: control what fraction of requests are affected, from "every request" (1.0) to "10% of requests" (0.1)
- Reproducible results: set a seed to make probabilistic outcomes identical across test runs
Because chaos profiles work on both mocked and forwarded responses, MockServer can act as a chaos proxy — sit it in front of a real service (upstream, third-party API, or internal dependency) and inject faults into the responses it relays. This makes it a powerful tool for SRE resilience testing, not just unit/integration test mocking.
Chaos Profile Fields
A chaos profile is a JSON object (or HttpChaosProfile in the Java client) with the following fields. All fields are optional — omit any you don't need.
| Field | Type | Description | Valid range |
|---|---|---|---|
| errorStatus | integer | The HTTP status code to return when an error is injected (e.g. 500, 503, 429). When an error is injected the response body is a JSON object: {"error":{"type":"chaos_injected","message":"injected HTTP chaos error"}}. | 100 – 599 |
| errorProbability | number | The probability (0.0 to 1.0) that a matched request will receive the error instead of the normal response. 0.0 or omitted means errors are never injected; 1.0 means every request gets the error (deterministic). Fractional values (e.g. 0.3) inject errors on approximately 30% of requests. | 0.0 – 1.0 |
| dropConnectionProbability | number | The probability (0.0 to 1.0) that a matched request will have its TCP connection dropped without any response being sent, simulating a hard connection failure or network blip. When a connection drop fires it takes priority over error and latency injection. Uses a derived seed for independent but reproducible draws. | 0.0 – 1.0 |
| retryAfter | string | Value for the Retry-After HTTP header on injected error responses. Typically a number of seconds (e.g. "30") or an HTTP-date. Only included when an error is actually injected. | any string (max 100 chars) |
| latency | object | Artificial delay added to every matched response (both normal and error responses). Specified as a Delay object with timeUnit (e.g. MILLISECONDS, SECONDS) and value. Latency is applied in addition to any delay on the action itself and the global response delay. | valid Delay object |
| seed | integer | A fixed seed for the random number generator used by errorProbability. When set, the same seed + probability always yields the same inject/skip outcome, making tests reproducible. Note: a fixed seed produces the same decision on every request (always inject or always skip for a given probability), so it is most useful for making a known-fractional probability deterministic in a specific test. | any long integer |
| succeedFirst | integer | The first N matching requests bypass chaos (normal response). Requests 1..N succeed; chaos becomes eligible from request N+1. Combine with failRequestCount for a finite fault window. | ≥ 0 (default: omitted = 0) |
| failRequestCount | integer | After the succeedFirst window, the next M matching requests receive chaos; after succeedFirst + M matches the expectation recovers (normal responses). Omit for unlimited faults after the succeed window. | ≥ 1 (default: omitted = unlimited) |
| outageAfterMillis | integer | Time-based outage window: chaos becomes active this many milliseconds after the expectation's first matched request. Before this point requests behave normally. Combine with outageDurationMillis for a self-healing outage. | ≥ 0 (default: omitted = 0) |
| outageDurationMillis | integer | Time-based outage window: once the outage has started, chaos stays active for this many milliseconds, then the expectation self-heals and serves normal responses again. Omit for an outage that never ends. | ≥ 1 (default: omitted = unbounded) |
| truncateBodyAtFraction | number | Corrupt the response body by keeping only this leading fraction of its bytes (e.g. 0.5 keeps the first half, 0.0 empties the body). Tests how a client copes with a partial / cut-off payload. Applies to the real (non-error) response only and is skipped for streaming bodies. | 0.0 – 1.0 (default: omitted = no truncation) |
| malformedBody | boolean | Corrupt the response body by appending a broken-JSON fragment so it fails to parse. Tests client-side body-parsing resilience. Applies to the real (non-error) response only and is skipped for streaming bodies. | true / false (default: omitted = false) |
| slowResponseChunkSize | integer | Dribble the response body in chunks of this many bytes (chunked transfer-encoding). Combine with slowResponseChunkDelay to trickle the body slowly and test read timeouts. Applies to the real (non-error) response only and is skipped for streaming bodies. | ≥ 1 (default: omitted = no dribble) |
| slowResponseChunkDelay | delay | The delay between dribbled chunks (a { "timeUnit": ..., "value": ... } object). Required alongside slowResponseChunkSize for the slow response to take effect. | (default: omitted = no dribble) |
| quotaName | string | Stateful rate-limit counter key. Expectations sharing the same quotaName share one counter (model an upstream account limit). Required (with quotaLimit and quotaWindowMillis) to enable the quota. | (default: omitted = no quota) |
| quotaLimit | integer | Maximum number of requests allowed within the window before requests are rejected. | ≥ 1 (default: omitted = no quota) |
| quotaWindowMillis | integer | Fixed-window length in milliseconds. The first request starts the window; it resets after this duration elapses. | ≥ 1 (default: omitted = no quota) |
| quotaErrorStatus | integer | The HTTP status returned when the quota is exceeded. | 100 – 599 (default: 429) |
| degradationRampMillis | integer | Gradual degradation: ramp errorProbability and dropConnectionProbability linearly from 0 up to their configured values over this many milliseconds from the expectation's first match, modelling a dependency that deteriorates over time. Measured with the controllable clock. | ≥ 1 (default: omitted = no ramp) |
| graphqlErrors | boolean | Rewrite the response as a GraphQL error envelope: HTTP 200 with a JSON body of the form {"data":null,"errors":[{"message":"...","extensions":{"code":"..."}}]}, Content-Type: application/json, and Content-Length stripped. Takes precedence over truncateBodyAtFraction and malformedBody. Metered as fault_type=graphql. | true / false (default: omitted = false) |
| graphqlErrorMessage | string | The message placed in errors[0].message. Only used when graphqlErrors is true. Defaults to "simulated GraphQL error" when omitted. | any string (default: "simulated GraphQL error") |
| graphqlErrorCode | string | Optional value placed in errors[0].extensions.code (for example "INTERNAL_SERVER_ERROR" or "UNAUTHENTICATED"). The extensions object is omitted entirely when this field is not set. | any string (default: omitted = no extensions) |
| graphqlNullifyData | boolean | Controls the data field in the GraphQL error envelope. When true (the default), data is null (a full error). When false, MockServer tries to parse the original response body as JSON and embed it as the data value, simulating a partial success (data plus errors). Falls back to data: null if the original body is not valid JSON. | true / false (default: omitted = true, i.e. data:null) |
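The valid ranges in the table can be checked on the client side before a profile is sent. The following is a minimal sketch, assuming the profile is held as a plain Python dict; validate_chaos_profile is a hypothetical helper, not part of any MockServer client, and the server's own validation may be stricter or differently worded.

```python
def validate_chaos_profile(profile):
    """Return a list of range problems found in a chaos profile dict.

    Hypothetical client-side pre-check based only on the documented ranges.
    """
    problems = []

    def check(field, is_valid, rule):
        # Fields are optional, so only validate the ones that are present.
        if field in profile and not is_valid(profile[field]):
            problems.append(f"{field} must be {rule}")

    for field in ("errorStatus", "quotaErrorStatus"):
        check(field, lambda v: 100 <= v <= 599, "100-599")
    for field in ("errorProbability", "dropConnectionProbability",
                  "truncateBodyAtFraction"):
        check(field, lambda v: 0.0 <= v <= 1.0, "0.0-1.0")
    check("retryAfter", lambda v: len(v) <= 100, "at most 100 characters")
    for field in ("succeedFirst", "outageAfterMillis"):
        check(field, lambda v: v >= 0, ">= 0")
    for field in ("failRequestCount", "outageDurationMillis",
                  "slowResponseChunkSize", "quotaLimit",
                  "quotaWindowMillis", "degradationRampMillis"):
        check(field, lambda v: v >= 1, ">= 1")
    return problems


print(validate_chaos_profile({"errorStatus": 700, "errorProbability": 1.5}))
```

An empty list means every supplied field is inside its documented range.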
Outage Windows (time-based)
The outageAfterMillis and outageDurationMillis fields define a
self-healing outage window measured relative to the expectation's first matched request: chaos is active only from
outageAfterMillis until outageAfterMillis +
outageDurationMillis have elapsed, after which the service recovers automatically. This models a
dependency that is healthy for a while, degrades for a bounded period, then comes back — ideal for testing how a service
behaves across a transient downstream outage. The time window composes with the count window and the probability fields: a
fault fires only when the request falls inside the time window and the count window and the probability draw
passes.
Outage windows are measured with MockServer's controllable clock, so tests do not have to wait in real time. Freeze and
advance the clock with the PUT /mockserver/clock
control-plane endpoint to step deterministically from "before the outage" to "during the outage" to "recovered" without any
sleep calls.
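The time window described above can be modelled as a small predicate. This is a sketch of the documented behaviour rather than MockServer's implementation; outage_active is a hypothetical name, and treating the start as inclusive and the end as exclusive is an assumption.

```python
def outage_active(elapsed_millis, outage_after_millis=0, outage_duration_millis=None):
    """Is chaos active, given time elapsed since the first matched request?

    Assumes an inclusive start and an exclusive end; the exact boundary
    behaviour is not stated in the text.
    """
    if elapsed_millis < outage_after_millis:
        return False                      # still healthy, before the outage
    if outage_duration_millis is None:
        return True                       # no duration: the outage never ends
    return elapsed_millis < outage_after_millis + outage_duration_millis


# Healthy for 1 s, broken for 2 s, then recovered.
for t in (0, 1000, 2999, 3000):
    print(t, outage_active(t, outage_after_millis=1000, outage_duration_millis=2000))
```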
Where Chaos Applies
Chaos profiles apply to most expectation action types:
| Action type | Chaos supported |
|---|---|
| Mocked response (httpResponse) | Yes |
| Response template (httpResponseTemplate) | Yes |
| Response class callback | Yes |
| Response object callback | Not yet (uses its own write path) |
| Forward (httpForward) | Yes |
| Forward template | Yes |
| Forward class callback | Yes |
| Forward with override (httpOverrideForwardedRequest) | Yes |
| Forward with validation | Yes |
| Forward object callback | Not yet (uses its own write path) |
| Unmatched proxy pass-through | Not yet |
| Error (httpError) | Not applicable (already a fault action) |
Connection Drop Injection
Connection drop injection closes the TCP connection without sending any response, simulating a hard network failure or connection reset. This is the most severe fault type and takes priority over error and latency injection when multiple fault types are configured.
How it works: on each matched request, MockServer first draws against dropConnectionProbability. If the draw says "drop," the TCP connection is closed immediately with no response written. When the draw says "skip," the chaos profile falls through to error injection (if configured) and then latency injection. This ordering ensures that connection drops, errors, and latency are evaluated as independent faults with a clear priority: drop > error > latency.
Common use cases:
- Test connection timeout handling: verify that your HTTP client detects and handles a dropped connection (e.g. retries, circuit-breaker tripping)
- Test network blip resilience: inject connection drops at a fractional probability to simulate intermittent network instability
- Test proxy/load balancer failover: simulate a backend disappearing mid-request to verify failover behaviour
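The drop > error priority can be illustrated with a short model. It is a sketch under stated assumptions: resolve_fault is a hypothetical function, the random draws are passed in so the result is deterministic, and "a fault fires when the draw is below the probability" is an assumed comparison.

```python
def resolve_fault(profile, drop_draw, error_draw):
    """Return 'drop', 'error' or 'normal' for one matched request.

    drop_draw and error_draw stand in for uniform random numbers in [0, 1).
    """
    if drop_draw < profile.get("dropConnectionProbability", 0.0):
        return "drop"      # connection closed, nothing is written
    if "errorStatus" in profile and error_draw < profile.get("errorProbability", 0.0):
        return "error"     # synthetic errorStatus response
    return "normal"        # real response; chaos latency may still be added


profile = {"dropConnectionProbability": 0.1, "errorStatus": 503, "errorProbability": 0.5}
print(resolve_fault(profile, drop_draw=0.05, error_draw=0.9))  # drop wins
print(resolve_fault(profile, drop_draw=0.50, error_draw=0.3))  # falls through to error
```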
Error Injection
Error injection replaces the normal response with a synthetic HTTP error. This is the primary mechanism for simulating downstream service failures.
How it works: on each matched request, MockServer draws against errorProbability. If the draw says "inject," the configured errorStatus is returned instead of the real response. The response body is a JSON error object and, if retryAfter is set, a Retry-After header is included. When the draw says "skip," the normal response (or forwarded upstream response) is returned unchanged.
Common use cases:
- Test retry logic: inject 503 errors at 30% probability and verify your client retries with exponential backoff
- Test rate-limit handling: inject 429 errors at 100% probability with a Retry-After header and verify your client respects the wait period
- Simulate outages: inject 503 at 100% probability to verify circuit-breaker tripping and fallback behaviour
- Test error pages and user experience: inject 500 errors and verify your UI displays a useful error message
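As an illustration of the rate-limit use case, the sketch below builds an expectation body with the chaos profile as a top-level field. The path /orders and the helper name expectation_with_chaos are made up for the example; the resulting JSON would be registered through MockServer's expectation endpoint, which is not called here.

```python
import json


def expectation_with_chaos(path, chaos):
    """Build an expectation dict whose chaos profile sits at the top level."""
    return {
        "httpRequest": {"path": path},
        "httpResponse": {"statusCode": 200, "body": "ok"},
        "chaos": chaos,
    }


# Every matched request receives 429 with Retry-After: 30.
rate_limited = expectation_with_chaos(
    "/orders",  # hypothetical path
    {"errorStatus": 429, "errorProbability": 1.0, "retryAfter": "30"},
)
print(json.dumps(rate_limited, indent=2))
```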
Latency Injection
Latency injection adds artificial delay to the response without changing its content or status code. This is useful for testing how your application handles slow dependencies.
How it works: the latency delay is applied to every matched response — whether or not an error is also injected. It is added on top of any delay configured on the action itself and any global response delay (mockserver.globalResponseDelayMillis).
Common use cases:
- Test timeout handling: inject 5-second latency and verify your client times out gracefully
- Test deadline propagation: inject latency into one service in a call chain and verify deadlines propagate correctly
- Test loading states: inject 2-second latency and verify your UI shows a loading indicator
- Test slow upstream in proxy mode: forward requests to a real service with added latency to see how your application copes with a sluggish dependency
For more complex latency patterns (uniform distribution, log-normal, gaussian), use the delay field on the response action itself. See Creating Expectations and Scalability & Latency for details.
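The additive behaviour of chaos latency, action delay and global response delay can be written out as simple arithmetic. This is a sketch only: the helper names are hypothetical and the unit table covers just the two units named on this page.

```python
# Only the units mentioned in this document; other units are not modelled.
UNIT_TO_MILLIS = {"MILLISECONDS": 1, "SECONDS": 1000}


def delay_millis(delay):
    """Convert a {"timeUnit": ..., "value": ...} object to milliseconds."""
    if delay is None:
        return 0
    return delay["value"] * UNIT_TO_MILLIS[delay["timeUnit"]]


def total_delay_millis(chaos_latency, action_delay=None, global_delay_millis=0):
    """Chaos latency is added on top of the action delay and the global delay."""
    return delay_millis(chaos_latency) + delay_millis(action_delay) + global_delay_millis


print(total_delay_millis(
    {"timeUnit": "SECONDS", "value": 5},
    action_delay={"timeUnit": "MILLISECONDS", "value": 250},
    global_delay_millis=100,
))
```

Comparing the total against the client's timeout (and against maxSocketTimeout) shows in advance whether a test should expect a timeout or a slow success.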
Body Corruption
Body corruption damages the payload of an otherwise-successful response so you can test how robustly your client parses what it receives — independently of the status code. Two fields are available and can be combined:
- truncateBodyAtFraction: keeps only a leading fraction of the body bytes (for example 0.5 returns the first half of the body, 0.0 returns an empty body). Simulates a connection that delivered a partial / cut-off payload.
- malformedBody: appends a broken-JSON fragment to the body so a JSON parser fails. Simulates a corrupted or truncated-then-garbled payload.
How it works: body corruption is deterministic — it is not subject to a probability draw. It applies to the real (mocked or forwarded) response whenever the request is inside the active count window and time-based outage window. When both fields are set, the body is truncated first and the malformed fragment is then appended. To keep the response well-framed, MockServer removes any stale Content-Length header so the response encoder sets the correct length for the corrupted body, and preserves the original Content-Type.
Priority: connection-drop and error injection take precedence — when an error status is injected, its synthetic error body is returned uncorrupted. Body corruption only affects the real response that would otherwise have been returned. Streaming response bodies are not corrupted (the LLM response path has its own mid-stream truncation).
Common use cases:
- Test JSON-parsing resilience: set malformedBody: true and verify your client surfaces a clear parse error rather than crashing.
- Test partial-response handling: set truncateBodyAtFraction: 0.5 and verify your client detects the short / incomplete payload.
- Test a flaky upstream in proxy mode: attach body corruption to a forward action to see how your application copes with a dependency returning damaged payloads.
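The truncate-then-append order can be sketched as below. The fragment bytes are a placeholder (the actual fragment MockServer appends is not given here), corrupt_body is a hypothetical name, and rounding the kept length down is an assumption.

```python
import json

# Placeholder only; the real broken-JSON fragment is not documented on this page.
BROKEN_FRAGMENT = b'{"unterminated":'


def corrupt_body(body, truncate_at_fraction=None, malformed=False):
    """Truncate first, then append the malformed fragment, as described above."""
    if truncate_at_fraction is not None:
        keep = int(len(body) * truncate_at_fraction)  # assumes rounding down
        body = body[:keep]
    if malformed:
        body = body + BROKEN_FRAGMENT
    return body


damaged = corrupt_body(b'{"id": 1}', malformed=True)
try:
    json.loads(damaged)
except json.JSONDecodeError as exc:
    print("client-side parse failure:", exc.msg)
```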
GraphQL Error Injection
GraphQL APIs typically return HTTP 200 even when a request fails; error details are carried in a JSON errors array in the response body. Standard HTTP error injection (which changes the status code to 500 or 503) does not reproduce this pattern. The graphqlErrors flag rewrites the response into a spec-compliant GraphQL error envelope so your GraphQL client's error-handling logic is exercised correctly.
How it works: when graphqlErrors: true is set, MockServer replaces the response body with a JSON envelope of the form:
{"data":null,"errors":[{"message":"simulated GraphQL error","extensions":{"code":"INTERNAL_SERVER_ERROR"}}]}
The response is sent as HTTP 200 with Content-Type: application/json. Any stale Content-Length header is removed. Set graphqlNullifyData: false to embed the original response body JSON as the data value instead — this simulates a partial success where the server returns some data alongside errors (a common pattern in GraphQL APIs that use partial responses).
Priority: graphqlErrors takes precedence over truncateBodyAtFraction and malformedBody — when GraphQL injection is active, body corruption is skipped because the envelope is the intended body. The slow-response dribble (slowResponseChunkSize + slowResponseChunkDelay) still applies to trickle the envelope. Like body corruption, GraphQL injection is deterministic (no probability draw) and respects the count window (succeedFirst / failRequestCount). Error and connection-drop injection (which change the HTTP status) still take priority over everything — GraphQL injection only fires on the real, non-error response path.
Works with service-scoped chaos too: graphqlErrors can be set on a service-scoped profile (PUT /mockserver/serviceChaos), making it easy to inject GraphQL errors into all matching forwards to a GraphQL upstream without touching individual expectations.
// Simulate a full GraphQL error on all calls to a GraphQL service
PUT /mockserver/serviceChaos
{
  "host": "graphql.api.svc",
  "chaos": {
    "graphqlErrors": true,
    "graphqlErrorMessage": "upstream database unavailable",
    "graphqlErrorCode": "INTERNAL_SERVER_ERROR"
  }
}
// Simulate a partial success (data present alongside errors)
PUT /mockserver/serviceChaos
{
  "host": "graphql.api.svc",
  "chaos": {
    "graphqlErrors": true,
    "graphqlNullifyData": false,
    "graphqlErrorMessage": "partial result: one field failed",
    "graphqlErrorCode": "DOWNSTREAM_ERROR"
  }
}
Common use cases:
- Test GraphQL error handling: verify that your client reads errors[0].message and surfaces it to the user instead of silently treating the HTTP 200 as a success.
- Test per-error-code branching: set graphqlErrorCode to a code your client acts on (e.g. "UNAUTHENTICATED") and verify the client triggers a re-authentication flow.
- Test partial-success handling: set graphqlNullifyData: false and verify your client renders the partial data while also surfacing the error.
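To make the envelope rules concrete, here is a sketch that derives the expected body from the four graphql* fields. It mirrors the documented behaviour (default message, optional extensions, fallback to data: null) but is not MockServer code; graphql_error_envelope is a hypothetical helper that could be used to compute expected values in a client test.

```python
import json


def graphql_error_envelope(original_body, message="simulated GraphQL error",
                           code=None, nullify_data=True):
    """Build the envelope a client should expect for a given original body."""
    data = None
    if not nullify_data:
        try:
            data = json.loads(original_body)   # partial success: keep the data
        except (TypeError, ValueError):
            data = None                        # not valid JSON: fall back to null
    error = {"message": message}
    if code is not None:
        error["extensions"] = {"code": code}   # omitted entirely when no code is set
    return {"data": data, "errors": [error]}


print(json.dumps(graphql_error_envelope(
    '{"user": {"id": 1}}', message="partial result: one field failed",
    code="DOWNSTREAM_ERROR", nullify_data=False)))
```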
Slow (Dribbled) Response
A slow response trickles the response body to the client in small chunks with a delay between each, instead of sending it all at once. This is useful for testing read timeouts and slow-network behaviour — distinct from latency (which delays the whole response by a fixed amount before sending it).
How it works: set slowResponseChunkSize (bytes per chunk) and slowResponseChunkDelay (the delay between chunks). MockServer then sends the body using chunked transfer-encoding, writing one chunk at a time with the configured delay between them. Both fields are required — a chunk size with no delay simply chunks the body without slowing it down. Like body corruption, the slow response is deterministic, applies to the real (mocked or forwarded) response within the active count and outage windows, and is skipped for streaming bodies. It is metered as fault_type=slow.
Common use cases:
- Test read timeouts: dribble a body in 1-byte chunks with a 1-second delay and verify your client's socket read timeout fires.
- Test slow-network handling: trickle a large payload to verify progress indicators and partial-read handling.
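When choosing a chunk size and delay, it helps to estimate how long the body will take to arrive. The sketch below splits a body into chunks and sums the waits; dribble_plan is a hypothetical helper, and it assumes the delay is applied only between chunks (so n chunks incur n - 1 delays).

```python
def dribble_plan(body, chunk_size, chunk_delay_millis):
    """Return (chunks, total_wait_millis) for a dribbled response body."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    chunks = [body[i:i + chunk_size] for i in range(0, len(body), chunk_size)]
    # Assumption: one delay between each pair of consecutive chunks.
    total_wait = max(len(chunks) - 1, 0) * chunk_delay_millis
    return chunks, total_wait


chunks, wait = dribble_plan(b"hello", chunk_size=2, chunk_delay_millis=1000)
print(chunks, wait)
```

If the estimated wait is shorter than the client's read timeout, the timeout under test will not fire.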
Stateful Request Quota (rate limit)
The request quota is a deterministic, stateful fixed-window rate limit — the counterpart to the probabilistic 429 (which fires randomly per request). It lets you drive an application into a hard rate limit: "the 5th call within 60 seconds gets 429."
How it works: set quotaName, quotaLimit and quotaWindowMillis. MockServer counts matched requests against the named quota; once the count exceeds quotaLimit within the current window, further requests are rejected with quotaErrorStatus (default 429) and the retryAfter header. The window is fixed: the first request starts it and it resets quotaWindowMillis later. Expectations that share a quotaName share one counter, so you can model a single upstream account limit spread across several mocks. The counter is process-wide and is cleared on server reset.
The quota gate takes priority over the probabilistic error and the body/slow faults (it is evaluated right after the connection-drop check), so a rate-limited request always returns the quota status. A misconfigured quota (any of the three fields missing) is ignored.
When the quota is combined with a count window (succeedFirst / failRequestCount), only requests inside that window count against the quota — requests in the "succeed" or "recovered" phases are not counted. Most setups use the quota on its own, where every matched request counts.
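The fixed-window counter can be modelled in a few lines. This is a sketch of the documented semantics, not MockServer's implementation: the class and function names are hypothetical, time is passed in explicitly, and it assumes the first request arriving after the window has elapsed starts a fresh window.

```python
class FixedWindowQuota:
    """Fixed-window counter: the first request starts the window."""

    def __init__(self, limit, window_millis):
        self.limit = limit
        self.window_millis = window_millis
        self.window_start = None
        self.count = 0

    def allow(self, now_millis):
        expired = (self.window_start is not None
                   and now_millis - self.window_start >= self.window_millis)
        if self.window_start is None or expired:
            self.window_start = now_millis   # assumed: next request opens a new window
            self.count = 0
        self.count += 1
        return self.count <= self.limit      # rejected once the count exceeds the limit


quotas = {}  # shared by name, like expectations sharing a quotaName


def check_quota(name, limit, window_millis, now_millis):
    quota = quotas.setdefault(name, FixedWindowQuota(limit, window_millis))
    return quota.allow(now_millis)


# With a limit of 4, the 5th call inside the 60-second window is rejected.
print([check_quota("account-a", 4, 60000, t) for t in (0, 10, 20, 30, 40, 60000)])
```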
Service-scoped Chaos
Instead of attaching a chaos block to every forwarding expectation, you can register one chaos profile for an entire upstream host and have it applied to all matched forwards to that host — the ergonomic "break service X" control. This is useful when running MockServer as a chaos proxy in front of one or more upstreams.
Register, read and clear service-scoped chaos through a control-plane endpoint (protected by control-plane authentication when configured):
// register a profile for a host (replaces any existing one for that host)
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.3, "latency": { "timeUnit": "MILLISECONDS", "value": 500 } } }
// one call = "break payments.svc for 5 minutes, then auto-heal" (time-boxed chaos)
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.3 }, "ttlMillis": 300000 }
// remove the profile for a host
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "remove": true }
// clear all service-scoped chaos
PUT /mockserver/serviceChaos
{ "clear": true }
// read the current host -> profile registrations
GET /mockserver/serviceChaos
How it works: on a matched forward expectation that has no chaos of its own, MockServer looks up the profile registered for the request's Host header (matched case-insensitively, ignoring any :port) and applies it to the forwarded response. An expectation that defines its own chaos always takes precedence. The anonymous, unmatched proxy fall-through path is not affected, and the registrations are cleared on server reset. Because a service-scoped profile has no single owning expectation, the per-expectation count window, outage window and degradation ramp are not anchored — service-scoped profiles are best used for the steady-state faults (errors, drops, latency, body corruption, slow response, and the host-independent quota).
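The lookup and precedence rules can be sketched as follows. The function names are hypothetical and the registry is a plain dict keyed by lower-case host; IPv6 literals in the Host header are not handled in this simplified version.

```python
def normalise_host(host_header):
    """Lower-case the Host header and drop any :port suffix.

    Simplification: bracketed IPv6 literals are not handled.
    """
    return host_header.strip().lower().split(":", 1)[0]


def select_chaos(expectation_chaos, host_header, registry):
    """An expectation's own chaos wins; otherwise use the host's registration."""
    if expectation_chaos is not None:
        return expectation_chaos
    return registry.get(normalise_host(host_header))


registry = {"payments.svc": {"errorStatus": 503, "errorProbability": 0.3}}
print(select_chaos(None, "Payments.SVC:8443", registry))
```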
Time-to-live (auto-revert): add an optional ttlMillis to a registration and the chaos automatically reverts that many milliseconds later — a "dead-man's switch" so the fault self-heals even if the matching clear is never sent (for example, an external chaos orchestrator crashes mid-experiment). It is the time-boxed one-shot form: a single call breaks a host for a bounded window. Expiry is measured with the controllable clock, so it tracks real time by default but is deterministic under PUT /mockserver/clock freeze/advance. Without ttlMillis a registration persists until explicitly cleared or the server is reset. GET /mockserver/serviceChaos reports a ttlRemainingMillis map alongside the active profiles, so you can see the countdown for each time-boxed registration.
Besides the REST endpoint, convenience wrappers are available in the client libraries — Java (mockServerClient.setServiceChaos(host, chaos) / removeServiceChaos(host) / clearServiceChaos() / serviceChaosStatus()), Node (setServiceChaos / removeServiceChaos / clearServiceChaos / serviceChaosStatus), Python (set_service_chaos / remove_service_chaos / clear_service_chaos / service_chaos_status), and Ruby (the same snake-case names) — and via the manage_service_chaos MCP tool (action of register / remove / clear) for AI assistants.
The dashboard UI also has a Chaos tab for managing service-scoped chaos interactively: register a host with an error status / error probability / drop probability / latency (and an optional TTL), see every active registration with a summary of its faults, watch the live TTL auto-revert countdown, and remove a single host or clear them all.
Because the control plane is a single HTTP call with a built-in TTL safety timer, it is also the integration point for external chaos orchestrators — register the fault at the start of an experiment, verify the application copes, and let the TTL revert it even if the orchestrator never sends the clear. See Driving MockServer from Chaos Orchestrators for Chaos Toolkit, AWS FIS, Azure Chaos Studio and LitmusChaos recipes.
Live Tuning of Service Chaos (PATCH)
Once a service-scoped chaos profile is registered, you can update individual fields without replacing the whole profile and without restarting MockServer. Use PATCH /mockserver/serviceChaos to apply JSON Merge Patch semantics: only non-null fields in the request body are updated; all other fields and the current TTL are preserved.
This is useful when you want to adjust a live experiment's fault rates mid-run — for example, ramp error probability up or down while the application is running — without having to recreate the profile or lose the TTL countdown on an active time-boxed registration.
// Increase the error probability on an already-active profile without touching latency or TTL
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorProbability": 0.9 } }
// Switch from error injection to connection-drop injection mid-experiment
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorProbability": 0.0, "dropConnectionProbability": 0.5 } }
// Add latency to an existing error-injection profile
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "latency": { "timeUnit": "MILLISECONDS", "value": 2000 } } }
Semantics: the PATCH request requires a host field and a chaos object containing at least one field to update. Only the fields you supply are changed — unspecified fields in the existing profile are left unchanged. If no profile exists yet for the host, the partial is registered as a new profile with no TTL. The TTL of an existing timed registration is always preserved; PATCH cannot change or extend it. To replace a profile entirely, use PUT /mockserver/serviceChaos.
The response body echoes the host and the resulting merged chaos profile:
{ "status": "patched", "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.9, "latency": { ... } } }
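The merge rule ("only non-null fields are updated") can be used to predict the profile that a PATCH should produce. The sketch below is a shallow merge; merge_chaos_patch is a hypothetical helper, and replacing a nested object such as latency wholesale (rather than merging inside it) is an assumption.

```python
def merge_chaos_patch(existing, partial):
    """Predict the merged profile after a PATCH.

    existing may be None (no profile registered yet for the host).
    """
    merged = dict(existing or {})
    for field, value in partial.items():
        if value is not None:          # null fields leave the current value alone
            merged[field] = value      # assumed: nested objects are replaced whole
    return merged


current = {"errorStatus": 503, "errorProbability": 0.3,
           "latency": {"timeUnit": "MILLISECONDS", "value": 500}}
print(merge_chaos_patch(current, {"errorProbability": 0.9}))
```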
Gradual Degradation
Gradual degradation models a dependency that deteriorates over time rather than failing all at once — useful for testing alerting thresholds and SLO error-budget burn.
How it works: set degradationRampMillis. The probabilistic fault rates — errorProbability and dropConnectionProbability — are scaled by a factor that climbs linearly from 0.0 at the expectation's first matched request to 1.0 once the ramp duration has elapsed, then stays at full strength. So an expectation with errorProbability: 1.0 and degradationRampMillis: 600000 injects no errors at first, ~50% of requests at the 5-minute mark, and 100% after 10 minutes. The ramp is measured with MockServer's controllable clock, so it is deterministic under clock freeze/advance (PUT /mockserver/clock) with no real-time waiting. Only the probabilistic rates ramp — the deterministic faults (latency, body corruption, slow response, quota) are unaffected.
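The linear ramp amounts to multiplying the configured probability by elapsed / ramp, capped at 1.0. A minimal sketch, with hypothetical function names, that reproduces the worked numbers above:

```python
def ramp_factor(elapsed_millis, ramp_millis=None):
    """Scale factor that climbs linearly from 0.0 to 1.0 over the ramp."""
    if ramp_millis is None:
        return 1.0                                   # no ramp configured
    return min(max(elapsed_millis / ramp_millis, 0.0), 1.0)


def effective_probability(configured, elapsed_millis, ramp_millis=None):
    return configured * ramp_factor(elapsed_millis, ramp_millis)


# errorProbability 1.0 with a 10-minute ramp.
for minutes in (0, 5, 10, 15):
    print(minutes, effective_probability(1.0, minutes * 60_000, 600_000))
```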
Reproducibility with Seed
When errorProbability is fractional (not 0.0 or 1.0), the inject/skip decision is random by default. Set the seed field to a fixed value to make this decision deterministic and reproducible across test runs.
Determinism rules:
- errorProbability of 0.0 (or omitted): errors are never injected, regardless of seed
- errorProbability of 1.0: errors are always injected, regardless of seed
- Fractional values: the decision depends on a single random draw; with a fixed seed, the same draw is made every time (same result on every request)
Note: a fixed seed with a fractional probability yields the same decision on every request (always inject or always skip), because the seed resets the random state for each evaluation. This is by design — it ensures test reproducibility. For probabilistic variation within a single test run, omit the seed.
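The effect of re-seeding on every evaluation can be reproduced with any seeded generator. The sketch below uses Python's random module purely as an illustration; it is not the generator MockServer uses, so the actual inject/skip outcome for a given seed will differ.

```python
import random


def inject_error(error_probability, seed=None):
    """One inject/skip decision; a fixed seed resets the generator each time."""
    if error_probability <= 0.0:
        return False                       # never inject, regardless of seed
    if error_probability >= 1.0:
        return True                        # always inject, regardless of seed
    rng = random.Random(seed) if seed is not None else random.Random()
    return rng.random() < error_probability


# With a fixed seed every request gets the same decision.
decisions = {inject_error(0.3, seed=42) for _ in range(100)}
print(len(decisions))
```

Without a seed the decisions vary from request to request, which is what a soak test with a fractional probability usually wants.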
Chaos Proxy: Injecting Faults into Real Services
One of the most powerful uses of chaos profiles is on forwarded/proxied responses. Instead of mocking a service, you can forward requests to the real upstream and inject faults into the responses MockServer relays back to the caller.
This turns MockServer into a chaos proxy — sit it between your application and a dependency (internal service or external API) and test what happens when that dependency becomes unreliable.
Deployment patterns:
- Forward proxy: configure your application's HTTP_PROXY / HTTPS_PROXY environment variables to point at MockServer, then create forward expectations with chaos profiles for specific hosts/paths
- Reverse proxy: put MockServer in front of a specific upstream service (using MockServer's proxyRemoteHost / proxyRemotePort configuration) and add chaos profiles to inject faults
- Kubernetes sidecar or egress proxy: deploy MockServer as a sidecar container or egress proxy in your K8s pod/namespace and route specific traffic through it with chaos injection enabled
MockServer operates at the HTTP layer (L7) and requires either explicit routing or transparent interception (Linux iptables REDIRECT / TPROXY) to place it in-path — see Service Mesh / Sidecar Mode. See Isolating Single Service for detailed proxy deployment patterns.
Stateful / Count-Based Faults
The succeedFirst and failRequestCount fields let you define a window of request numbers where chaos is active. This enables deterministic, count-based fault patterns without writing custom test logic.
How the window works: MockServer tracks the 1-based match count for each expectation. For every matched request, the chaos profile checks whether the current match count falls within the eligible window:
- Requests 1 .. succeedFirst: normal response (chaos is suppressed)
- Requests (succeedFirst + 1) .. (succeedFirst + failRequestCount): chaos is eligible (subject to errorProbability)
- Requests beyond succeedFirst + failRequestCount: normal response (recovery)
When both fields are omitted, every request is eligible for chaos (backward compatible with the original probabilistic-only behaviour).
Canonical patterns:
- Fail first N then recover (retry/backoff testing): succeedFirst: 0, failRequestCount: N. The first N requests receive the chaos error; subsequent requests succeed normally.
- Succeed first N then fail (delayed failure testing): succeedFirst: N, failRequestCount omitted. Requests 1..N succeed; every request after N receives the chaos error.
- Fail only the Nth request (single-fault testing): succeedFirst: N-1, failRequestCount: 1. Only request N gets the error; requests before and after succeed normally.
The count window composes with errorProbability: a request must be within the window and pass the probability check to receive a fault. Latency injection follows the same window — chaos latency is only applied to requests within the eligible window.
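The window rule and the three canonical patterns can be checked with a small predicate. This is a sketch of the documented rule with a hypothetical function name; match_count is the 1-based match count for the expectation.

```python
def chaos_eligible(match_count, succeed_first=0, fail_request_count=None):
    """Is the request with this 1-based match count inside the chaos window?"""
    if match_count <= succeed_first:
        return False                               # still in the succeed phase
    if fail_request_count is None:
        return True                                # unlimited faults after the succeed phase
    return match_count <= succeed_first + fail_request_count


requests = range(1, 6)
print([chaos_eligible(n, 0, 2) for n in requests])     # fail first 2, then recover
print([chaos_eligible(n, 2) for n in requests])        # succeed first 2, then fail
print([chaos_eligible(n, 2, 1) for n in requests])     # fail only the 3rd request
```

Eligibility is only the first gate: an eligible request still has to pass the errorProbability draw before a fault is injected.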
Interaction with Other Features
- percentage (match probability): the expectation's percentage field controls how often a request matches the expectation; errorProbability controls how often a matched request gets an error. These compose multiplicatively: an expectation with percentage: 50 and errorProbability: 0.5 injects errors on roughly 25% of structurally matching requests.
- Global response delay: mockserver.globalResponseDelayMillis is added on top of any chaos latency.
- Action delay: a delay on the response or forward action itself is combined with chaos latency.
- maxSocketTimeout: latency that exceeds maxSocketTimeout will be cut off by the socket timeout.
Observability
When metrics are enabled, MockServer exposes Prometheus metrics for chaos: a counter that tracks every fault injected, and a gauge for the number of hosts with currently-active service-scoped chaos:
| Metric | Type | Labels | Description |
|---|---|---|---|
| mock_server_http_chaos_injected_total | Counter | fault_type = drop, error, latency, truncate, malformed, slow, quota or graphql | Cumulative count of HTTP chaos faults injected, split by fault type |
| mock_server_active_service_chaos | Gauge | fault_type = drop, error, latency, truncate, malformed, slow, quota or graphql | Number of currently-active service-scoped chaos profiles configured with each fault type (a profile with several faults counts under each); drops to 0 as profiles are cleared or their TTLs lapse |
Both metrics are also surfaced in the dashboard UI Metrics view as an "HTTP Chaos Faults" section — a stat per fault type the server emits (drop, error, latency, truncate, malformed, slow, quota, graphql), a per-fault-type chart of cumulative injections, and a per-fault-type chart of the active service-scoped chaos gauge (visible only when a chaos metric has non-zero data).
Both metrics are also mirrored over OpenTelemetry OTLP when OTLP metrics export is enabled, so OTLP-only consumers can observe them without a Prometheus scrape. Note: over OTLP the mock_server_active_service_chaos gauge mirrors every fault type, but the mock_server_http_chaos_injected_total counter currently mirrors only the drop, error and latency fault types — the full set of fault types is available on the Prometheus endpoint.
These metrics currently cover HTTP chaos only; TCP-layer chaos and gRPC chaos are not yet metered, so faults injected by those features do not appear in the chaos counter or gauge.
Example PromQL queries:
# Rate of error faults injected over the last 5 minutes
rate(mock_server_http_chaos_injected_total{fault_type="error"}[5m])
# Total latency faults injected
mock_server_http_chaos_injected_total{fault_type="latency"}
# Alert while any service-scoped chaos is still live (across all fault types)
sum(mock_server_active_service_chaos) > 0
# Active service-scoped chaos injecting connection drops
mock_server_active_service_chaos{fault_type="drop"}
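In a test that has no Prometheus server available, the same counters can be read straight from the scraped text. The sketch below parses Prometheus text exposition format using only the standard library; the SAMPLE payload and its values are made up for illustration, and chaos_samples is a hypothetical helper name.

```python
import re

# Made-up scrape output in Prometheus text exposition format (illustrative values only).
SAMPLE = """\
# TYPE mock_server_http_chaos_injected_total counter
mock_server_http_chaos_injected_total{fault_type="error"} 12.0
mock_server_http_chaos_injected_total{fault_type="latency"} 30.0
# TYPE mock_server_active_service_chaos gauge
mock_server_active_service_chaos{fault_type="drop"} 1.0
"""

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_PAIR = re.compile(r'(\w+)="([^"]*)"')


def chaos_samples(text: str, metric: str) -> dict:
    """Map fault_type -> value for every sample of the given metric name."""
    samples = {}
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line)   # comment lines start with '#' and never match
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_PAIR.findall(match.group("labels")))
        samples[labels.get("fault_type", "")] = float(match.group("value"))
    return samples


injected = chaos_samples(SAMPLE, "mock_server_http_chaos_injected_total")
print(injected)  # {'error': 12.0, 'latency': 30.0}
```

A test can take one reading before and one after exercising the client, then assert on the difference for a single fault_type.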
Dashboard UI
Chaos profiles can be configured directly from the MockServer dashboard when composing a standard HTTP expectation. Toggle the Inject fault / chaos switch in the expectation composer to reveal fields for the chaos profile. The dashboard generates the correct JSON payload (with chaos as a top-level expectation field) and the equivalent Java client and curl snippets. Active expectations that have a chaos profile display a Chaos summary chip in the expectations panel.
gRPC Fault Injection
MockServer can inject gRPC-level faults — error status codes, latency, and rate-limit exhaustion — on matched gRPC RPC calls. This is distinct from the gRPC health-check chaos (which controls the grpc.health.v1.Health/Check serving-status response): gRPC fault injection fires before normal gRPC request conversion in GrpcToHttpRequestHandler and applies to any RPC method on any loaded service, not just health probes.
Register one chaos profile per gRPC service name. The profile is applied to every matched call to that service. An empty string ("") registers a default profile that applies to all services without a more-specific override.
Why use this: gRPC clients have their own retry, deadline, and circuit-breaker logic that is separate from HTTP clients. Injecting UNAVAILABLE, DEADLINE_EXCEEDED, or RESOURCE_EXHAUSTED statuses at the RPC layer tests that gRPC-native error handling, backoff, and deadline propagation work correctly — without having to modify the real service.
The gRPC chaos profile covers: status injection, latency, request quota, count windows, trailer manipulation (omitGrpcStatus, corruptGrpcStatus, customTrailers), and client-streaming abort (abortAfterMessages). Drop or truncation of individual stream messages mid-stream is planned for a future release.
gRPC Chaos Profile Fields
A gRPC chaos profile is a JSON object (or GrpcChaosProfile in the Java model) with the following fields. All fields are optional — omit any you don't need.
| Field | Type | Description | Valid values / range |
|---|---|---|---|
| errorStatusCode | string | The gRPC status code name to return when a fault fires. When set, the handler writes an HTTP 200 response with the corresponding grpc-status trailer. Must be one of the 17 canonical gRPC status code names. | See table below |
| errorMessage | string | The grpc-message trailer value sent alongside the error status code. Optional; omit to send no message. | any string |
| errorProbability | number | Probability (0.0 to 1.0) that a matched call receives the error instead of normal processing. 0.0 or omitted means no error injection; 1.0 means every call gets the error. | 0.0 – 1.0 |
| seed | integer | Fixed seed for the random number generator used by errorProbability. When set, the same seed + probability always produces the same inject/skip outcome, making tests reproducible. | any long integer |
| latencyMs | integer | Artificial delay in milliseconds added before the response (whether an error is injected or not). | ≥ 0 |
| succeedFirst | integer | The first N calls to this service bypass fault injection (succeed normally). Combine with failRequestCount for a finite fault window. | ≥ 0 (default: omitted = 0) |
| failRequestCount | integer | After the succeedFirst window, the next M calls receive the fault; after that the service recovers. Omit for unlimited faults after the succeed window. | ≥ 1 (default: omitted = unlimited) |
| quotaName | string | Stateful rate-limit counter key. Required (with quotaLimit and quotaWindowMillis) to enable quota enforcement. Calls over the limit return RESOURCE_EXHAUSTED. | (default: omitted = no quota) |
| quotaLimit | integer | Maximum number of calls allowed within the window before quota is exceeded. | ≥ 1 (default: omitted = no quota) |
| quotaWindowMillis | integer | Fixed-window length in milliseconds. The first call starts the window; it resets after this duration elapses. Calls beyond quotaLimit within the window return RESOURCE_EXHAUSTED. | ≥ 1 (default: omitted = no quota) |
| omitGrpcStatus | boolean | When true, the fault response is sent with no grpc-status trailer at all. This simulates an incomplete or broken RPC stream: the server starts a response but never terminates it correctly, causing gRPC clients to raise a stream-reset or missing-trailer error. Takes precedence over corruptGrpcStatus. | true / false (default: omitted = false) |
| corruptGrpcStatus | boolean | When true (and omitGrpcStatus is false), the fault response sets grpc-status to the non-numeric value malformed, which violates the gRPC spec (grpc-status must be a decimal integer). Tests how clients cope with an unparseable status trailer, a genuine protocol violation rather than merely an unrecognised numeric code. | true / false (default: omitted = false) |
| customTrailers | object | A JSON object of arbitrary trailer key/value pairs injected on the fault response. These are written in addition to (or instead of, when omitGrpcStatus is set) the normal status trailers. Useful for injecting vendor-specific error metadata that your client reads (e.g. x-ratelimit-remaining: 0). | any string-to-string map (default: omitted = no custom trailers) |
| abortAfterMessages | integer | For client-streaming RPCs: when the number of decoded gRPC messages in the request body reaches this threshold, MockServer immediately responds with ABORTED (or the profile's errorMessage if set). The message count is determined by decoding the 5-byte gRPC length-prefixed frames in the request body. Use this to test how a streaming client handles mid-stream server abort, for example a server that rejects an oversized batch. | ≥ 1 (default: omitted = no abort) |
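The message count used by abortAfterMessages comes from gRPC's length-prefixed framing: each message is preceded by a 1-byte compressed flag and a 4-byte big-endian length. The sketch below counts frames in a request body in that format. It illustrates the framing only and is not MockServer's decoder; count_grpc_messages, frame and reaches_abort_threshold are hypothetical names.

```python
import struct


def count_grpc_messages(body: bytes) -> int:
    """Count length-prefixed gRPC messages in a request body."""
    count = 0
    offset = 0
    while offset < len(body):
        if len(body) - offset < 5:
            raise ValueError("truncated gRPC frame header")
        # 1-byte compressed flag followed by a 4-byte big-endian payload length
        _compressed, length = struct.unpack_from(">BI", body, offset)
        offset += 5 + length
        if offset > len(body):
            raise ValueError("truncated gRPC message payload")
        count += 1
    return count


def frame(payload: bytes) -> bytes:
    """Wrap a payload in an uncompressed gRPC length-prefixed frame."""
    return struct.pack(">BI", 0, len(payload)) + payload


def reaches_abort_threshold(body: bytes, abort_after_messages: int) -> bool:
    """True when the body carries at least abort_after_messages messages."""
    return count_grpc_messages(body) >= abort_after_messages


body = frame(b"a") + frame(b"bc") + frame(b"")
print(count_grpc_messages(body))  # 3
```

A helper like frame is also handy for building client-streaming test payloads of a known message count.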
Supported gRPC Status Codes
The errorStatusCode field accepts any of the 17 canonical gRPC status code names. The most commonly injected for resilience testing are:
| Code | Name | Typical test scenario |
|---|---|---|
| 4 | DEADLINE_EXCEEDED | Client deadline propagation, timeout handling |
| 8 | RESOURCE_EXHAUSTED | Rate-limit / quota exhaustion handling (also the quota fault code) |
| 13 | INTERNAL | Unexpected server-side failure handling |
| 14 | UNAVAILABLE | Service outage, retry and circuit-breaker testing |
| 16 | UNAUTHENTICATED | Auth-token expiry and re-authentication flows |
| 10 | ABORTED | Optimistic concurrency, transaction conflict handling |
All 17 canonical codes are accepted: OK, CANCELLED, UNKNOWN, INVALID_ARGUMENT, DEADLINE_EXCEEDED, NOT_FOUND, ALREADY_EXISTS, PERMISSION_DENIED, RESOURCE_EXHAUSTED, FAILED_PRECONDITION, ABORTED, OUT_OF_RANGE, UNIMPLEMENTED, INTERNAL, UNAVAILABLE, DATA_LOSS, UNAUTHENTICATED.
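Before sending a profile to the server, a test helper can check it against the names and ranges documented above. The following is a minimal client-side sketch; validate_grpc_chaos is a hypothetical function, and how MockServer itself reports invalid profiles is not covered here.

```python
# Canonical gRPC status names in numeric order (OK = 0 ... UNAUTHENTICATED = 16).
GRPC_STATUS_CODES = {
    name: number
    for number, name in enumerate([
        "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
        "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
        "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
        "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED",
    ])
}

# Minimum values taken from the "Valid values / range" column of the field table.
MINIMUMS = {
    "latencyMs": 0, "succeedFirst": 0, "failRequestCount": 1,
    "quotaLimit": 1, "quotaWindowMillis": 1, "abortAfterMessages": 1,
}
QUOTA_FIELDS = ("quotaName", "quotaLimit", "quotaWindowMillis")


def validate_grpc_chaos(profile: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    code = profile.get("errorStatusCode")
    if code is not None and code not in GRPC_STATUS_CODES:
        problems.append(f"errorStatusCode: unknown status name {code!r}")
    probability = profile.get("errorProbability")
    if probability is not None and not 0.0 <= probability <= 1.0:
        problems.append("errorProbability: must be between 0.0 and 1.0")
    for field, minimum in MINIMUMS.items():
        value = profile.get(field)
        if value is not None and value < minimum:
            problems.append(f"{field}: must be >= {minimum}")
    present = [field for field in QUOTA_FIELDS if field in profile]
    if present and len(present) != len(QUOTA_FIELDS):
        problems.append("quota: quotaName, quotaLimit and quotaWindowMillis must be set together")
    return problems


print(validate_grpc_chaos({"errorStatusCode": "UNAVAILABLE", "errorProbability": 0.3}))  # []
```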
gRPC Chaos REST API
// register a fault profile for a specific gRPC service (replaces any existing one)
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorStatusCode": "UNAVAILABLE", "errorProbability": 0.3, "latencyMs": 200 } }
// register with a TTL (auto-revert after 5 minutes)
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorStatusCode": "UNAVAILABLE", "errorProbability": 1.0 }, "ttlMillis": 300000 }
// register a default profile that applies to ALL gRPC services (empty string key)
PUT /mockserver/grpcChaos
{ "service": "", "chaos": { "errorStatusCode": "DEADLINE_EXCEEDED", "errorProbability": 0.2 } }
// remove the profile for a service
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "remove": true }
// clear all gRPC chaos profiles
PUT /mockserver/grpcChaos
{ "clear": true }
// read all active gRPC chaos profiles and TTL countdowns
GET /mockserver/grpcChaos
// merge-patch — update only the specified fields, preserve the rest and the TTL
PATCH /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorProbability": 0.9 } }
// simulate a server that omits the grpc-status trailer (broken stream)
PUT /mockserver/grpcChaos
{ "service": "com.example.orders.OrderService", "chaos": { "omitGrpcStatus": true } }
// simulate a server that sends a non-numeric grpc-status (genuine protocol violation)
PUT /mockserver/grpcChaos
{ "service": "com.example.orders.OrderService", "chaos": { "corruptGrpcStatus": true } }
// inject custom trailers alongside the error (e.g. rate-limit metadata)
PUT /mockserver/grpcChaos
{
"service": "com.example.orders.OrderService",
"chaos": {
"errorStatusCode": "RESOURCE_EXHAUSTED",
"errorProbability": 1.0,
"customTrailers": { "x-ratelimit-remaining": "0", "x-ratelimit-reset": "1748700000" }
}
}
// abort a client-streaming RPC after 5 messages
PUT /mockserver/grpcChaos
{ "service": "com.example.upload.UploadService", "chaos": { "abortAfterMessages": 5 } }
The GET response reports active profiles under a services map (service name → profile) and, when any registrations carry a TTL, a ttlRemainingMillis map alongside it.
The PATCH endpoint applies JSON Merge Patch semantics: only the fields you supply are updated; unspecified fields in the existing profile and the current TTL are left unchanged. If no profile exists yet for the service, the partial is registered as a new profile with no TTL.
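To predict what a PATCH leaves behind, it can help to apply the merge locally. The sketch below follows the JSON Merge Patch algorithm from RFC 7396, in which a null value removes a key; whether MockServer honours null removal for chaos profiles is not stated here, so treat that branch as an assumption. merge_patch is a hypothetical helper.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) to target and return the result."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)        # non-object patches replace the target outright
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)          # RFC 7396: null removes the member
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


existing = {"errorStatusCode": "UNAVAILABLE", "errorProbability": 0.3, "latencyMs": 200}
patched = merge_patch(existing, {"errorProbability": 0.9})
print(patched)
# {'errorStatusCode': 'UNAVAILABLE', 'errorProbability': 0.9, 'latencyMs': 200}
```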
Like service-scoped HTTP chaos and TCP chaos, registrations support an optional ttlMillis for automatic expiry and are cleared on server reset.
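The registration calls can be issued from a test harness with only the Python standard library. This sketch builds the PUT request without sending it, so it can be inspected or unit-tested. The helper name grpc_chaos_request and the localhost base URL are assumptions, and the Content-Type header is a conventional choice that the curl-style examples in this document do not show.

```python
import json
import urllib.request
from typing import Optional


def grpc_chaos_request(base_url: str, service: str, chaos: dict,
                       ttl_millis: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a PUT /mockserver/grpcChaos registration request."""
    payload = {"service": service, "chaos": chaos}
    if ttl_millis is not None:
        payload["ttlMillis"] = ttl_millis   # optional auto-expiry
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mockserver/grpcChaos",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = grpc_chaos_request(
    "http://localhost:1080",
    "com.example.payments.PaymentService",
    {"errorStatusCode": "UNAVAILABLE", "errorProbability": 1.0},
    ttl_millis=300000,
)
print(req.get_method(), req.full_url)
# To send it to a running MockServer: urllib.request.urlopen(req)
```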
Relationship to gRPC health-check chaos: the gRPC health-check feature (PUT /mockserver/grpc/health) controls the grpc.health.v1.Health/Check serving-status response — changing it makes a Kubernetes readiness probe fail. gRPC fault injection (this section, PUT /mockserver/grpcChaos) injects status errors into any RPC method on your application services. The two mechanisms are independent.
TCP-Layer Chaos
In addition to HTTP-level chaos (which operates on decoded HTTP requests and responses), MockServer supports TCP-layer chaos that operates on raw bytes before HTTP decoding. This enables transport-layer fault injection that mirrors Toxiproxy's named toxics.
TCP-layer chaos is managed separately from HTTP chaos profiles. It is registered against a host and applied to all connections from that host at the raw byte level.
TCP Chaos Fault Types
| Fault type | Field | Type | Description |
|---|---|---|---|
| latency | latencyMs | long | Delays all inbound data by the specified milliseconds before it reaches the HTTP decoder |
| down | down | boolean | Silently drops all inbound data so the service appears completely down |
| bandwidth | bandwidthBytesPerSec | long | Throttles inbound data to the specified bytes per second |
| slow_close | slowClose | boolean | Delays the TCP FIN by 2 seconds on close, simulating a slow connection teardown |
| timeout | timeout | boolean | Never sends FIN on close; the connection hangs indefinitely |
| reset_peer | resetPeer | boolean | Sends a TCP RST and closes the connection immediately on first data |
| slicer | slicerChunkSize | integer | Fragments inbound data into chunks of the specified size (bytes) |
| limit_data | limitDataBytes | long | Closes the connection after the specified total bytes have been received |
When multiple fault types are configured on the same profile, they are evaluated in priority order: down > reset_peer > limit_data > slicer > bandwidth > latency.
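The priority order can be expressed as an ordered lookup, which is useful when reasoning about a profile that sets several fields at once. This is a sketch of the documented ordering only; faults_in_priority_order is a hypothetical helper, and treating 0 or false as "not configured" is an assumption.

```python
# (fault type, profile field) pairs in the documented priority order.
TCP_FAULT_PRIORITY = [
    ("down", "down"),
    ("reset_peer", "resetPeer"),
    ("limit_data", "limitDataBytes"),
    ("slicer", "slicerChunkSize"),
    ("bandwidth", "bandwidthBytesPerSec"),
    ("latency", "latencyMs"),
]


def faults_in_priority_order(profile: dict) -> list:
    """List the configured fault types, highest priority first."""
    # A field counts as configured when present and truthy (assumption: 0 / false = off).
    return [fault for fault, field in TCP_FAULT_PRIORITY if profile.get(field)]


print(faults_in_priority_order({"latencyMs": 500, "slicerChunkSize": 64}))
# ['slicer', 'latency']
```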
TCP Chaos REST API
// register a TCP chaos profile for a host
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "latencyMs": 500, "slicerChunkSize": 64 } }
// register with a TTL (auto-revert after 5 minutes)
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "down": true }, "ttlMillis": 300000 }
// remove the profile for a host
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "remove": true }
// clear all TCP chaos profiles
PUT /mockserver/tcpChaos
{ "clear": true }
// read all active TCP chaos profiles
GET /mockserver/tcpChaos
// merge-patch an existing profile (only update specified fields)
PATCH /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "down": true } }
The API follows the same patterns as service-scoped HTTP chaos: host matching is case-insensitive and ignores port suffixes, registrations support optional TTL-based auto-expiry, and all registrations are cleared on server reset.
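Case-insensitive, port-agnostic matching means that several spellings of a host resolve to the same registration. The sketch below shows one way to normalise a host key accordingly; it is an assumption about the general idea, not MockServer's actual matching code, and normalize_host is a hypothetical name.

```python
def normalize_host(host: str) -> str:
    """Lower-case a host and drop a trailing numeric :port suffix."""
    host = host.strip().lower()
    name, sep, port = host.rpartition(":")
    # Strip the suffix only for "name:port" or "[ipv6]:port"; leave bare IPv6 literals alone.
    if sep and port.isdigit() and (":" not in name or name.endswith("]")):
        return name
    return host


print(normalize_host("Upstream.SVC:8443"))  # upstream.svc
```

Under this model, a profile registered for upstream.svc would also apply to connections addressed as UPSTREAM.SVC:8443.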
Difference from HTTP chaos: TCP chaos operates at the raw byte level before HTTP decoding. This means it can simulate faults that are impossible to reproduce with HTTP-level chaos, such as TCP RST, connection timeouts (never sending FIN), bandwidth throttling, and data fragmentation. HTTP chaos operates on decoded HTTP requests/responses and can inject application-level faults like error status codes, body corruption, and rate limiting.
LLM Chaos (fault injection for mocked LLM responses)
When mocking an LLM/chat-completion endpoint, attach an LLM chaos profile to a mocked response to test how an AI agent or application copes with provider faults: probabilistic provider errors (e.g. a 429 or 529 with a Retry-After header), mid-stream truncation of a streaming (SSE) response, malformed SSE chunks, and a stateful request quota (a fixed-window rate limit). This profile is separate from the HttpChaosProfile described above: it models how LLM responses are streamed as SSE events, and it is carried on the LLM response itself (LlmChaosProfile in the Java client).
LLM Chaos Profile Fields
| Field | Type | Description | Valid range |
|---|---|---|---|
| errorStatus | integer | HTTP error status to return instead of the normal response (e.g. 429, 529, 500). Fires on every request unless errorProbability is set. | 100–599 |
| errorProbability | number | Probability of injecting the error. 1.0 = always, 0.0/omitted = never. A fractional value draws once per response. | 0.0–1.0 |
| retryAfter | string | Value for the Retry-After header on an injected error (probabilistic or quota). | n/a |
| truncateMode | string | Streaming truncation mode: NONE (default) or MID_STREAM to cut the SSE event stream short. | NONE or MID_STREAM |
| truncateAtFraction | number | Fraction of SSE events to emit before truncating (when truncateMode is MID_STREAM). Deterministic. | 0.0–1.0 |
| malformedSse | boolean | Emit a malformed (broken-JSON) SSE chunk so the client's stream parser must cope with corruption. Deterministic. | n/a |
| seed | integer | Makes a fractional errorProbability reproducible (a fixed seed yields the same single decision every time). | n/a |
| quotaName | string | Stateful quota bucket name. Mocked responses sharing the same quotaName share one request counter. | n/a |
| quotaLimit | integer | Maximum requests allowed per window before requests are rejected. | ≥ 1 |
| quotaWindowMillis | integer | Fixed-window length in milliseconds for the quota counter. | ≥ 1 |
| quotaErrorStatus | integer | HTTP status returned when the quota is exceeded. | 100–599 (default 429) |
Determinism: with errorProbability of 1.0 or 0.0/omitted the error decision is fully deterministic; a fractional probability draws once per response (set seed to fix it). Truncation and malformed-SSE are always deterministic. The quota is a real cross-request counter — responses sharing a quotaName share one bucket — so requests beyond quotaLimit within the window are rejected with quotaErrorStatus (default 429) and the retryAfter header.
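A fixed-window quota shared by name can be modelled in a few lines, which helps when working out how many requests a test should expect to be rejected. This is a behavioural sketch under stated assumptions: the window starts at the first request and resets once quotaWindowMillis has elapsed (the exact boundary handling is assumed), and FixedWindowQuota and allow_request are hypothetical names.

```python
class FixedWindowQuota:
    """Counts requests in a fixed window that starts with the first request."""

    def __init__(self, limit: int, window_millis: int):
        self.limit = limit
        self.window_millis = window_millis
        self.window_start = None
        self.count = 0

    def allow(self, now_millis: int) -> bool:
        expired = (self.window_start is not None
                   and now_millis - self.window_start >= self.window_millis)
        if self.window_start is None or expired:
            self.window_start = now_millis   # open a fresh window
            self.count = 0
        self.count += 1
        return self.count <= self.limit      # False means "reject with quotaErrorStatus"


# Responses sharing a quotaName share one bucket.
buckets = {}


def allow_request(quota_name: str, limit: int, window_millis: int, now_millis: int) -> bool:
    bucket = buckets.setdefault(quota_name, FixedWindowQuota(limit, window_millis))
    return bucket.allow(now_millis)


results = [allow_request("openai", 2, 1000, t) for t in (0, 10, 20, 1000)]
print(results)  # [True, True, False, True]
```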
Attaching an LLM chaos profile
In the Java client, build an LlmChaosProfile and attach it to the LLM response with withChaos(...):
import static org.mockserver.model.LlmChaosProfile.llmChaosProfile;
new MockServerClient("localhost", 1080)
.when(/* request matcher for your LLM endpoint */)
.respond(
httpLlmResponse() // your mocked LLM/chat-completion response
.withChaos(
llmChaosProfile()
.withErrorStatus(529) // overloaded
.withErrorProbability(0.3) // 30% of requests
.withRetryAfter("2")
.withTruncateMode(LlmChaosProfile.TruncateMode.MID_STREAM)
.withTruncateAtFraction(0.5) // keep first half of the SSE stream
.withMalformedSse(true)
.withQuotaName("openai") // shared rate-limit bucket
.withQuotaLimit(100)
.withQuotaWindowMillis(60000) // 100 requests / minute
)
);
For AI assistants, the same profile is supplied as the optional chaos object on the mock_llm_completion MCP tool (and per-turn on conversation mocks). In JSON it carries the identical field names:
{
"chaos": {
"errorStatus": 529,
"errorProbability": 0.3,
"retryAfter": "2",
"truncateMode": "MID_STREAM",
"truncateAtFraction": 0.5,
"malformedSse": true,
"quotaName": "openai",
"quotaLimit": 100,
"quotaWindowMillis": 60000
}
}
Fleet-wide Chaos in a Clustered Deployment
In a single-node deployment, chaos registrations live in memory on that node. When MockServer runs as a clustered HA fleet (a shared, replicated state backend across several nodes), the control-plane chaos registrations — service-scoped HTTP chaos (PUT /mockserver/serviceChaos), TCP chaos (PUT /mockserver/tcpChaos) and gRPC chaos (PUT /mockserver/grpcChaos) — replicate across all nodes. Register a fault once on any node and every node enforces it; remove or clear it on any node and it is reverted fleet-wide. TTL auto-revert and clearing on server reset apply consistently across the cluster, so you can break a host for the whole fleet with a single call. With no clustered backend configured (the default) this is a no-op and chaos stays node-local.
Roadmap
The following chaos features are planned for future releases:
- Advanced stateful fault injection — token-bucket rate limiting (429 that depletes and refills) and circuit-breaker windows
Examples
The following examples show how to add chaos/fault injection to expectations using MockServer.
Inject a 503 Service Unavailable error on 30% of matched requests, with a Retry-After header telling the client to wait 30 seconds before retrying. Useful for testing retry logic and backoff strategies.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/service")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(503)
.withErrorProbability(0.3)
.withRetryAfter("30")
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"status\":\"ok\"}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"status\":\"ok\"}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 0.3,
"retryAfter": "30"
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/service")
).with_chaos(
HttpChaosProfile(error_status=503, error_probability=0.3, retry_after="30")
).respond(
HttpResponse.response('{"status":"ok"}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/service')
).with_chaos(
MockServer::HttpChaosProfile.new(error_status: 503, error_probability: 0.3, retry_after: '30')
).respond(
MockServer::HttpResponse.response(body: '{"status":"ok"}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"status\":\"ok\"}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 0.3,
"retryAfter": "30"
}
}'
See REST API for full JSON specification
Always return a 429 Too Many Requests error with a Retry-After header. Set errorProbability to 1.0 for deterministic behaviour (every request is rejected). Useful for testing how your application handles rate limiting from third-party APIs.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/external-service")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(429)
.withErrorProbability(1.0)
.withRetryAfter("60")
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"data\":\"result\"}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/external-service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"data\":\"result\"}"
},
"chaos": {
"errorStatus": 429,
"errorProbability": 1.0,
"retryAfter": "60"
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/external-service")
).with_chaos(
HttpChaosProfile(error_status=429, error_probability=1.0, retry_after="60")
).respond(
HttpResponse.response('{"data":"result"}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/external-service')
).with_chaos(
MockServer::HttpChaosProfile.new(error_status: 429, error_probability: 1.0, retry_after: '60')
).respond(
MockServer::HttpResponse.response(body: '{"data":"result"}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/external-service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"data\":\"result\"}"
},
"chaos": {
"errorStatus": 429,
"errorProbability": 1.0,
"retryAfter": "60"
}
}'
See REST API for full JSON specification
Add artificial latency to responses without changing the status code. Useful for testing timeout handling, slow-response UX, and deadline propagation across services.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/slow-dependency")
)
.withChaos(
httpChaosProfile()
.withLatency(
new Delay(TimeUnit.MILLISECONDS, 2000)
)
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"result\":\"delayed\"}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/slow-dependency"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"result\":\"delayed\"}"
},
"chaos": {
"latency": {
"timeUnit": "MILLISECONDS",
"value": 2000
}
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile, Delay
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/slow-dependency")
).with_chaos(
HttpChaosProfile(latency=Delay(time_unit="MILLISECONDS", value=2000))
).respond(
HttpResponse.response('{"result":"delayed"}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/slow-dependency')
).with_chaos(
MockServer::HttpChaosProfile.new(latency: MockServer::Delay.new(time_unit: 'MILLISECONDS', value: 2000))
).respond(
MockServer::HttpResponse.response(body: '{"result":"delayed"}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/slow-dependency"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"result\":\"delayed\"}"
},
"chaos": {
"latency": {
"timeUnit": "MILLISECONDS",
"value": 2000
}
}
}'
See REST API for full JSON specification
Combine error injection and latency injection in a single chaos profile. The seed makes the probabilistic error decision reproducible across runs — the same seed always produces the same inject/skip outcome for a given probability.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/payments")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(500)
.withErrorProbability(0.1)
.withRetryAfter("10")
.withLatency(
new Delay(TimeUnit.MILLISECONDS, 500)
)
.withSeed(42L)
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"transactionId\":\"abc-123\"}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/payments"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"transactionId\":\"abc-123\"}"
},
"chaos": {
"errorStatus": 500,
"errorProbability": 0.1,
"retryAfter": "10",
"latency": {
"timeUnit": "MILLISECONDS",
"value": 500
},
"seed": 42
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile, Delay
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/payments")
).with_chaos(
HttpChaosProfile(
error_status=500,
error_probability=0.1,
retry_after="10",
latency=Delay(time_unit="MILLISECONDS", value=500),
seed=42
)
).respond(
HttpResponse.response('{"transactionId":"abc-123"}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/payments')
).with_chaos(
MockServer::HttpChaosProfile.new(
error_status: 500,
error_probability: 0.1,
retry_after: '10',
latency: MockServer::Delay.new(time_unit: 'MILLISECONDS', value: 500),
seed: 42
)
).respond(
MockServer::HttpResponse.response(body: '{"transactionId":"abc-123"}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/payments"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"transactionId\":\"abc-123\"}"
},
"chaos": {
"errorStatus": 500,
"errorProbability": 0.1,
"retryAfter": "10",
"latency": {
"timeUnit": "MILLISECONDS",
"value": 500
},
"seed": 42
}
}'
See REST API for full JSON specification
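The general principle behind a seed is that a pseudo-random generator started from the same seed yields the same sequence of draws. The sketch below demonstrates this with Python's random module. It is not MockServer's generator, so the specific requests that fail under seed 42 in MockServer will differ from what this code produces; decisions is a hypothetical helper.

```python
import random


def decisions(seed: int, probability: float, n: int) -> list:
    """Return n inject/skip decisions drawn from a generator with a fixed seed."""
    rng = random.Random(seed)                      # private generator, independent of global state
    return [rng.random() < probability for _ in range(n)]


first_run = decisions(42, 0.1, 20)
second_run = decisions(42, 0.1, 20)
print(first_run == second_run)  # True
```

Because the outcome is fixed for a given seed, a test can assert on exact behaviour instead of on statistical tolerances.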
Chaos profiles work on forwarded (proxied) responses too — not just mocked responses. This lets you inject faults into calls to real upstream services, making MockServer a chaos proxy for testing how your application handles unreliable dependencies.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/upstream-service")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(503)
.withErrorProbability(0.5)
.withLatency(
new Delay(TimeUnit.MILLISECONDS, 1000)
)
)
.forward(
forward()
.withHost("upstream.example.com")
.withPort(443)
.withScheme(HttpForward.Scheme.HTTPS)
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/upstream-service"
},
"httpForward": {
"host": "upstream.example.com",
"port": 443,
"scheme": "HTTPS"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 0.5,
"latency": {
"timeUnit": "MILLISECONDS",
"value": 1000
}
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpChaosProfile, HttpForward, Delay
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/upstream-service")
).with_chaos(
HttpChaosProfile(
error_status=503,
error_probability=0.5,
latency=Delay(time_unit="MILLISECONDS", value=1000)
)
).forward(
HttpForward(host="upstream.example.com", port=443, scheme="HTTPS")
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/upstream-service')
).with_chaos(
MockServer::HttpChaosProfile.new(
error_status: 503,
error_probability: 0.5,
latency: MockServer::Delay.new(time_unit: 'MILLISECONDS', value: 1000)
)
).forward(
MockServer::HttpForward.new(host: 'upstream.example.com', port: 443, scheme: 'HTTPS')
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/upstream-service"
},
"httpForward": {
"host": "upstream.example.com",
"port": 443,
"scheme": "HTTPS"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 0.5,
"latency": {
"timeUnit": "MILLISECONDS",
"value": 1000
}
}
}'
See REST API for full JSON specification
Use failRequestCount to make the first N requests fail, then recover automatically. This is ideal for testing that your client retries and eventually succeeds. In this example the first 2 matching requests return a 503 error; from request 3 onward the normal 200 response is returned.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/service")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(503)
.withErrorProbability(1.0)
.withFailRequestCount(2)
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"status\":\"ok\"}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"status\":\"ok\"}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 1.0,
"failRequestCount": 2
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/service")
).with_chaos(
HttpChaosProfile(error_status=503, error_probability=1.0, fail_request_count=2)
).respond(
HttpResponse.response('{"status":"ok"}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/service')
).with_chaos(
MockServer::HttpChaosProfile.new(error_status: 503, error_probability: 1.0, fail_request_count: 2)
).respond(
MockServer::HttpResponse.response(body: '{"status":"ok"}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"status\":\"ok\"}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 1.0,
"failRequestCount": 2
}
}'
See REST API for full JSON specification
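On the client side, the behaviour this expectation is meant to verify looks roughly like the retry loop below. To keep the sketch runnable without a server, fake_service stands in for the chaos-enabled endpoint and mimics a failRequestCount window; call_with_retries, fake_service and the backoff parameters are all hypothetical.

```python
import time
from typing import Callable, Tuple


def call_with_retries(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay_s: float = 0.0) -> Tuple[int, str, int]:
    """Retry on 503 with exponential backoff; return (status, body, attempts used)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        status, body = send()
        if status != 503 or attempt == max_attempts:
            return status, body, attempt
        time.sleep(base_delay_s * 2 ** (attempt - 1))


def fake_service(fail_request_count: int) -> Callable[[], Tuple[int, str]]:
    """Stand-in for the endpoint: the first fail_request_count calls return 503."""
    calls = {"n": 0}

    def send() -> Tuple[int, str]:
        calls["n"] += 1
        if calls["n"] <= fail_request_count:
            return 503, '{"error":{"type":"chaos_injected","message":"injected HTTP chaos error"}}'
        return 200, '{"status":"ok"}'

    return send


status, body, attempts = call_with_retries(fake_service(2))
print(status, attempts)  # 200 3
```

In a real test, send would issue the HTTP request to MockServer, and the assertion would be that the client succeeds on the third attempt.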
Setting errorProbability to 1.0 makes every matched request return the error. This simulates a complete outage of the downstream service — useful for game-day exercises and verifying circuit-breaker behaviour.
new MockServerClient("localhost", 1080)
.when(
request()
.withPath("/api/critical-service")
)
.withChaos(
httpChaosProfile()
.withErrorStatus(503)
.withErrorProbability(1.0)
.withRetryAfter("120")
)
.respond(
response()
.withStatusCode(200)
.withBody("{\"healthy\":true}")
);
var mockServerClient = require('mockserver-client').mockServerClient;
mockServerClient("localhost", 1080).mockAnyResponse({
"httpRequest": {
"path": "/api/critical-service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"healthy\":true}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 1.0,
"retryAfter": "120"
}
}).then(
function () { console.log("expectation created"); },
function (error) { console.log(error); }
);
from mockserver import MockServerClient, HttpRequest, HttpResponse, HttpChaosProfile
client = MockServerClient("localhost", 1080)
client.when(
HttpRequest.request("/api/critical-service")
).with_chaos(
HttpChaosProfile(error_status=503, error_probability=1.0, retry_after="120")
).respond(
HttpResponse.response('{"healthy":true}', status_code=200)
)
require 'mockserver-client'
client = MockServer::Client.new('localhost', 1080)
client.when(
MockServer::HttpRequest.request(path: '/api/critical-service')
).with_chaos(
MockServer::HttpChaosProfile.new(error_status: 503, error_probability: 1.0, retry_after: '120')
).respond(
MockServer::HttpResponse.response(body: '{"healthy":true}', status_code: 200)
)
curl -v -X PUT "http://localhost:1080/mockserver/expectation" -d '{
"httpRequest": {
"path": "/api/critical-service"
},
"httpResponse": {
"statusCode": 200,
"body": "{\"healthy\":true}"
},
"chaos": {
"errorStatus": 503,
"errorProbability": 1.0,
"retryAfter": "120"
}
}'
See REST API for full JSON specification
FAQ
Add a chaos profile to your expectation with errorStatus set to 429, errorProbability set to 1.0 (for every request) or a fraction like 0.3 (for 30% of requests), and an optional retryAfter value. The chaos profile works on both mocked responses and forwarded/proxied upstream calls. See the rate limiting example above.
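On the client side, the behaviour this kind of profile is meant to exercise is usually "wait for the Retry-After interval, then retry". The following is a minimal sketch of that decision, not part of MockServer: the function name `retry_delay_seconds` and the one-second default are assumptions for illustration, and only the delta-seconds form of Retry-After (such as "120") is handled.

```python
def retry_delay_seconds(status_code, headers, default=1.0):
    """Return how long to wait before retrying, or None if no retry is needed.

    Hypothetical helper: treats 429 and 503 as retryable and reads the
    Retry-After header as a whole number of seconds.
    """
    if status_code not in (429, 503):
        return None
    value = headers.get("Retry-After")
    if value is None:
        return default
    try:
        # Negative values are clamped to zero.
        return max(0.0, float(int(value.strip())))
    except ValueError:
        # HTTP-date values are not parsed in this sketch; fall back instead.
        return default


delay = retry_delay_seconds(429, {"Retry-After": "120"})
print(delay)
```

A test can point the real client at the chaos-enabled expectation and assert that it waits for roughly this delay before the next attempt.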
Add a chaos profile with a latency field specifying the delay duration and time unit (e.g. 2000 milliseconds). Latency is injected into every matched response regardless of whether an error is also injected. This works on both mocked and forwarded responses. See the latency injection example above.
Yes. Chaos profiles apply to forwarded and proxied responses, not just mocked ones. Set up a forward expectation pointing at your real upstream service and attach a chaos profile to inject errors or latency into the responses MockServer returns to the caller. See the forward/proxy example and the Chaos Proxy section above.
Set the seed field on your chaos profile to a fixed value (e.g. 42). With the same seed and the same errorProbability, MockServer makes the same inject-or-skip decisions on every run, so fractional-probability chaos becomes deterministic across test runs. See the Reproducibility section above.
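The idea behind a seed can be illustrated with any seeded random number generator. The sketch below is a conceptual model only: it uses Python's `random.Random`, not MockServer's internal algorithm, so the actual pattern MockServer produces for a given seed will differ. The function name `chaos_decisions` is hypothetical.

```python
import random


def chaos_decisions(seed, error_probability, request_count):
    """Model inject-or-skip decisions for a run of matched requests."""
    rng = random.Random(seed)  # a fixed seed gives a fixed sequence
    # random() returns a float in [0.0, 1.0), so 1.0 always injects
    # and 0.0 never does.
    return [rng.random() < error_probability for _ in range(request_count)]


first_run = chaos_decisions(seed=42, error_probability=0.3, request_count=10)
second_run = chaos_decisions(seed=42, error_probability=0.3, request_count=10)
print(first_run == second_run)  # True
```

Because the two runs share a seed, a test that asserts on "which requests failed" behaves the same way each time instead of flaking.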
Yes. Deploy MockServer as a sidecar proxy, egress proxy, or reverse proxy in your Kubernetes cluster and attach chaos profiles to inject faults into the traffic flowing through it. MockServer operates at the HTTP layer (L7) and requires explicit routing (e.g. HTTP_PROXY environment variable or Service rewrite). See Chaos Proxy in Kubernetes for sidecar/egress/reverse-proxy deployment patterns, and Isolating Single Service for general proxy setup.
Use the succeedFirst and failRequestCount fields on the chaos profile. Set succeedFirst to 0 (or omit it) and failRequestCount to the number of requests that should fail. For example, failRequestCount: 2 with errorStatus: 503 and errorProbability: 1.0 makes the first 2 matching requests return 503, and all subsequent requests return the normal response. This is useful for testing retry logic and backoff strategies. See the fail-then-recover example and the Stateful / Count-Based Faults section above.
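To show what a fail-then-recover profile is meant to test, here is a small self-contained simulation of a client retrying with exponential backoff. It does not call MockServer: `FlakyService` is a hypothetical stand-in that fails its first N calls the way failRequestCount: 2 with errorProbability: 1.0 is described above, and `call_with_retry` is an assumed example of the retry logic under test.

```python
import time


class FlakyService:
    """Stand-in for an endpoint that fails its first N requests."""

    def __init__(self, fail_request_count, error_status=503):
        self.remaining_failures = fail_request_count
        self.error_status = error_status

    def get(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            body = '{"error":{"type":"chaos_injected","message":"injected HTTP chaos error"}}'
            return self.error_status, body
        return 200, '{"status":"ok"}'


def call_with_retry(service, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Retry on 5xx with exponential backoff; return (status, body, attempts)."""
    for attempt in range(max_attempts):
        status, body = service.get()
        if status < 500:
            return status, body, attempt + 1
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s, ...
    return status, body, max_attempts


service = FlakyService(fail_request_count=2)
status, body, attempts = call_with_retry(service, sleep=lambda seconds: None)
print(status, attempts)  # 200 3
```

In a real test the stand-in would be replaced by an HTTP call to the chaos-enabled expectation, and the assertion would be that the client succeeds on the third attempt.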
Set graphqlErrors: true and graphqlNullifyData: false on your chaos profile. MockServer will try to parse the original response body as JSON and embed it as the data value in the GraphQL error envelope, resulting in a response like {"data":{...},"errors":[{"message":"..."}]}. If the original body is not valid JSON, data falls back to null. Use graphqlErrorMessage and graphqlErrorCode to customise the error entry. This works on both expectation-level chaos and service-scoped chaos (PUT /mockserver/serviceChaos). See the GraphQL Error Injection section above.
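The envelope described above can be sketched in a few lines. This is an illustration of the documented behaviour (parse the original body as JSON, fall back to null), not MockServer's implementation; the function name `graphql_error_envelope` is hypothetical, and only the message field of the error entry is modelled because the placement of the error code is not specified here.

```python
import json


def graphql_error_envelope(original_body, message):
    """Wrap an original response body in a GraphQL-style error envelope."""
    try:
        data = json.loads(original_body)
    except (TypeError, ValueError):
        # Not valid JSON (or no body at all): data falls back to null.
        data = None
    return {"data": data, "errors": [{"message": message}]}


partial = graphql_error_envelope('{"user":{"id":1}}', "injected error")
print(json.dumps(partial))
```

A client test can then assert that partial data is still rendered while the error entry is surfaced to the user or logged.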
Set omitGrpcStatus: true on a gRPC chaos profile. MockServer will send the HTTP 200 response with a content-type: application/grpc header but deliberately omit the grpc-status trailer. Most gRPC clients treat this as a protocol error or stream reset, which lets you verify that the client does not silently accept an incomplete RPC as a success. Register the profile via PUT /mockserver/grpcChaos with {"service": "...", "chaos": {"omitGrpcStatus": true}}. See the gRPC Fault Injection section above.
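The check a well-behaved client has to make can be sketched as follows. This is an assumed, simplified model rather than any gRPC library's actual code: the function name `classify_grpc_response` is hypothetical, and headers and trailers are represented as plain dictionaries with lowercase keys.

```python
def classify_grpc_response(headers, trailers):
    """Classify a finished gRPC call from its HTTP/2 headers and trailers."""
    # grpc-status normally arrives in the trailers; a trailers-only
    # response carries it in the headers instead.
    status = trailers.get("grpc-status", headers.get("grpc-status"))
    if status is None:
        # HTTP 200 alone is not enough: without grpc-status the RPC is incomplete.
        return "incomplete"
    return "ok" if status == "0" else "error"


result = classify_grpc_response({"content-type": "application/grpc"}, {})
print(result)  # incomplete
```

A client that reports success for the "incomplete" case is the bug this profile is designed to surface.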
Related Pages
- Service Mesh / Sidecar Mode — deploy MockServer in-path as a transparent sidecar so you can inject the faults above into live traffic without changing application config
- Transparent Interception Recipes — iptables REDIRECT / TPROXY recipes for routing real traffic through MockServer to apply chaos profiles
- Chaos Proxy in Kubernetes — sidecar, egress, and reverse-proxy deployment patterns for injecting faults in a cluster
- Driving MockServer from Chaos Orchestrators — Chaos Toolkit, AWS FIS, Azure Chaos Studio and LitmusChaos recipes against the control plane