MockServer provides two observability channels: Prometheus metrics (counters, gauges, and histograms) and OpenTelemetry (OTLP) export of traces and metrics. Both are opt-in and have zero overhead when disabled.

Prometheus Metrics

Enable Prometheus metrics by setting metricsEnabled to true. MockServer then exposes a scrape endpoint at /mockserver/metrics in Prometheus text exposition format. When metrics are disabled, this endpoint returns 404.

# Start MockServer with metrics enabled
docker run --rm -p 1080:1080 \
  -e MOCKSERVER_METRICS_ENABLED=true \
  mockserver/mockserver:7.1.0

# Scrape metrics
curl http://localhost:1080/mockserver/metrics
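To consume the scrape output from a test harness rather than from curl, something like the following could work. It is a minimal sketch, assuming the endpoint returns standard Prometheus text exposition with no trailing timestamps; `parse_exposition` and `fetch_metrics` are illustrative helper names, not part of MockServer.

```python
import urllib.request


def parse_exposition(text):
    """Parse Prometheus text exposition into {series: value}.

    Assumes each sample line is "<series> <value>" with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        if not series:  # not a "<series> <value>" line; ignore it
            continue
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://localhost:1080/mockserver/metrics"):
    """Scrape the endpoint and return the parsed samples.

    urllib raises urllib.error.HTTPError (status 404) when metrics are disabled.
    """
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_exposition(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` against the container started above should return a dictionary keyed by series name, including any label block exactly as it appears in the exposition output.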

Available metrics

Naming convention: the core request-tracking gauges use unprefixed names (e.g. requests_received_count, expectations_not_matched_count). Counters, histograms, and the operational gauges (mock_server_active_service_chaos, mock_server_expectations_by_type, mock_server_build_info) use a mock_server_ prefix (e.g. mock_server_request_duration_seconds). Counter names also gain a _total suffix in the exposition output: mock_server_http_chaos_injected appears as mock_server_http_chaos_injected_total on the /mockserver/metrics endpoint, and the suffixed name is the one to use in PromQL queries and Grafana.

Request and expectation matching

| Metric | Type | Description |
|---|---|---|
| requests_received_count | Gauge | Total requests received |
| expectations_not_matched_count | Gauge | Requests that did not match any expectation |
| response_expectations_matched_count | Gauge | Requests matched to a response expectation |
| forward_expectations_matched_count | Gauge | Requests matched to a forward expectation |
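One use for these gauges is a test-suite check that no request fell through unmatched. The sketch below assumes `samples` is a plain dictionary of metric name to value (for example, parsed from the scrape output) and that requests_received_count includes unmatched requests; `unmatched_ratio` is a hypothetical helper name.

```python
def unmatched_ratio(samples):
    """Fraction of received requests that matched no expectation.

    Returns 0.0 when no requests have been received, to avoid dividing by zero.
    """
    received = samples.get("requests_received_count", 0.0)
    if received <= 0:
        return 0.0
    return samples.get("expectations_not_matched_count", 0.0) / received
```

A harness could fail the run when the ratio is above zero, which usually points at a missing or mis-specified expectation.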

Action execution (one per action type)

| Metric | Description |
|---|---|
| response_actions_count | Response actions executed |
| forward_actions_count | Forward actions executed |
| sse_response_actions_count | SSE response actions executed |
| llm_response_actions_count | LLM response actions executed |
| error_actions_count | Error actions executed |
| grpc_stream_response_actions_count | gRPC stream response actions executed |

Equivalent metrics exist for template, callback, and other action types. Scrape the endpoint to see the full list.

Request latency histogram

mock_server_request_duration_seconds is a Prometheus histogram of request handling duration (receipt to response), with buckets from 0.5 ms to 10 s. Use it to derive latency percentiles:

histogram_quantile(0.95, sum by (le) (rate(mock_server_request_duration_seconds_bucket[1m])))
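For intuition about what histogram_quantile derives from the _bucket series, the sketch below reproduces the core idea in Python: find the bucket that contains the target rank, then interpolate linearly inside it. It is a simplified illustration, not Prometheus's implementation, and it assumes cumulative bucket counts that include a +Inf bucket. The bucket bounds in the usage note are made up for the example.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound, cumulative_count), including (+inf, total).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(buckets)
    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            # Linear interpolation within the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound
```

With buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]`, the 95th percentile lands halfway through the 0.5 s to 1.0 s bucket. The estimate is only as precise as the bucket boundaries allow.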

Build info

mock_server_build_info is a gauge with labels version, major_minor_version, group_id, artifact_id, and git_hash.

JVM runtime

When metrics are enabled, MockServer also exposes JVM health gauges:

| Metric | Labels | Description |
|---|---|---|
| jvm_memory_used_bytes | area = heap / nonheap | Memory currently used |
| jvm_memory_committed_bytes | area | Memory committed by the JVM |
| jvm_memory_max_bytes | area | Max memory (-1 if undefined) |
| jvm_threads_current | | Live thread count |
| jvm_threads_daemon | | Daemon thread count |
| jvm_gc_collection_count | | Total GC collections |
| jvm_gc_collection_seconds_sum | | Total GC time in seconds |
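When deriving a utilisation figure from the memory gauges, the -1 sentinel for an undefined maximum needs handling, or the ratio comes out negative. A minimal sketch, with `memory_utilisation` as a hypothetical helper that takes the used and max values for one area:

```python
def memory_utilisation(used_bytes, max_bytes):
    """Return used/max as a fraction, or None when the maximum is undefined.

    jvm_memory_max_bytes reports -1 when the JVM has no defined maximum.
    """
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes
```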

Chaos metrics

When chaos testing is active, additional metrics track fault injection:

  • mock_server_http_chaos_injected_total — counter with a fault_type label (drop, error, latency, truncate, malformed, slow, quota, graphql)
  • mock_server_active_service_chaos — gauge per fault_type of currently active chaos profiles
  • mock_server_chaos_auto_halt_total — counter that increments each time the chaos auto-halt circuit-breaker triggers

LLM Token and Cost Metrics

When both metricsEnabled and llmMetricsEnabled are true, three additional Prometheus counters track LLM usage:

| Metric | Labels | Description |
|---|---|---|
| mock_server_llm_input_tokens_total | provider, model | Cumulative input tokens |
| mock_server_llm_output_tokens_total | provider, model | Cumulative output tokens |
| mock_server_llm_cost_usd_total | provider, model | Cumulative estimated cost in USD |

These counters are incremented on both the mock path (when MockServer serves an httpLlmResponse) and the forward/proxy path (when MockServer forwards requests to a real LLM provider). Cost estimation uses an internal pricing table and is approximate.
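To show why a table-driven estimate is only approximate, here is a sketch of how such a calculation could look. The pricing table, provider name, model name, and prices are all hypothetical placeholders, and treating unknown models as zero cost is an assumption; none of this reflects MockServer's actual table or behavior.

```python
# Hypothetical prices in USD per million tokens; NOT MockServer's real table.
PRICING = {
    ("example-provider", "example-model"): {"input": 1.00, "output": 2.00},
}


def estimate_cost_usd(provider, model, input_tokens, output_tokens):
    """Estimate request cost from token counts and a static price table."""
    price = PRICING.get((provider, model))
    if price is None:
        return 0.0  # assumption: models missing from the table add no cost
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000
```

Any estimate of this kind drifts from the real bill when provider prices change or when discounts such as cached-token pricing apply, so the counter is best read as an order-of-magnitude signal.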

The cost-budget circuit-breaker (mock_server_llm_cost_budget_tripped_total counter) is documented in LLM Response Mocking → Cost Budget.

# Example: total LLM cost rate in USD per hour (rate() is per-second, so scale by 3600)
sum(rate(mock_server_llm_cost_usd_total[1h])) * 3600

OpenTelemetry (OTLP) Export

MockServer can push metrics and traces to an OpenTelemetry Collector (or any OTLP-compatible backend) via OTLP HTTP/protobuf. Set the collector endpoint and enable the signals you want:

docker run --rm -p 1080:1080 \
  -e MOCKSERVER_OTEL_ENDPOINT=http://otel-collector:4318 \
  -e MOCKSERVER_OTEL_METRICS_ENABLED=true \
  -e MOCKSERVER_OTEL_TRACES_ENABLED=true \
  -e MOCKSERVER_METRICS_ENABLED=true \
  mockserver/mockserver:7.1.0

otelEndpoint is the base URL of the OTLP HTTP collector. MockServer appends /v1/metrics and /v1/traces automatically.

Metrics export interval: otelMetricsExportIntervalSeconds controls how often metrics are pushed (default 60 seconds, minimum 1 second).
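The two rules above (signal paths appended to the base URL, and a one-second floor on the export interval) can be sketched as follows. The helper names are illustrative, and clamping an out-of-range interval rather than rejecting it is an assumption; the source only states that the minimum is 1 second.

```python
def otlp_signal_urls(otel_endpoint):
    """Derive the per-signal OTLP/HTTP URLs from the collector base URL."""
    base = otel_endpoint.rstrip("/")  # tolerate a trailing slash on the base URL
    return {"metrics": base + "/v1/metrics", "traces": base + "/v1/traces"}


def effective_export_interval(seconds=60):
    """Apply the documented 1-second minimum (assumed here to clamp)."""
    return max(1, int(seconds))
```

This is why otelEndpoint should be the bare collector address such as http://otel-collector:4318: including /v1/metrics in the setting would produce a doubled path.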

GenAI Spans

When otelTracesEnabled is true, MockServer emits OpenTelemetry GenAI semantic-convention spans for LLM completions. Each span includes:

  • gen_ai.system — the provider (e.g. openai, anthropic)
  • gen_ai.request.model — the model identifier
  • Token usage attributes (input and output tokens)
  • Finish reason

GenAI spans fire on two paths:

  • Mock path — when MockServer serves an httpLlmResponse
  • Forward/proxy path — when MockServer forwards requests to a real LLM provider. The provider is detected from the target host (e.g. api.openai.com maps to OpenAI, api.anthropic.com maps to Anthropic).
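The host-based provider detection and the span attributes listed above could be modelled roughly like this. It is a sketch under assumptions: the host table contains only the two examples given in the text, and the token-usage and finish-reason attribute names follow the OpenTelemetry GenAI semantic conventions rather than anything confirmed about MockServer's output.

```python
from urllib.parse import urlsplit

# Only the two mappings named in the text; the real table may be larger.
PROVIDER_BY_HOST = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
}


def detect_provider(target_url):
    """Map the forward target's host to a provider name, or None if unknown."""
    return PROVIDER_BY_HOST.get(urlsplit(target_url).hostname)


def genai_span_attributes(provider, model, input_tokens, output_tokens, finish_reason):
    """Build a GenAI-style attribute set for one completion span."""
    return {
        "gen_ai.system": provider,
        "gen_ai.request.model": model,
        # Assumed names, taken from the OTel GenAI semantic conventions.
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.response.finish_reasons": [finish_reason],
    }
```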

W3C Trace Context Propagation

MockServer can extract and propagate W3C traceparent and tracestate headers across requests and responses. This enables distributed tracing correlation when MockServer sits in a service mesh or test harness.

  • otelPropagateTraceContext (default false) — when enabled, MockServer copies the incoming trace context headers to the response, so downstream tracing tooling can correlate the mock response with the original request trace.
  • otelGenerateTraceId (default false) — when enabled, MockServer generates a new random W3C trace ID for requests that arrive without a traceparent header.
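For reference, a W3C traceparent value has the shape version-traceid-spanid-flags: two hex digits, 32 hex digits, 16 hex digits, and two hex digits. The sketch below generates and parses such a value and copies the trace context headers from a request, which is the essence of the two options above. The function names are hypothetical, and the code illustrates the header format only, not MockServer's internals.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def generate_traceparent(sampled=True):
    """Build a version-00 traceparent with random trace and span IDs."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value):
    """Split a version-00 traceparent into its fields, or return None if malformed."""
    match = TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    return {"trace_id": trace_id, "span_id": span_id, "flags": flags}


def copy_trace_context(request_headers):
    """Pick out traceparent/tracestate (case-insensitively) for the response."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return {k: lowered[k] for k in ("traceparent", "tracestate") if k in lowered}
```

A test can send a known traceparent, read it back from the response with otelPropagateTraceContext enabled, and compare the trace_id fields to confirm the correlation.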

Configuration Reference

Prometheus

| Property | Env var | Default | Description |
|---|---|---|---|
| mockserver.metricsEnabled | MOCKSERVER_METRICS_ENABLED | false | Enable Prometheus metrics and the /mockserver/metrics endpoint |
| mockserver.llmMetricsEnabled | MOCKSERVER_LLM_METRICS_ENABLED | false | Enable LLM token/cost counters (requires metricsEnabled) |

OpenTelemetry

| Property | Env var | Default | Description |
|---|---|---|---|
| mockserver.otelEndpoint | MOCKSERVER_OTEL_ENDPOINT | (empty) | OTLP collector base URL (e.g. http://collector:4318) |
| mockserver.otelMetricsEnabled | MOCKSERVER_OTEL_METRICS_ENABLED | false | Push metrics to OTLP |
| mockserver.otelTracesEnabled | MOCKSERVER_OTEL_TRACES_ENABLED | false | Export GenAI spans via OTLP |
| mockserver.otelMetricsExportIntervalSeconds | MOCKSERVER_OTEL_METRICS_EXPORT_INTERVAL_SECONDS | 60 | OTLP metrics push interval in seconds (minimum 1) |
| mockserver.otelPropagateTraceContext | MOCKSERVER_OTEL_PROPAGATE_TRACE_CONTEXT | false | Copy W3C trace context headers to responses |
| mockserver.otelGenerateTraceId | MOCKSERVER_OTEL_GENERATE_TRACE_ID | false | Generate trace IDs for requests without traceparent |

Related Pages