Network Latency Debugging

When a service is slow or intermittently slow, the first step is usually to understand where the time goes for each HTTP request. MockServer can act as an in-path HTTP(S) proxy that records per-request timing breakdowns — connection time, time to first byte (TTFB), and total round-trip — so you can identify whether latency is in the network, the upstream server's processing, or the response transfer.

This page covers how to use MockServer's proxy timing features for latency diagnosis. It complements Debugging Proxied Traffic (which focuses on inspecting request/response content) and Chaos Proxy in Kubernetes (which injects faults for resilience testing).

Scope: MockServer operates at the HTTP layer (L7). It captures timing from the perspective of MockServer's own outbound connection to the upstream server: TCP connect, TLS handshake (included in connect time), server processing (derived from TTFB minus connect time), and response body transfer. It does not capture network-layer (L3/L4) metrics such as packet-level round-trip times, retransmissions, or congestion-window behaviour. For those, use dedicated packet-capture or TCP-analysis tools alongside MockServer's HTTP-level view.

Per-Request Timing Breakdown

Every time MockServer forwards a request (in proxy mode or via a forward action), it records a timing object on the response with these fields:

Field	What it measures
`connectionTimeInMillis`	Time to establish the TCP connection (and TLS handshake, if HTTPS) to the upstream server.
`timeToFirstByteInMillis`	Time from the start of the request until the first byte of the response is received. This is the connect time plus the upstream server's processing time.
`totalTimeInMillis`	Total round-trip time from sending the request to receiving the complete response body.

These fields are available on every recorded request-response pair retrieved via the retrieve API:

curl -v -X PUT "http://localhost:1080/mockserver/retrieve?type=REQUEST_RESPONSES&format=JSON"

Example response with timing:

{
    "httpRequest": {
        "method": "GET",
        "path": "/api/users"
    },
    "httpResponse": {
        "statusCode": 200,
        "body": "...",
        "timing": {
            "connectionTimeInMillis": 12,
            "timeToFirstByteInMillis": 85,
            "totalTimeInMillis": 142
        }
    }
}

From these numbers you can derive the three phases of the request:

Connect (12ms): TCP + TLS setup. High values suggest network latency, a slow TLS handshake, or DNS resolution delays.
Wait / server processing (85ms − 12ms = 73ms): the time between the connection being established and the first response byte arriving. This is mostly the upstream's "thinking time" (it also includes sending the request and one network round trip) and is typically where application-level slowness shows up.
Receive / body transfer (142ms − 85ms = 57ms): the time to transfer the response body. High values point to large payloads or bandwidth constraints.
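
The subtraction above is easy to automate when you have many recorded pairs. The following is a minimal sketch rather than part of MockServer: the function names `split_phases` and `dominant_phase` are hypothetical, and it assumes each timing object carries the three fields described in the table above.

```python
from typing import Dict, Tuple


def split_phases(timing: Dict[str, int]) -> Tuple[int, int, int]:
    """Return (connect, wait, receive) in milliseconds from a timing object."""
    connect = timing["connectionTimeInMillis"]
    ttfb = timing["timeToFirstByteInMillis"]
    total = timing["totalTimeInMillis"]
    # TTFB includes the connect time, so subtract it to isolate the wait phase.
    wait = ttfb - connect
    # Whatever remains after the first byte arrived is body transfer.
    receive = total - ttfb
    return connect, wait, receive


def dominant_phase(timing: Dict[str, int]) -> str:
    """Name the phase that accounts for the largest share of the request."""
    connect, wait, receive = split_phases(timing)
    phases = {"connect": connect, "wait": wait, "receive": receive}
    return max(phases, key=phases.get)


# Timing values taken from the example response above.
example = {
    "connectionTimeInMillis": 12,
    "timeToFirstByteInMillis": 85,
    "totalTimeInMillis": 142,
}
print(split_phases(example))    # (12, 73, 57)
print(dominant_phase(example))  # wait
```

Applied to each `httpResponse.timing` object in a retrieved recording, `dominant_phase` gives a quick first guess at whether to look at the network, the upstream application, or the payload size.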

HAR Export with Timing Breakdown

When you export recorded traffic as a HAR 1.2 file, MockServer populates the standard timings object on each entry with connect, wait, and receive values derived from the per-request timing fields. This means you can open the HAR file in browser DevTools or any HTTP-analysis tool and see a familiar waterfall chart with accurate timing breakdowns.

curl -v -X PUT "http://localhost:1080/mockserver/retrieve?type=REQUEST_RESPONSES&format=HAR" -o recording.har

You can narrow the export to specific paths or hosts by sending a request matcher in the body:

curl -v -X PUT "http://localhost:1080/mockserver/retrieve?type=REQUEST_RESPONSES&format=HAR" \
  -H "Content-Type: application/json" \
  -d '{"path": "/api/.*"}' -o api-traffic.har

The HAR timings section for each entry will look like:

"timings": {
    "connect": 12,
    "wait": 73,
    "receive": 57
}

This is especially useful for sharing a capture with teammates or loading it into performance-analysis tools.
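
If you would rather rank entries in a script than read a waterfall chart, the exported file can be processed directly. This is a rough sketch that relies only on the standard HAR 1.2 layout (`log.entries[].request.url` and `log.entries[].timings`). The helper names and the sample URLs are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple


def slowest_by_wait(har: Dict[str, Any], limit: int = 5) -> List[Tuple[str, float]]:
    """Return up to `limit` (url, wait_ms) pairs, longest wait first."""
    rows = []
    for entry in har["log"]["entries"]:
        wait = entry.get("timings", {}).get("wait", -1)
        # HAR uses -1 for timings that do not apply; skip those entries.
        if wait is None or wait < 0:
            continue
        rows.append((entry["request"]["url"], wait))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def load_har(path: str) -> Dict[str, Any]:
    """Read a HAR file from disk into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Inline sample with made-up URLs so the sketch runs without a recording.
sample = {
    "log": {
        "entries": [
            {
                "request": {"url": "http://upstream.example/api/users"},
                "timings": {"connect": 12, "wait": 73, "receive": 57},
            },
            {
                "request": {"url": "http://upstream.example/api/orders"},
                "timings": {"connect": 10, "wait": 410, "receive": 20},
            },
        ]
    }
}
for url, wait_ms in slowest_by_wait(sample):
    print(f"{wait_ms:>6} ms  {url}")
```

With a real export, call `slowest_by_wait(load_har("recording.har"))` instead of passing the inline sample.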

Flagging Slow Requests

To find intermittently slow requests without inspecting every entry, set a threshold so that MockServer automatically flags any forwarded request that exceeds it. This saves you from manually scanning through hundreds of timing entries.

Set the slowRequestThresholdMillis property to a value greater than 0. Any forwarded request whose total time exceeds this threshold will:

Emit a WARN-level log entry in MockServer's event log, visible in the dashboard UI and retrievable via the log retrieval API.
Increment the mock_server_slow_requests_total Prometheus counter (when metrics are enabled), so you can alert on slow-request rates.

Configuration examples:

# System property
-Dmockserver.slowRequestThresholdMillis=500

# Environment variable (e.g. Docker / Kubernetes)
MOCKSERVER_SLOW_REQUEST_THRESHOLD_MILLIS=500

# Property file
mockserver.slowRequestThresholdMillis=500

When a request exceeds the threshold, a log entry like this appears:

WARN - slow forwarded request GET /api/users took 1230ms (threshold 500ms)

The default is 0, which disables slow-request flagging entirely.
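
Once flagged entries are in the log, a small script can turn them into structured records for sorting or counting. The sketch below assumes the log line looks exactly like the example above; the real format may differ between MockServer versions, so treat the regular expression as a starting point. The name `parse_slow_line` is hypothetical.

```python
import re
from typing import Dict, Optional, Union

# Pattern modelled on the example log line; adjust it if your version's
# log output is formatted differently.
SLOW_LINE = re.compile(
    r"slow forwarded request (?P<method>\S+) (?P<path>\S+) "
    r"took (?P<took>\d+)ms \(threshold (?P<threshold>\d+)ms\)"
)


def parse_slow_line(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Extract method, path, duration and threshold, or None if no match."""
    match = SLOW_LINE.search(line)
    if match is None:
        return None
    return {
        "method": match.group("method"),
        "path": match.group("path"),
        "took_ms": int(match.group("took")),
        "threshold_ms": int(match.group("threshold")),
    }


line = "WARN - slow forwarded request GET /api/users took 1230ms (threshold 500ms)"
print(parse_slow_line(line))
```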

Latency Percentiles via Prometheus

MockServer exposes a mock_server_request_duration_seconds histogram via the GET /mockserver/metrics Prometheus endpoint (when metrics are enabled). This histogram records the duration of every request MockServer handles — mocked, proxied/forwarded, and control-plane API calls alike — so you can compute p50, p95, and p99 latency percentiles in your monitoring system. (By contrast, the slow-request flagging above is scoped to forwarded requests only.)

For more granular analysis, enable the metricsRequestDurationRouteLabels property. This adds a per-HTTP-method histogram (mock_server_request_duration_by_method_seconds) with a method label (GET, POST, PUT, etc.), letting you compare latency distributions across different types of requests.

# Enable per-method latency histogram
-Dmockserver.metricsRequestDurationRouteLabels=true

# or via environment variable
MOCKSERVER_METRICS_REQUEST_DURATION_ROUTE_LABELS=true

Example Prometheus queries:

# p95 latency across all requests (15-minute window)
histogram_quantile(0.95, rate(mock_server_request_duration_seconds_bucket[15m]))

# p95 latency for GET requests specifically
histogram_quantile(0.95, rate(mock_server_request_duration_by_method_seconds_bucket{method="GET"}[15m]))

# Slow request rate (per second, 5-minute window)
rate(mock_server_slow_requests_total[5m])

Cardinality is bounded to the set of standard HTTP methods, so enabling route labels does not risk a cardinality explosion.
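
It helps to know how `histogram_quantile` arrives at a percentile, because the result is an estimate interpolated within a bucket, not an exact measurement. The sketch below reimplements the standard linear interpolation for classic histograms with positive bucket bounds. The bucket boundaries and counts are invented for illustration and are not MockServer's actual bucket layout.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    The list must contain a +Inf bucket, as Prometheus histograms do.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in the range (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for index, (upper, count) in enumerate(buckets):
        if count >= rank:
            break
    if math.isinf(upper):
        # The quantile lies beyond the last finite bound; report that bound.
        return buckets[-2][0]
    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0.0
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Hypothetical cumulative counts per upper bound, in seconds.
sample_buckets = [(0.1, 50), (0.25, 80), (0.5, 95), (1.0, 99), (math.inf, 100)]
for q in (0.5, 0.9, 0.995):
    print(q, histogram_quantile(q, sample_buckets))
```

Because of the interpolation, a reported p95 can never be more precise than the width of the bucket it falls in, and any quantile that lands in the +Inf bucket is capped at the highest finite bound.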

In-Path Capture in Kubernetes

To capture latency data for a specific dependency in Kubernetes, deploy MockServer between your application and the upstream service using any of the deployment patterns described in Chaos Proxy in Kubernetes:

Reverse proxy (Pattern 1) — a Kubernetes Service that routes calls to MockServer, which forwards to the real backend.
Egress proxy (Pattern 2) — set HTTP_PROXY/HTTPS_PROXY on your application to route outbound traffic through MockServer.
Sidecar (Pattern 3) — run MockServer as a sidecar container in the same pod.

In all cases, enable timing and metrics with environment variables on the MockServer container:

env:
  - name: MOCKSERVER_METRICS_ENABLED
    value: "true"
  - name: MOCKSERVER_SLOW_REQUEST_THRESHOLD_MILLIS
    value: "500"
  - name: MOCKSERVER_METRICS_REQUEST_DURATION_ROUTE_LABELS
    value: "true"

Then scrape the /mockserver/metrics endpoint from your Prometheus instance and export HAR files or retrieve recorded timings via the API as needed.
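
Before wiring up a scrape job, it can be useful to confirm that the endpoint exposes the expected series. The following sketch fetches the metrics text and sums the samples of one metric, assuming the standard Prometheus text exposition format. The function names, the default base URL, and the sample text (including its HELP line) are illustrative assumptions.

```python
import urllib.request
from typing import Optional


def fetch_metrics(base_url: str = "http://localhost:1080") -> str:
    """Download the raw metrics text from a running MockServer."""
    with urllib.request.urlopen(base_url + "/mockserver/metrics") as response:
        return response.read().decode("utf-8")


def read_sample_total(exposition: str, metric: str) -> Optional[float]:
    """Sum all samples of `metric`; return None if the metric is absent."""
    total = None
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            name = line[: line.index("{")]
            remainder = line[line.rindex("}") + 1:]
        else:
            name, _, remainder = line.partition(" ")
        if name != metric:
            continue
        fields = remainder.split()
        if fields:
            # The first field after the name and labels is the sample value.
            total = (total or 0.0) + float(fields[0])
    return total


# Made-up sample text so the parser can be tried without a server.
sample_text = """\
# HELP mock_server_slow_requests_total sample help text
# TYPE mock_server_slow_requests_total counter
mock_server_slow_requests_total 3.0
"""
print(read_sample_total(sample_text, "mock_server_slow_requests_total"))  # 3.0
```

Inside a cluster, pass the MockServer Service address to `fetch_metrics` and feed the result to `read_sample_total`.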

The same deployment can serve both latency diagnostics and fault injection at the same time — add chaos profiles to specific forward expectations to test resilience while simultaneously collecting timing data on all traffic. See Chaos Proxy in Kubernetes for the chaos configuration.

What This Does Not Cover

MockServer's timing operates at the HTTP layer (L7). This means it captures the timing characteristics visible to an HTTP client — connection setup, server processing, and response transfer. It does not provide:

Packet-level metrics — TCP retransmissions, congestion window sizes, or packet loss rates. Use a packet-capture tool for these.
Network-layer (L3/L4) visibility — MockServer does not intercept traffic transparently; it must be explicitly placed in the request path.
Client-side timing — the timing reflects MockServer's outbound connection to the upstream. It does not include the time between the original client and MockServer itself.
DNS resolution time — DNS lookups are included in the connect time but not broken out separately.

For a complete network-level investigation, combine MockServer's HTTP timing data with packet-capture or TCP-analysis tools to get both the application-layer and transport-layer views.

Related Pages

Debugging Proxied Traffic: inspect and export the content of recorded proxy traffic (requests, responses, headers, bodies)
Logging & Debugging: retrieve logs, metrics, and configuration
Chaos Proxy in Kubernetes: deploy MockServer in Kubernetes for fault injection and latency capture
Configuration Properties: all properties including slowRequestThresholdMillis and metricsRequestDurationRouteLabels