MLLP Server

Architecture

How messages flow through the server, from TCP accept to downstream delivery.

Message Flow

Every inbound HL7 message passes through the same deterministic pipeline. The server ACKs the sender before any downstream delivery begins, so connector latency never affects MLLP throughput.

TCP / TLS  →  MLLP framing  →  HL7 parse  →  Validate  →  Route  →  Persist  →  ACK
                                                                         │
                                                                         ▼
                                                              Async delivery workers
                                                              (one per connector)

1. Accept

The server listens on a TCP socket (default :2575). When TLS is configured, connections are upgraded before any data is exchanged. A handshake timeout (CONNECT_TIMEOUT, default 10s) prevents stalled handshakes from holding resources.

If MAX_CONNECTIONS is set and the limit is reached, new connections are refused immediately.

2. Frame

MLLP framing is applied per the HL7 specification (Appendix C): each message is wrapped in a Start Block (0x0B), the HL7 payload, and an End Block + Carriage Return (0x1C 0x0D). The server extracts the payload and discards the framing bytes.

The maximum frame size defaults to 2 MB and is configurable via MAX_FRAME_SIZE (in bytes). Frames that exceed this limit or that take longer than FRAME_TIMEOUT (default 60s) to complete are rejected and the connection is closed.
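
The de-framing step can be sketched as follows. This is an illustrative Python fragment, not the server's implementation; the function and constant names are invented here:

```python
# Sketch of MLLP de-framing, assuming the complete frame is already buffered.
# Framing bytes per HL7 Appendix C: 0x0B, payload, 0x1C 0x0D.
START_BLOCK = b"\x0b"
END_BLOCK = b"\x1c\x0d"
MAX_FRAME_SIZE = 2 * 1024 * 1024  # 2 MB default

def unwrap_frame(frame: bytes) -> bytes:
    """Strip MLLP framing and return the HL7 payload; raise on a bad frame."""
    if len(frame) > MAX_FRAME_SIZE:
        raise ValueError("frame exceeds MAX_FRAME_SIZE")
    if not frame.startswith(START_BLOCK) or not frame.endswith(END_BLOCK):
        raise ValueError("missing MLLP start or end block")
    return frame[1:-2]
```

A real reader also has to accumulate bytes from the socket, handle partial frames, and enforce FRAME_TIMEOUT; that bookkeeping is omitted here.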

3. Parse

The MSH (Message Header) segment is parsed to extract routing-relevant fields: message type, trigger event, sending/receiving applications and facilities, control ID, and HL7 version. If the MSH segment is malformed, the server returns an AR (Application Reject) ACK.
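
Extracting those fields amounts to splitting the MSH segment on the field separator. A hedged Python sketch (after splitting on "|", index i holds MSH-(i+1), because MSH-1 is the field separator itself):

```python
# Illustrative MSH extraction; the dict keys are invented for this sketch.
def parse_msh(message: str) -> dict:
    segment = message.split("\r")[0]       # MSH must be the first segment
    if not segment.startswith("MSH|"):
        raise ValueError("first segment is not MSH")
    f = segment.split("|")
    if len(f) < 12:
        raise ValueError("MSH segment too short")
    return {
        "sending_app": f[2],        # MSH-3
        "sending_facility": f[3],   # MSH-4
        "receiving_app": f[4],      # MSH-5
        "receiving_facility": f[5], # MSH-6
        "message_type": f[8],       # MSH-9, e.g. "ADT^A01" (type^trigger)
        "control_id": f[9],         # MSH-10
        "version": f[11],           # MSH-12
    }
```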

4. Validate

If CEL validation rules are configured, each rule is evaluated against the parsed message. Rules have access to fields from the MSH, PID, PV1, and OBX segments. See the Configuration reference for the full list of CEL variables.

Rules are evaluated in order. On the first failure, the server returns an AR ACK with the rule’s error message. If a rule causes an evaluation error (e.g. type mismatch), the server returns an AE (Application Error) ACK instead.

5. Route

Each configured connector has an optional CEL filter expression. The server evaluates every connector’s filter against the message. Connectors without a filter match all messages. A single message can match multiple connectors.
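
Both the validation rules (step 4) and the connector filters are CEL expressions supplied in configuration. A hypothetical fragment is shown below; every key name and CEL variable here is illustrative only, and the Configuration reference is authoritative:

```yaml
# Hypothetical config shape, not the server's actual schema.
validation:
  rules:
    - expr: 'msh.messageType in ["ADT", "ORU"]'
      error: "unsupported message type"

connectors:
  - name: adt-feed
    filter: 'msh.messageType == "ADT"'   # only ADT messages
  - name: archive
    # no filter: matches every message
```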

6. Persist

All matching connector deliveries are written to an embedded outbox in a single atomic transaction. Either every connector receives the message or none do — there is no partial delivery on crash or power loss.

Each connector has its own outbox queue with FIFO ordering. Messages are assigned monotonically increasing sequence IDs, ensuring delivery order matches arrival order.
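
The atomicity and ordering contract can be illustrated with an in-memory sketch. The real outbox is an embedded key-value store, not Python objects; here a single critical section stands in for the single database transaction, and all class and method names are invented:

```python
import itertools
import threading

class Outbox:
    """In-memory sketch: a message matching N connectors produces
    N queue entries or none, and each queue is FIFO by sequence ID."""
    def __init__(self):
        self._lock = threading.Lock()
        self._seq = itertools.count(1)   # monotonically increasing IDs
        self._queues = {}                # connector name -> [(seq, msg), ...]

    def persist(self, message: bytes, connectors: list) -> int:
        # One critical section stands in for one atomic DB transaction.
        with self._lock:
            seq = next(self._seq)
            for name in connectors:
                self._queues.setdefault(name, []).append((seq, message))
            return seq

    def peek(self, connector: str):
        q = self._queues.get(connector, [])
        return q[0] if q else None       # oldest entry first (FIFO)

    def ack(self, connector: str) -> None:
        self._queues[connector].pop(0)   # remove after successful delivery
```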

7. ACK

After the message is persisted (or if no connectors matched), the server constructs an HL7 ACK message and sends it back to the client over the same TCP connection. The ACK code reflects the outcome:

Code   Meaning              When
AA     Application Accept   Message parsed, validated, and persisted successfully
AR     Application Reject   Parse failure or validation rule returned false. Permanent — sender should not retry
AE     Application Error    Internal error during validation or routing. Transient — sender may retry
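
Constructing the ACK reuses the inbound MSH with sender and receiver swapped, plus an MSA segment carrying the code and the original control ID. A simplified Python sketch (a real ACK also carries a fresh control ID and timestamp, omitted here; the function name is invented):

```python
def build_ack(msh_fields: list, code: str, text: str = "") -> str:
    """Build a minimal HL7 ACK from an MSH segment already split on '|'."""
    f = msh_fields[:]  # copy; index i holds MSH-(i+1)
    # Swap sending and receiving application/facility (MSH-3..MSH-6).
    f[2], f[3], f[4], f[5] = f[4], f[5], f[2], f[3]
    f[8] = "ACK"       # MSH-9: message type
    # MSA-2 echoes the original message's control ID (MSH-10).
    msa = "|".join(["MSA", code, msh_fields[9], text]).rstrip("|")
    return "|".join(f) + "\r" + msa + "\r"
```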

Async Delivery

After the ACK is sent, delivery to downstream systems happens asynchronously. Each connector runs its own independent delivery worker.

Delivery cycle

  1. The worker reads the oldest message from its outbox queue (without removing it)
  2. The worker attempts delivery (HTTP POST, MLLP forward, database insert, etc.)
  3. On success: the message is removed from the outbox
  4. On failure: the worker waits and retries
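
The cycle above can be sketched against an in-memory queue. This is illustrative Python, not the server's code: `deliver` stands in for the connector-specific send, and the dead-letter handling mirrors the behavior described under "Dead letter queue":

```python
import time

def run_worker(queue, deliver, max_attempts=5, dlq=None):
    """Drain one connector's queue: peek, attempt delivery, remove on
    success, retry on failure, move to the DLQ after max_attempts."""
    attempts = 0
    while queue:
        msg = queue[0]                     # peek: read without removing
        try:
            deliver(msg)
            queue.pop(0)                   # success: remove from outbox
            attempts = 0
        except Exception:
            attempts += 1
            if dlq is not None and attempts >= max_attempts:
                dlq.append(queue.pop(0))   # give up: move to dead letter queue
                attempts = 0
            else:
                # Real worker: exponential backoff here. With no DLQ
                # configured it retries the same message indefinitely.
                time.sleep(0)
```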

Retry behavior

Failed deliveries are retried with exponential backoff:

  • Initial delay: 1 second
  • Maximum delay: 5 minutes
  • Jitter: 0–25% added to each delay to avoid retry storms

The delay doubles with each attempt: 1s, 2s, 4s, 8s, … capped at 5 minutes. After a successful delivery, the delay resets to 1 second for the next message.
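
The delay schedule fits in a few lines. A Python sketch of the same policy (1 s base, 5 min cap, 0–25% jitter); the function name is invented:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Delay in seconds before retry `attempt` (1-based):
    base * 2^(attempt-1), capped at 5 minutes, plus 0-25% jitter."""
    delay = min(base * 2 ** (attempt - 1), cap)
    return delay * (1 + random.uniform(0, 0.25))
```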

Dead letter queue

Messages that exceed retry.max_attempts (default 5) are moved to a per-connector dead letter queue (DLQ). Messages in the DLQ are not retried automatically. Use the CLI tool to inspect, replay, or purge DLQ messages.

Set retry.dead_letter.disabled: true to skip the DLQ and retry indefinitely instead.


Persistence

The outbox is backed by an embedded key-value store (bbolt). The database is a single file on disk (OUTBOX_DB_PATH, default outbox.db).

Each connector gets two queues:

  • Outbox — messages waiting for delivery
  • DLQ — messages that failed after all retry attempts

The database survives process restarts. On startup, workers resume from where they left off — no messages are lost.

Fan-out to multiple connectors is atomic. When a message matches three connectors, all three outbox writes succeed or all three are rolled back. This is a single database transaction, not distributed coordination.


Connection Handling

Idle connections

Connections that go idle for longer than IDLE_TIMEOUT (default 30s) are closed. This prevents abandoned connections from consuming resources.

Graceful shutdown

On SIGINT or SIGTERM:

  1. The server stops accepting new connections
  2. PRE_SHUTDOWN_DELAY elapses (default 0 — set to 3–10s in Kubernetes for load balancer propagation)
  3. Active connections are given up to SHUTDOWN_TIMEOUT (default 30s) to finish processing
  4. Remaining connections are force-closed
  5. The outbox database is flushed and closed
  6. Metrics are flushed
  7. The process exits
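
The first-signal/second-signal contract can be sketched as follows. This is illustrative Python, not the server's code; the drain and flush steps between the two signals are elided:

```python
import signal
import sys
import threading

# First SIGINT/SIGTERM starts a graceful drain; a second one during
# shutdown forces an immediate exit.
shutdown_started = threading.Event()

def handle_signal(signum, frame):
    if shutdown_started.is_set():
        sys.exit(1)              # second signal: force immediate exit
    shutdown_started.set()       # first signal: begin graceful shutdown

signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)
```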

A second SIGINT/SIGTERM during shutdown forces an immediate exit.

Signal reference

Signal             Effect
SIGINT / SIGTERM   Graceful shutdown
SIGHUP             Reload TLS certificate + rotate log file
SIGUSR1            Cycle log level at runtime