CEL expressions for HL7 message validation

Common Expression Language lets you write HL7 validation rules as one-line boolean expressions. Compiled at startup, evaluated per message, mapped to ACK codes.

Common Expression Language (CEL) is a small expression language designed for evaluating boolean conditions. Google created it, and it is used in Kubernetes admission policies, IAM conditions, and other security policy engines. An expression is parsed and type-checked once, then typically evaluates in microseconds. The language is deliberately not Turing-complete: it has no unbounded loops and no side effects, so every expression terminates. That makes it safe to run on untrusted input.

These same properties make it useful for HL7 message validation. A validation rule is a boolean expression: given the fields of an HL7 message, does this message meet the criteria? True means accept. False means reject.

Why CEL instead of code

The alternative to CEL is writing validation logic in your application language. Parse the message, check the fields, return a result. This works, but it means validation rules live in compiled code. Changing a rule requires a code change, a build, and a deploy.

CEL rules are configuration. They live in a YAML file alongside the rest of the server config. Add a rule, restart the server, done. No compilation step for the operator.

The trade-off is expressiveness. CEL has no I/O: an expression can call built-in functions such as size(), but it can’t query databases or make HTTP requests. If your validation requires looking up a patient in an external system, CEL won’t do it. For field-level checks like required fields, allowed values, and pattern matching, it’s a good fit.

Available fields

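CEL rules evaluate against parsed fields from four HL7 segments, exposed as the variables msh, pid, pv1, and obx (plus obx_list for repeated OBX segments). Every field is a string. A missing segment yields an empty map. A message with no OBX segments yields an empty obx_list.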

MSH (message header)

Key	HL7 field	Example
`msg_type`	MSH-9.1	`ADT`
`trigger`	MSH-9.2	`A01`
`sending_app`	MSH-3	`HIS`
`sending_fac`	MSH-4	`CENTRAL_HOSPITAL`
`receiving_app`	MSH-5	`LAB`
`receiving_fac`	MSH-6	`PATHOLOGY`
`control_id`	MSH-10	`MSG00001`
`version`	MSH-12	`2.3`

PID (patient identity)

Key	HL7 field	Example
`id`	PID-3.1	`123456`
`name`	PID-5	`VIRTANEN^MATTI`
`dob`	PID-7	`19800215`
`sex`	PID-8	`M`
`ssn`	PID-19	`010180-123A`
`country`	PID-11.6	`FI`

PV1 (patient visit)

Key	HL7 field	Example
`patient_class`	PV1-2	`I`
`assigned_location`	PV1-3	`ICU^BED3`
`attending_doctor`	PV1-7	`LAHTINEN^ANNA`
`admit_datetime`	PV1-44	`20260115083000`

OBX (observation)

Key	HL7 field	Example
`value_type`	OBX-2	`NM`
`identifier`	OBX-3	`WBC^White Blood Cell`
`value`	OBX-5	`7.5`
`unit`	OBX-6	`10*9/L`
`status`	OBX-11	`F`

The obx variable contains the first OBX segment. The obx_list variable contains all OBX segments as a list of maps with the same keys.
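As a rough illustration of how a raw message might become these variables, the Python sketch below splits segments on `|`. The function name `extract_variables` is hypothetical, only a subset of keys is shown, and it assumes the default delimiters with no escape handling. The server's actual parser is not shown in this article and may differ, for example in how it represents empty fields.

```python
def extract_variables(message: str) -> dict:
    """Build a few of the variables described above from a raw HL7 v2 message.

    Illustrative only: assumes default delimiters (| and ^) and ignores escapes.
    """
    msh = {}
    obx_list = []

    # HL7 v2 segments are separated by carriage returns.
    for segment in filter(None, message.split("\r")):
        fields = segment.split("|")

        def field(index: int) -> str:
            # Absent trailing fields are treated as empty strings in this sketch.
            return fields[index] if index < len(fields) else ""

        if fields[0] == "MSH":
            # MSH-1 is the field separator itself, so MSH-n sits at index n - 1.
            msg_type, _, rest = field(8).partition("^")
            msh = {
                "msg_type": msg_type,
                "trigger": rest.split("^")[0],
                "sending_fac": field(3),
                "control_id": field(9),
                "version": field(11),
            }
        elif fields[0] == "OBX":
            # For every other segment, field n sits at index n.
            obx_list.append({
                "value_type": field(2),
                "identifier": field(3),
                "value": field(5),
                "unit": field(6),
                "status": field(11),
            })

    # obx is the first OBX segment, or an empty map when there is none.
    obx = obx_list[0] if obx_list else {}
    return {"msh": msh, "obx": obx, "obx_list": obx_list}


sample = "\r".join([
    "MSH|^~\\&|LAB|LAB_CENTRAL|HIS|CENTRAL_HOSPITAL|20260115083000||ORU^R01|MSG00001|P|2.3",
    "OBX|1|NM|WBC^White Blood Cell||7.5|10*9/L|||||F",
    "OBX|2|NM|HGB^Hemoglobin||140|g/L|||||P",
])
variables = extract_variables(sample)
```

The detail worth noticing is the index offset: because MSH-1 is the separator character, MSH fields are shifted by one relative to every other segment.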

Writing rules

A rule has three parts: a name, a CEL expression, and an error message. The expression must evaluate to a boolean. Rules are evaluated in order. The first failure short-circuits. No further rules run.

rules:
  - name: require-patient-id
    expression: pid.id != ""
    message: "PID-3.1 (patient ID) is required"

If pid.id is empty, the server returns an AR (Application Reject) ACK with the message “PID-3.1 (patient ID) is required.” The sender knows exactly what’s wrong.
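The "first failure wins" behavior can be sketched in a few lines of Python. This is a model of the evaluation order only, not the server's implementation: the `Rule` type and `first_failure` function are hypothetical names, and plain Python callables stand in for compiled CEL expressions.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    name: str
    check: Callable[[dict], bool]  # stand-in for a compiled CEL expression
    message: str


def first_failure(rules: Sequence[Rule], variables: dict) -> Optional[Rule]:
    """Return the first rule whose check is false, or None if every rule passes."""
    for rule in rules:
        if not rule.check(variables):
            return rule  # short-circuit: later rules are never evaluated
    return None


rules = [
    Rule("require-patient-id",
         lambda v: v["pid"]["id"] != "",
         "PID-3.1 (patient ID) is required"),
    Rule("adt-only",
         lambda v: v["msh"]["msg_type"] == "ADT",
         "Only ADT messages accepted at this endpoint"),
]

# This message breaks both rules, but only the first failure is reported.
failed = first_failure(rules, {"msh": {"msg_type": "ORU"}, "pid": {"id": ""}})
passed = first_failure(rules, {"msh": {"msg_type": "ADT"}, "pid": {"id": "123456"}})
```

Because only one message reaches the sender, it is worth ordering rules from most basic to most specific.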

Simple field checks

Require a field to be present:

- name: require-sending-facility
  expression: msh.sending_fac != ""
  message: "MSH-4 (sending facility) is required"

- name: require-control-id
  expression: msh.control_id != ""
  message: "MSH-10 (message control ID) is required"

Restrict to specific values:

- name: adt-only
  expression: msh.msg_type == "ADT"
  message: "Only ADT messages accepted at this endpoint"

- name: allowed-versions
  expression: msh.version == "2.3" || msh.version == "2.4" || msh.version == "2.5"
  message: "HL7 version must be 2.3, 2.4, or 2.5"

The `in` operator

Check membership in a list:

- name: admit-discharge-transfer
  expression: msh.trigger in ["A01", "A02", "A03", "A04", "A08"]
  message: "Only ADT triggers A01-A04 and A08 are accepted"

- name: known-sender
  expression: msh.sending_fac in ["HOSP_A", "HOSP_B", "LAB_CENTRAL"]
  message: "Unknown sending facility"

Conditional logic

Rules can use && (and), || (or), and ! (not). A common pattern is “if condition A holds, then condition B must also hold”:

- name: inpatient-requires-location
  expression: pv1.patient_class != "I" || pv1.assigned_location != ""
  message: "Inpatient encounters (PV1-2=I) must have an assigned location (PV1-3)"

This reads: “either the patient is not an inpatient, or they have a location.” If the patient is an inpatient and has no location, both sides are false and the rule fails.

This is a standard pattern for conditional requirements. The expression A != X || B != "" is equivalent to “if A equals X, then B must not be empty.”
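To see that the two forms really agree, the short Python check below enumerates every combination and compares the `||` form against the if/then form written out longhand. Python's `or` stands in for CEL's `||`, and the function names are made up for this example.

```python
from itertools import product


def rule(patient_class: str, assigned_location: str) -> bool:
    # Same shape as the CEL expression: A != X || B != ""
    return patient_class != "I" or assigned_location != ""


def implication(patient_class: str, assigned_location: str) -> bool:
    # "If the patient is an inpatient, a location must be present", longhand.
    if patient_class == "I":
        return assigned_location != ""
    return True


# Inpatient, outpatient, and emergency, each with and without a location.
results = {}
for patient_class, location in product(["I", "O", "E"], ["ICU^BED3", ""]):
    assert rule(patient_class, location) == implication(patient_class, location)
    results[(patient_class, location)] = rule(patient_class, location)
```

The only combination that fails is an inpatient with an empty location, which is exactly the case the rule is meant to reject.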

Safe field access

Direct field access on a missing key causes an evaluation error:

# Dangerous if the PID segment is missing: pid is then an empty map with no country key
- name: finnish-only
  expression: pid.country == "FI"

If the message has no PID segment, pid is an empty map. Accessing pid.country on an empty map is a key-not-found error. The server returns AE (Application Error) instead of AR (Application Reject). The sender retries, gets the same AE, retries again. Wrong behavior.

The safe access operator ? with .orValue() handles missing keys:

- name: finnish-only
  expression: pid[?'country'].orValue("") == "FI"
  message: "Only Finnish patients accepted at this endpoint"

If country is missing, .orValue("") returns an empty string. The expression evaluates to "" == "FI", which is false. The server returns AR with a clear message. The sender knows not to retry.

Use ? and .orValue() whenever a field might be absent. The four segment maps (msh, pid, pv1, obx) are always present but may be empty if the corresponding segment is missing from the message.

OBX list operations

Messages often carry multiple OBX (observation) segments. Lab results, vital signs, and diagnostic reports typically have one OBX per observation value. The obx_list variable gives access to all of them.

Require all observations to have final status

- name: all-observations-final
  expression: obx_list.all(o, o.status == "F")
  message: "All OBX segments must have final status (OBX-11=F)"

.all() returns true only if every element in the list satisfies the condition. If any OBX has a status other than “F” (final), the rule fails.

Require at least one final observation

- name: has-final-observation
  expression: obx_list.exists(o, o.status == "F")
  message: "At least one final OBX segment is required"

.exists() returns true if any element satisfies the condition. For a non-empty list this is weaker than .all(): it passes even if some OBX segments have preliminary status, as long as at least one is final.

Check observation values

- name: observations-have-values
  expression: obx_list.all(o, o.value != "")
  message: "All OBX segments must have a value (OBX-5)"

Empty list behavior

If a message has no OBX segments, obx_list is an empty list. obx_list.all(...) on an empty list returns true (vacuous truth). obx_list.exists(...) on an empty list returns false.

This matters for rules that combine checks:

# Passes messages with no OBX segments (all of zero is true)
- name: all-final
  expression: obx_list.all(o, o.status == "F")

# Rejects messages with no OBX segments (exists on empty is false)
- name: has-final
  expression: obx_list.exists(o, o.status == "F")

If you want to require that OBX segments are present and all have final status, use both:

- name: has-observations
  expression: size(obx_list) > 0
  message: "At least one OBX segment is required"

- name: all-observations-final
  expression: obx_list.all(o, o.status == "F")
  message: "All OBX segments must have final status"

Rules run in the order listed. The has-observations rule runs first. If it fails, the all-observations-final rule never evaluates, and the sender sees the has-observations message.
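Python's built-in `all()` and `any()` follow the same empty-list convention described here for .all() and .exists(), so they make a convenient way to check intuition. This is an analogy in Python rather than CEL itself, and the helper names are invented for the example.

```python
def all_final(obx_list: list) -> bool:
    # Counterpart of obx_list.all(o, o.status == "F")
    return all(o["status"] == "F" for o in obx_list)


def any_final(obx_list: list) -> bool:
    # Counterpart of obx_list.exists(o, o.status == "F")
    return any(o["status"] == "F" for o in obx_list)


def present_and_all_final(obx_list: list) -> bool:
    # Counterpart of the two-rule combination: size(obx_list) > 0, then .all(...)
    return len(obx_list) > 0 and all_final(obx_list)


cases = {
    "empty": [],
    "mixed": [{"status": "F"}, {"status": "P"}],
    "final": [{"status": "F"}, {"status": "F"}],
}

# Each entry is (all_final, any_final, present_and_all_final) for that case.
outcomes = {
    label: (all_final(obx), any_final(obx), present_and_all_final(obx))
    for label, obx in cases.items()
}
```

The empty case is the one to watch: the all-style check passes on its own, and only the added size check turns it into a rejection.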

How rules map to ACK codes

The validation result determines which acknowledgment code the sender receives:

What happened	ACK code	Sender action
All rules pass	AA	Move to next message
A rule evaluates to false	AR	Fix the message, don’t retry
A rule fails to evaluate	AE	Retry (receiver’s problem)
CEL syntax error at startup	(none)	Server won’t start

The distinction between AR and AE is why safe field access matters. A rule that errors because of a missing key returns AE, which tells the sender to retry. A rule that evaluates cleanly to false returns AR, which tells the sender the message is wrong. Use ? and .orValue() to keep validation failures in the AR category.
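The difference can be modeled in Python, where indexing a dict with a missing key raises `KeyError` and `dict.get` with a default does not. This is only an analogy for the CEL behavior described above: `ack_code` is a hypothetical helper covering a single rule, and the two check functions stand in for the unsafe and safe forms of the finnish-only expression.

```python
from typing import Callable


def ack_code(check: Callable[[dict], bool], variables: dict) -> str:
    """Map one rule's outcome to an ACK code, following the table above."""
    try:
        passed = check(variables)
    except KeyError:
        return "AE"  # the rule failed to evaluate
    return "AA" if passed else "AR"


def unsafe_check(v: dict) -> bool:
    # Like pid.country == "FI": errors when the key is missing.
    return v["pid"]["country"] == "FI"


def safe_check(v: dict) -> bool:
    # Like pid[?'country'].orValue("") == "FI": falls back to "".
    return v["pid"].get("country", "") == "FI"


no_pid = {"pid": {}}  # message without a PID segment
finnish = {"pid": {"country": "FI"}}

unsafe_result = ack_code(unsafe_check, no_pid)
safe_result = ack_code(safe_check, no_pid)
accepted = ack_code(safe_check, finnish)
```

With the same input, the unsafe form produces AE and invites a retry, while the safe form produces AR and tells the sender to fix the message.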

Rules are compiled when the server starts. A syntax error, a type mismatch, or a reference to an undeclared variable prevents startup. This means a bad rule fails the boot, not a message at 3am.

A complete example

A configuration for a lab results endpoint that accepts ORU messages from known senders, requires patient identification, and enforces final observation status:

rules:
  - name: require-control-id
    expression: msh.control_id != ""
    message: "MSH-10 (message control ID) is required"

  - name: oru-only
    expression: msh.msg_type == "ORU"
    message: "Only ORU messages accepted at this endpoint"

  - name: known-sender
    expression: msh.sending_fac in ["LAB_CENTRAL", "LAB_NORTH", "LAB_SOUTH"]
    message: "Unknown sending facility"

  - name: require-patient-id
    expression: pid[?'id'].orValue("") != ""
    message: "PID-3.1 (patient ID) is required"

  - name: require-observations
    expression: size(obx_list) > 0
    message: "At least one OBX segment is required"

  - name: all-observations-final
    expression: obx_list.all(o, o.status == "F")
    message: "All OBX segments must have final status (OBX-11=F)"

Six rules, evaluated in order. A message missing a control ID is rejected before the message type is checked. An ORU from an unknown sender is rejected before the patient ID is checked. The sender gets a specific error message for the first rule that fails.

For background on the acknowledgment codes these rules produce, see HL7 ACK and NAK codes: AA, AR, AE explained. For the protocol layer beneath validation, see What is MLLP and how does it work.