Troubleshooting the MQ Queues

This week I’m going to cover a topic near and dear to my heart: MQ queuing. If your Millennium® system is sluggish, MQ is an important area to investigate. Backlogs here indicate that transactions are taking too long to process.

MQ has two uses within Millennium. The first is to serve as a storage location for messages carrying critical clinical or financial data that must be processed. These are called Reliable Delivery Model (RDM) messages and are stored in the RDM queues; they include orders, charges and other information that must not be lost if something happens to the node, network or other infrastructure component. Essentially, these queues take a transaction from a client’s Citrix server or PC and write the message to disk; then a Millennium application server (executable) picks up the message and processes it. When the processing is complete, the message is removed from the queue. The RDM queues are listed in Panther under the Queues control as CERN.<SCP service name>.

The second use of MQ is for non-persistent queuing, or request/reply (RR) queues. (In Millennium versions prior to 2007.18, the Shared Service Proxies handled this function, unless the MQ Shared Services feature was manually enabled.) CPM Script messages are an example of non-persistent messages. Unlike RDM messages, RR messages are not written to disk. If the application node crashes or becomes unreachable, RR transactions will simply be deleted from the queue. These queues are configured in pairs as CERN.SSREQ and CERN.SSREP queues. The SSREQ queue is where the Citrix server or PC places its messages. The SSREP queue is where the Citrix server or PC gets the data it requested.

Data in transactions that are backlogged in the RDM or CERN.SSREQ queues, waiting to be processed, is not yet visible to users. Reducing this backlog will speed up Millennium’s response time for clinicians and other users.

Let’s look briefly at the four areas that can have queues: RDM, RR for SSREQ, RR for SSREP and CERN.EXCEPTION.
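Before walking through them, here is a rough sketch of how queue names map onto these four areas, based only on the naming described above. The function name and the category labels are my own, for illustration; the sketch also assumes that any CERN.* queue that is not an SSREQ, SSREP or exception queue is an RDM queue, which may not hold in every domain.

```python
def classify_queue(name: str) -> str:
    """Map an MQ queue name to one of the four areas discussed in this post.

    The labels returned here are illustrative, not Millennium terminology.
    """
    upper = name.strip().upper()

    if upper == "CERN.EXCEPTION":
        return "EXCEPTION"
    # Request/reply queues are configured in SSREQ/SSREP pairs.
    if upper == "CERN.SSREQ" or upper.startswith("CERN.SSREQ."):
        return "RR_SSREQ"
    if upper == "CERN.SSREP" or upper.startswith("CERN.SSREP."):
        return "RR_SSREP"
    # Assumption: every remaining CERN.* queue is an RDM queue.
    if upper.startswith("CERN."):
        return "RDM"
    return "UNKNOWN"


if __name__ == "__main__":
    for queue in ("CERN.CPMPROCESS", "CERN.SSREQ.CPMSCRIPTASYNC",
                  "CERN.SSREP.CPMSCRIPT", "CERN.EXCEPTION"):
        print(queue, "->", classify_queue(queue))
```

A helper like this could sit in front of whatever monitoring you already run, so that a backlog alert also says which of the four areas it falls into.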

1. To mitigate a backlog of messages in an RDM queue (CERN.CPMPROCESS, for instance), you need to add instances of CPM Process (SCP 55) to the node experiencing the issue. Assuming you are not out of CPU, memory/paging space or disk I/O, this action will increase the throughput of the queue. (This solution does not apply to single-instance servers.)

2. To mitigate a backlog of messages in the RR for SSREQ (CERN.SSREQ.CPMSCRIPTASYNC, for instance), you need to add more instances of CPM Script Async (SCP 54) to the node experiencing the issue. Assuming you are not out of CPU, memory/paging space or disk I/O, this action will increase the throughput of the queue. (This solution does not apply to single-instance servers.)

3. To mitigate a backlog of messages in the RR for SSREP (CERN.SSREP.CPMSCRIPT, for instance), you will need to do additional troubleshooting. These backlogs indicate that the Citrix server, PC or other Millennium application executable that made the request is no longer running, so nothing retrieves the requested data from the queue. This is typically an indication that the Citrix servers or PCs are experiencing crashes in applications like PowerChart and FirstNet. Please follow the documented process from Cerner to identify and resolve these issues.

4. The CERN.EXCEPTION queue contains a list of transactions that could not be processed by Millennium. The information in these transactions, therefore, is not usable in Millennium applications. Messages go to this queue for one of the following reasons:

  • The destination queue is full.
  • The destination queue does not exist.
  • Message puts have been inhibited on the destination queue (an MQ queue configuration property).
  • The sender is not authorized to use the destination queue.
  • The message is too large for the destination queue.
  • The message contains a duplicate message sequence number.
  • The destination queue is too busy to accept the message.

When the issue is caused by a Millennium server that is too busy, you can often resolve the issue simply by requeuing the message after a few seconds. If requeuing does not resolve the issue, you will need to troubleshoot further to determine the appropriate resolution path.
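The "wait a few seconds, then requeue" approach can be sketched as a small retry loop. This is only an illustration of the idea: `requeue` is a hypothetical callable standing in for whatever tool or procedure your site uses to requeue a message (it is not a Cerner or MQ API), and the attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable


def requeue_with_delay(
    requeue: Callable[[], bool],
    attempts: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait, then try to requeue; repeat up to `attempts` times.

    `requeue` is a hypothetical hook that returns True when the message
    was accepted by its destination queue. Returns False if every attempt
    failed, which is the cue to troubleshoot further.
    """
    for _ in range(attempts):
        # Give a busy server a few seconds to catch up before retrying.
        sleep(delay_seconds)
        if requeue():
            return True
    return False


if __name__ == "__main__":
    # Stand-in for a real requeue step: fails twice, then succeeds.
    outcomes = iter([False, False, True])
    ok = requeue_with_delay(lambda: next(outcomes), delay_seconds=0.1)
    print("requeued" if ok else "needs further troubleshooting")
```

Keeping the number of attempts small matters here: if a message still will not requeue after a few tries, the cause is probably one of the other reasons in the list above and not a busy server.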

Prognosis: Backlogs in MQ queues indicate a throughput or speed problem that can typically be resolved by increasing the number of application server (executable) instances processing those messages. Resolving these issues is one step toward ensuring a reliable, fast-running Millennium domain.