While helping a client work through an issue the other day, I looked at their system message log. They were troubled that more than 3,000 messages per hour were being written to it. After resolving their immediate issue, I did a little high-level analysis of what was being logged. The following graph shows the distribution of the last 29,000 records by event level: 0=error, 1=warning and 2=information. These 29,000 messages were logged during a single eight-hour period.
As you can see, they had more than 27,000 error messages! I wanted to see what the errors were, so I broke them down by event. The following graph shows this distribution.
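If you want to run the same kind of breakdown at your own site, here is a minimal sketch of the tally. It assumes the message log has been exported to CSV with two columns named level and event; those column names, the tally function and the sample rows (including SomeWarning and SomeInfo) are made up for illustration and are not the EMR vendor's actual export format.

```python
import csv
import io
from collections import Counter

# Event levels as described in the post: 0=error, 1=warning, 2=information.
LEVEL_NAMES = {0: "error", 1: "warning", 2: "information"}


def tally(rows):
    """Count records per event level, and error records per event name.

    `rows` is any iterable of dicts with "level" and "event" keys
    (an assumed layout, not the vendor's real log schema).
    """
    by_level = Counter()
    error_events = Counter()
    for row in rows:
        level = int(row["level"])
        by_level[LEVEL_NAMES.get(level, "unknown")] += 1
        if level == 0:
            error_events[row["event"]] += 1
    return by_level, error_events


# Tiny hypothetical export, standing in for a real log file.
SAMPLE = """level,event
0,GetPatients_Step::perform
0,GetPatients_Step::perform
0,PREF_Exception
1,SomeWarning
2,SomeInfo
"""

if __name__ == "__main__":
    by_level, error_events = tally(csv.DictReader(io.StringIO(SAMPLE)))
    print("By level:")
    for name, count in by_level.most_common():
        print(f"  {name}: {count}")
    print("Errors by event:")
    for event, count in error_events.most_common():
        print(f"  {event}: {count}")
```

To use it against a real export, replace the io.StringIO(SAMPLE) argument with an open file handle. Sorting with most_common puts the top offenders first, which is all the two graphs in this post really show.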
The vast majority of the error messages – more than 18,000 – were “GetPatients_Step::perform.” What should the client do about all these errors? Absolutely nothing! This message is more like an audit-level message than an error. As a matter of fact, it isn’t an error at all.
So how do you keep these messages from flooding your system log? Until the EMR vendor fixes the message level for this event, the only thing you can do is add it to your messages.suppress file and then cycle the server that is generating this event.
Also notice that this client had a second large group of error messages: more than 8,000 PREF_Exception errors (around 1,000 per hour). Because some preferences determine what clinical information is displayed to clinicians, these errors could have a clinical impact. If you are seeing PREF_Exception error messages at your site, the local person responsible for managing preferences should be able to resolve them using the error code and error message found in the data field of the message log event. If that person is unable to resolve the errors, log a service request (SR) with the EMR vendor.
Prognosis: Analysis of the error messages showed this site’s IT staff that they could improve overall system performance by addressing the two top offenders. This simple troubleshooting reduced the number of error messages over an eight-hour period from 27,310 to 818, a drop of about 97 percent.