A Healthy Dose of Doubt

“…and the only people I fear are those who never have doubts.” – Billy Joel, “Shades of Grey”

One of the most frustrating discoveries I’ve made about Millennium’s technical infrastructure is that tuning is never done. If I think I’ve gotten all of the performance out of Millennium that I can, I need to step back and re-evaluate my metrics…or just wait a little while for other issues to arise.

Another important realization, one that I learned a long time ago, is that my knowledge and experience are not the pinnacle of knowledge and experience. No matter how hard I try, I can never know all there is to know on a given topic. There is always someone else who has dealt with a similar performance, stability, or capacity issue and knows more about it than I do. When working with clients, I prefer a collaborative approach that comes with a little doubt that the problem solving will ever be done. Performance tuning is an iterative process: the system has to be monitored and adjusted over time, not just at a single point in time.

Millennium is a large, complex suite of applications with many interrelationships and impacts that are not easily visible. Depending on the site, a Millennium environment can involve DNS servers, networks, Citrix servers, Windows servers, SQL Server (both for Citrix and some applications), MySQL, Oracle (on raw logical volumes, on the Veritas File System, or on ASM), AIX, HP-UX, VMS, thin clients, WebSphere application servers, WebSphere MQ, Apache web servers, load balancers, Java, Win32, and ActiveX components. The list goes on and on. Plus, the Millennium environment is never static. There is always another project, another service package to install, another site to bring into production. Every change in the environment means changes in Millennium and affects how stable Millennium is and how well it performs.

All of these variables should help keep us humble. When I was in consulting and managing Millennium directly, I knew that if a client followed all of my recommendations, we could typically expect a 90- to 180-day period of stability. After that, change would happen. Ultimately, the earlier tuning would help us adapt more easily to these changes, but there would always be a pain period after a change in Millennium. The tuning was never done.

I work very hard to understand Millennium and share this knowledge with Millennium clients. In the process, I learn from each client I interact with. I’ve recently seen two extremes around this point: One organization benefited from this collaborative approach and from having a little doubt that they had done everything possible to tune Millennium, and another was certain they had done all the tuning they could.

The first client said that they just had the “normal” Millennium performance issues but asked me to evaluate the system and suggest ways to improve it. The Messaging Performance Audit revealed several straightforward, easy-to-resolve issues as well as others that required new Millennium front-end code. Here was a person willing to admit some doubt that his team was doing all it could to improve the hospital’s electronic medical record system. After implementing some of the recommendations, they saw quantifiable benefits.

On the flip side, a second client was experiencing significant performance issues in production after introducing key changes to their environment. The system basically slowed to a crawl every day from 9 to 11 a.m. and from 1 to 3 p.m. Analysis revealed disk I/O issues. They had not implemented the baseline Cerner Millennium System Settings (CMSS). They had not tuned the operating system’s disk buffering memory regions. Response times from their internal SCSI drives were 20 milliseconds instead of the expected three to eight. The Oracle database showed some large waits for disk writes to complete.
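
To make the disk finding concrete, here is a minimal sketch of the kind of check involved: averaging sampled response times per device and flagging anything above the eight-millisecond upper bound mentioned above. The device names, the sample values, and the slow_devices helper are all hypothetical and for illustration only; real numbers would come from whatever I/O statistics the operating system reports.

```python
from statistics import mean

# Upper end of the expected range cited in the text (three to eight ms).
EXPECTED_MAX_MS = 8.0


def slow_devices(samples_ms, limit_ms=EXPECTED_MAX_MS):
    """Return {device: average_ms} for devices whose average
    response time exceeds limit_ms. Devices with no samples are skipped."""
    averages = {dev: mean(vals) for dev, vals in samples_ms.items() if vals}
    return {dev: avg for dev, avg in averages.items() if avg > limit_ms}


# Hypothetical sample data, not measurements from any real site.
samples = {
    "disk0": [19.0, 21.0, 20.0],  # averages 20 ms, well above the range
    "disk1": [4.0, 6.0, 5.0],     # averages 5 ms, within the range
}

print(slow_devices(samples))
```

A check like this only says where to look. It does not say why a device is slow, which is where settings such as the disk buffering regions come in.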

When I presented my findings and recommendations, the lead technical person dismissed them. There was no doubt in his mind that his team had done all they could to address the problems and that I was completely wrong. He was beyond adamant. His team continued to address the escalating problem simply by logging Millennium Service Requests until clinicians revolted. Now the client is paying Cerner and other outside consultants to come in and save the day. No doubt proper tuning would have cost a whole lot less and, more importantly, would not have impacted the clinicians’ ability to deliver patient care.

Prognosis: Being willing to doubt we have done everything possible to make Millennium as fast as it can be opens us to new ways of providing clinicians a more reliable, consistent EMR system.